Microsoft Foundry Rerank MCP connector for Power Platform
April 03, 2026
Keyword search returns results that match terms. Semantic reranking returns results that match meaning. The gap between those two is where RAG pipelines lose quality—an LLM generates a mediocre answer because the most relevant documents were buried behind keyword-matched noise.
This connector adds Cohere Rerank v4 to Power Platform. Pass a query and a list of documents from any source—SharePoint, Dataverse, Azure AI Search, a custom API—and get them back ordered by semantic relevance. A second operation filters out everything below a score threshold so only high-quality context reaches your LLM.
Two MCP tools for Copilot Studio. Two REST operations for Power Automate and Power Apps. Supports 14 languages.
Full source: GitHub repository
How reranking works
Traditional search (BM25, keyword matching) retrieves documents that contain the right words. Semantic reranking goes further—it scores how well each document answers the query, regardless of exact word overlap.
The Cohere Rerank v4 model reads each document alongside the query and produces a relevance score between 0 and 1. A document about “quarterly revenue growth” scores high against the query “How did the company perform financially last quarter?” even if those exact words never appear together.
This matters most in RAG pipelines:
- Retrieve — Pull 50-100 candidate documents from search
- Rerank — Score each document’s relevance to the actual question
- Filter — Keep only documents above a quality threshold
- Generate — Send the top results as context to an LLM
Without reranking, the LLM sees whatever keyword search returned first. With reranking, it sees what actually answers the question.
Available models
| Model | Best for | Latency | Quality |
|---|---|---|---|
| Cohere-rerank-v4.0-pro | Maximum relevance accuracy | Higher | Best |
| Cohere-rerank-v4.0-fast | High-throughput, latency-sensitive flows | Lower | Good |
Both models support 14 languages: English, French, Spanish, Italian, German, Portuguese, Japanese, Chinese, Arabic, Vietnamese, Hindi, Russian, Indonesian, and Dutch.
Tools
MCP tools for Copilot Studio
| Tool | Description |
|---|---|
rerank_documents |
Rerank documents by relevance score |
rerank_and_filter |
Rerank and filter out documents below a score threshold |
How it works
User: "What's our return policy for international orders?"
1. Agent retrieves 20 documents from SharePoint search
2. Agent calls rerank_documents({
query: "return policy international orders",
documents: ["...", "...", ...],
top_n: 5
})
→ Returns 5 most relevant documents ordered by score:
[0.92] International Returns & Exchange Policy
[0.87] Cross-Border Shipping and Returns FAQ
[0.71] Customer Service Procedures Manual
[0.54] General Terms and Conditions
[0.41] Warehouse Operations Guide
3. Agent uses the top-scoring documents as context
to generate an accurate answer
Rerank and filter
The rerank_and_filter tool adds a min_score threshold. Documents below the threshold are excluded entirely.
Agent calls rerank_and_filter({
query: "return policy international orders",
documents: ["...", "...", ...],
min_score: 0.6
})
→ Returns only documents above 0.6:
total_input: 20
total_passed: 3
total_filtered: 17
results: [0.92, 0.87, 0.71]
This prevents low-relevance documents from polluting the LLM’s context window, reducing hallucinations.
REST operations for Power Automate and Power Apps
| Operation | Operation ID | Method | Path |
|---|---|---|---|
| Rerank Documents | RerankDocuments |
POST | /v2/rerank |
| Rerank and Filter | RerankAndFilter |
POST | /rerank/filter |
Parameter reference
| Operation | Parameter | Type | Default | Required |
|---|---|---|---|---|
| Rerank Documents | query |
string | — | Yes |
| Rerank Documents | documents |
string[] | — | Yes |
| Rerank Documents | model |
enum | Cohere-rerank-v4.0-pro | No |
| Rerank Documents | top_n |
int | all | No |
| Rerank Documents | max_tokens_per_doc |
int | 4096 | No |
| Rerank and Filter | query |
string | — | Yes |
| Rerank and Filter | documents |
string[] | — | Yes |
| Rerank and Filter | min_score |
float (0-1) | — | Yes |
| Rerank and Filter | model |
enum | Cohere-rerank-v4.0-pro | No |
| Rerank and Filter | top_n |
int | all passing | No |
| Rerank and Filter | max_tokens_per_doc |
int | 4096 | No |
Rerank Documents response
{
"results": [
{ "index": 3, "relevance_score": 0.92, "document": "..." },
{ "index": 7, "relevance_score": 0.87, "document": "..." },
{ "index": 1, "relevance_score": 0.71, "document": "..." }
],
"id": "request-id",
"meta": {
"api_version": { "version": "2" },
"billed_units": { "search_units": 1 }
}
}
The index field references the original position in the input array—use it to correlate reranked results back to your source data.
Rerank and Filter response
{
"results": [
{ "index": 3, "relevance_score": 0.92, "document": "..." },
{ "index": 7, "relevance_score": 0.87, "document": "..." }
],
"total_input": 20,
"total_passed": 2,
"total_filtered": 18
}
The total_passed and total_filtered counts let you monitor filter effectiveness. If total_passed is consistently zero, lower the min_score. If total_filtered is consistently zero, raise it.
Power Automate RAG pipeline example
Build a complete RAG pipeline in Power Automate:
- Search — Query SharePoint or Dataverse for documents matching the user’s question
- Collect — Extract text content from each result into an array of strings
- Rerank and Filter — Pass the text array + user question to the
RerankAndFilteroperation withmin_score: 0.5 - Generate — Send only the high-relevance documents as context to a chat completion connector (Microsoft Foundry, Azure OpenAI, or any LLM connector)
This pipeline reduces hallucinations by ensuring the LLM only sees documents that semantically match the question, not just keyword-matched results.
Pricing
Cohere Rerank is billed per search unit. One search unit equals one query with up to 100 documents. If you send 250 documents with one query, that’s 3 search units.
Documents longer than 4,096 tokens (including the query) are split into chunks internally. Each chunk counts as a separate document toward billing.
Prerequisites
- An Azure subscription with access to Microsoft Foundry
- Deploy Cohere-rerank-v4.0-pro or Cohere-rerank-v4.0-fast from the Foundry Model Catalog
- Note the Resource Name and API Key from the deployment
Setting up the connector
1. Deploy a Cohere Rerank model
- Go to the Foundry Model Catalog
- Select Cohere-rerank-v4.0-pro (best quality) or Cohere-rerank-v4.0-fast (lower latency)
- Select Deploy and choose your Azure AI Services resource
- Copy the Resource Name and API Key
2. Create the custom connector
- Go to Power Platform Maker Portal
- Navigate to Custom connectors > + New custom connector > Import an OpenAPI file
- Upload
apiDefinition.swagger.json - On the Security tab:
- Authentication type: API Key
- Parameter label: API Key
- Parameter name:
api-key - Parameter location: Header
- On the Code tab:
- Enable Code
- Upload
script.csx
- Select Create connector
3. Create a connection
- Select Test > + New connection
- Enter your Resource Name and API Key
- Select Create connection
4. Test the connector
Test RerankDocuments with a sample query and documents:
{
"query": "What is the company vacation policy?",
"documents": [
"Employees receive 15 days of paid time off per year.",
"The office dress code is business casual.",
"Vacation requests must be submitted two weeks in advance.",
"The company was founded in 2015."
]
}
Verify the response returns documents ordered by relevance, with the vacation-related documents scoring highest.
5. Add to Copilot Studio
- In Copilot Studio, open your agent
- Add this connector as an action—Copilot Studio detects the MCP endpoint via
x-ms-agentic-protocol - The agent can use
rerank_documentsorrerank_and_filterto improve search quality before answering
Known limitations
- Maximum recommended 1,000 documents per request
- Long documents are automatically truncated to
max_tokens_per_doc(default 4,096 tokens) - The model reranks by text similarity—it doesn’t understand document structure like tables or images
- Relevance scores are relative within a single request—scores aren’t comparable across different queries
- The filter operation runs in the connector script layer after the rerank API call, not in the model itself
Files
| File | Purpose |
|---|---|
apiDefinition.swagger.json |
OpenAPI 2.0 definition with MCP endpoint and 2 REST operations |
apiProperties.json |
API Key auth config and script operation bindings |
script.csx |
C# script handling MCP protocol, rerank API calls, and score filtering |
readme.md |
Setup and usage documentation |