[WIP] Add aggregate_records DML tool to MCP server #3180

Closed
Copilot wants to merge 1 commit into main from copilot/add-aggregate-records-tool-again

Copilot AI commented Feb 28, 2026

Thanks for assigning this issue to me. I'm starting to work on it and will keep this PR's description up to date as I form a plan and make progress.

Original prompt

This section details the original issue you should resolve

<issue_title>[Enh]: add aggregate_records DML tool to MCP server</issue_title>
<issue_description>## What?

Allow models to answer: "How many products are there?" and "What is our most expensive product?"

## Why?

These are among the most common information discovery questions, a primary model use case.

## How?

Introduce a new tool, aggregate_records, that reuses the native GraphQL aggregation capabilities in DAB.

Schema

{
  "type": "object",
  "properties": {
    "entity": {
      "type": "string",
      "description": "Entity name with READ permission."
    },
    "function": {
      "type": "string",
      "enum": ["count", "avg", "sum", "min", "max"],
      "description": "Aggregation function to apply."
    },
    "field": {
      "type": "string",
      "description": "Field to aggregate. Use '*' for count."
    },
    "distinct": {
      "type": "boolean",
      "description": "Apply DISTINCT before aggregating.",
      "default": false
    },
    "filter": {
      "type": "string",
      "description": "OData filter applied before aggregating (WHERE). Example: 'unitPrice lt 10'",
      "default": ""
    },
    "groupby": {
      "type": "array",
      "items": { "type": "string" },
      "description": "Fields to group by, e.g., ['category', 'region']. Grouped field values are included in the response.",
      "default": []
    },
    "orderby": {
      "type": "string",
      "enum": ["asc", "desc"],
      "description": "Sort aggregated results by the computed value. Only applies with groupby.",
      "default": "desc"
    },
    "having": {
      "type": "object",
      "description": "Filter applied after aggregating on the result (HAVING). Operators are AND-ed together.",
      "properties": {
        "eq":  { "type": "number", "description": "Aggregated value equals." },
        "neq": { "type": "number", "description": "Aggregated value not equals." },
        "gt":  { "type": "number", "description": "Aggregated value greater than." },
        "gte": { "type": "number", "description": "Aggregated value greater than or equal." },
        "lt":  { "type": "number", "description": "Aggregated value less than." },
        "lte": { "type": "number", "description": "Aggregated value less than or equal." },
        "in":  {
          "type": "array",
          "items": { "type": "number" },
          "description": "Aggregated value is in the given list."
        }
      }
    }
  },
  "required": ["entity", "function", "field"]
}
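To make the schema's rules concrete, here is a minimal Python sketch of how a caller or server might check the arguments and fill in the defaults before running anything. It is not the DAB implementation; `normalize_args` and the constant names are hypothetical, and the rule that `'*'` is rejected for functions other than `count` is an assumption the issue does not state.

```python
from typing import Any, Dict

FUNCTIONS = {"count", "avg", "sum", "min", "max"}
HAVING_OPS = {"eq", "neq", "gt", "gte", "lt", "lte", "in"}
DEFAULTS = {"distinct": False, "filter": "", "groupby": [], "orderby": "desc"}


def normalize_args(args: Dict[str, Any]) -> Dict[str, Any]:
    """Validate required keys and enums, then apply the schema defaults."""
    for key in ("entity", "function", "field"):
        value = args.get(key)
        if not isinstance(value, str) or not value:
            raise ValueError(f"'{key}' is required and must be a non-empty string")
    if args["function"] not in FUNCTIONS:
        raise ValueError(f"unsupported function: {args['function']}")
    # Assumption: '*' only makes sense for count; the issue does not say
    # how the other functions should treat it.
    if args["field"] == "*" and args["function"] != "count":
        raise ValueError("'*' is only valid with count")
    unknown = set(args.get("having", {})) - HAVING_OPS
    if unknown:
        raise ValueError(f"unknown having operators: {sorted(unknown)}")
    out = {**DEFAULTS, **args}
    if out["orderby"] not in ("asc", "desc"):
        raise ValueError(f"orderby must be 'asc' or 'desc', got {out['orderby']!r}")
    # Copy the list so callers never share the module-level default.
    out["groupby"] = list(out["groupby"])
    return out
```

For example, `normalize_args({"entity": "Product", "function": "count", "field": "*"})` returns the same three keys plus `distinct`, `filter`, `groupby`, and `orderby` set to their defaults.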

Response Alias Convention

The aggregated value in the response is always aliased as {function}_{field}. For count with "*", the alias is count.
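
The alias rule can be written as a small helper. This is only an illustration of the convention described above; the function name `result_alias` is hypothetical.

```python
def result_alias(function: str, field: str) -> str:
    """Return the response alias: 'count' for count(*), else '{function}_{field}'."""
    if function == "count" and field == "*":
        return "count"
    return f"{function}_{field}"
```

Under this rule, `result_alias("avg", "unitPrice")` gives `avg_unitPrice`, and a count over a named field such as `supplierId` gives `count_supplierId`.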

Examples

Q1: "How many products are there?"

{
  "entity": "Product",
  "function": "count",
  "field": "*"
}

SELECT COUNT(*) AS count
FROM Product;

Example output:

count
77

Q2: "What is the average price of products under $10?"

{
  "entity": "Product",
  "function": "avg",
  "field": "unitPrice",
  "filter": "unitPrice lt 10"
}

SELECT AVG(unitPrice) AS avg_unitPrice
FROM Product
WHERE unitPrice < 10;

Example output:

avg_unitPrice
6.74

Q3: "Which categories have more than 20 products?"

{
  "entity": "Product",
  "function": "count",
  "field": "*",
  "groupby": ["categoryName"],
  "having": {
    "gt": 20
  }
}

SELECT categoryName, COUNT(*) AS count
FROM Product
GROUP BY categoryName
HAVING COUNT(*) > 20;

Example output:

categoryName count
Beverages 24
Condiments 22

Q4: "For discontinued products, which categories have a total revenue between $500 and $10,000?"

{
  "entity": "Product",
  "function": "sum",
  "field": "unitPrice",
  "filter": "discontinued eq true",
  "groupby": ["categoryName"],
  "having": {
    "gte": 500,
    "lte": 10000
  }
}

SELECT categoryName, SUM(unitPrice) AS sum_unitPrice
FROM Product
WHERE discontinued = 1
GROUP BY categoryName
HAVING SUM(unitPrice) >= 500
   AND SUM(unitPrice) <= 10000;

Example output:

categoryName sum_unitPrice
Seafood 1834.50
Produce 742.00
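
The mapping from the `having` object to the AND-ed HAVING predicates in Q3 and Q4 could look roughly like the sketch below. It is an assumption about one possible translation, not how DAB builds its queries: `having_clause` and `HAVING_SQL` are hypothetical names, and the `?` placeholders stand in for whatever parameter style the target database uses.

```python
from typing import Any, Dict, List, Tuple

# Comparison operators from the schema and their SQL equivalents.
HAVING_SQL = {"eq": "=", "neq": "<>", "gt": ">", "gte": ">=", "lt": "<", "lte": "<="}


def having_clause(aggregate_expr: str, having: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Build a parameterized HAVING clause; all operators are AND-ed together."""
    predicates: List[str] = []
    params: List[Any] = []
    for op, value in having.items():
        if op == "in":
            if not value:
                raise ValueError("'in' requires at least one value")
            placeholders = ", ".join("?" for _ in value)
            predicates.append(f"{aggregate_expr} IN ({placeholders})")
            params.extend(value)
        elif op in HAVING_SQL:
            predicates.append(f"{aggregate_expr} {HAVING_SQL[op]} ?")
            params.append(value)
        else:
            raise ValueError(f"unknown having operator: {op}")
    if not predicates:
        return "", []
    return "HAVING " + " AND ".join(predicates), params
```

With the Q4 input, `having_clause("SUM(unitPrice)", {"gte": 500, "lte": 10000})` yields the clause `HAVING SUM(unitPrice) >= ? AND SUM(unitPrice) <= ?` with parameters `[500, 10000]`. Passing values as parameters instead of inlining them keeps the user-supplied numbers out of the SQL text.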

Q5: "How many distinct suppliers do we have?"

{
  "entity": "Product",
  "function": "count",
  "field": "supplierId",
  "distinct": true
}

SELECT COUNT(DISTINCT supplierId) AS count_supplierId
FROM Product;

Example output:

count_supplierId
29
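
To see the `distinct` flag in action, the following sketch builds the aggregate expression and runs it against an in-memory SQLite table. Everything here is illustrative: `aggregate_expr` and `count_distinct_demo` are hypothetical names, SQLite stands in for the real database, and the four sample rows are made up for the demo (they are not the data behind the output above).

```python
import sqlite3


def aggregate_expr(function: str, field: str, distinct: bool = False) -> str:
    """Render the aggregate, e.g. COUNT(*) or COUNT(DISTINCT supplierId)."""
    if function == "count" and field == "*":
        return "COUNT(*)"
    inner = f"DISTINCT {field}" if distinct else field
    return f"{function.upper()}({inner})"


def count_distinct_demo() -> int:
    """Run the Q5 query shape against a tiny made-up Product table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Product (name TEXT, supplierId INTEGER)")
    conn.executemany(
        "INSERT INTO Product VALUES (?, ?)",
        [("Chai", 1), ("Chang", 1), ("Aniseed Syrup", 2), ("Ikura", 3)],
    )
    expr = aggregate_expr("count", "supplierId", distinct=True)
    # The field name is interpolated into the SQL text, so a real tool would
    # first need to check it against the entity's known fields.
    (value,) = conn.execute(f"SELECT {expr} AS count_supplierId FROM Product").fetchone()
    conn.close()
    return value
```

Two of the four sample rows share supplier 1, so the distinct count is 3 while a plain `COUNT(supplierId)` over the same rows would be 4.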

Q6: "Which categories have exactly 5 or 10 products?"

{
  "entity": "Product",
  "function": "count",
  "field": "*",
  "groupby": ["categoryName"],
  "having": {
    "in": [5, 10]
  }
}
SELECT c...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

- Fixes Azure/data-api-builder#3178

