Querying Data
Execute governed analytics and advanced SQL through Semaphor MCP.
Semaphor MCP provides two query tools:
semaphor_analyzefor governed semantic analytics.semaphor_query_sql_advancedas the raw-SQL escape hatch when the semantic tool cannot express the analysis.
Start every data question with semaphor_get_analysis_context. If the response recommends the semantic path, discover domains and datasets first, then use semaphor_analyze.
Governed Analytics
semaphor_analyze is the default tool for ordinary BI questions: metric by dimension, top-N, time series, filters, latest-available windows, period comparisons, period-change driver analysis, and relationship-aware joins.
Best for: Business analytics that should use curated semantic domains, governed joins, modeled metric aggregates, date-window resolution, and typed execution diagnostics.
Key characteristics:
- Uses semantic domain datasets and source-bearing field references.
- Respects security policies.
- Auto-resolves safe joins through semantic relationships.
- Returns generated SQL, records, column metadata, relationship diagnostics, and a typed execution result.
- Uses each semantic metric's authored
aggregateby default. - Allows a query-specific metric aggregate override when the user asks for a different rollup.
{
"domainId": "dom_abc",
"datasetName": "fact_purchase_line",
"metrics": [
{ "datasetName": "fact_purchase_line", "name": "net_purchase_value" }
],
"primaryMetric": {
"datasetName": "fact_purchase_line",
"name": "net_purchase_value"
},
"dimensions": [
{ "datasetName": "dim_supplier", "name": "supplier_name" }
],
"limit": 10,
"response_format": "json"
}If net_purchase_value is modeled with aggregate: "SUM", Semaphor generates a summed top-N query. The semantic model owns that default.
Aggregate Overrides
Sometimes the same metric can be analyzed with a different aggregate for one question. For example, total margin and average margin are both valid questions.
{
"domainId": "dom_abc",
"datasetName": "fact_sales_line",
"metrics": [
{
"datasetName": "fact_sales_line",
"name": "gross_margin_value",
"aggregate": "AVG"
}
],
"primaryMetric": {
"datasetName": "fact_sales_line",
"name": "gross_margin_value",
"aggregate": "AVG"
},
"dimensions": [
{ "datasetName": "dim_material", "name": "material_family" }
],
"limit": 10,
"response_format": "json"
}The override applies only to this MCP call. The response marks the override in execution metadata so clients can audit that the query used AVG instead of the semantic default.
Omit aggregate to use the semantic model. Add aggregate only when the user explicitly asks for a different rollup such as average, maximum, or distinct count.
Advanced SQL
semaphor_query_sql_advanced executes raw SQL. Use it only when the request is SQL-first or cannot be represented with semaphor_analyze.
Best for: Custom CTEs, window functions, unsupported semantic relationships, fields not exposed in the semantic domain, or exploratory SQL debugging.
Key characteristics:
- Pass a SQL string directly.
- Respects security policies.
- Optional Python post-processing is available when supported by the client/tool configuration.
- Requires explicit connection context.
- Does not provide semantic relationship diagnostics or typed analytic intent coverage.
{
"connectionId": "conn_abc123",
"sql": "WITH monthly AS (SELECT date_trunc('month', order_date) AS month, SUM(total_amount) AS revenue FROM orders GROUP BY 1) SELECT month, revenue, revenue - LAG(revenue) OVER (ORDER BY month) AS delta FROM monthly ORDER BY month"
}Choosing Between Tools
semaphor_analyze | semaphor_query_sql_advanced | |
|---|---|---|
| Default choice | Yes | No |
| Query format | Semantic fields and analytics intent | Raw SQL |
| Semantic joins | Automatic and governed | Manual |
| Metric aggregates | From semantic model, with explicit override support | Whatever SQL says |
| Diagnostics | Relationship, fan-out, validation, date-window, and execution metadata | SQL execution result |
| Best for | Governed BI questions | Advanced SQL-first analysis |
Response Format
Both query tools accept response_format:
"json"-- machine-readable payload with records, metadata, diagnostics, and generated SQL where available."markdown"-- human-readable summary for conversational answers.
Use "json" when building programmatic integrations or evals. Use "markdown" when the AI is presenting results directly to a user.
Tips
- Prefer semantic discovery first -- Use
semaphor_get_analysis_context, thensemaphor_list_semantic_domains,semaphor_list_datasets, andsemaphor_get_dataset_schema. - Use source-bearing refs -- When selecting fields from joined datasets, include
datasetNameon each field ref. - Let the semantic model choose aggregates -- Omit
aggregateunless the question asks for a different rollup. - Use SQL as an escape hatch -- Raw SQL is valid, but it should not be the first choice for normal governed analytics.