
Discovery Workflow

How AI assistants discover and navigate your Semaphor data using semantic-first discovery.

Before querying data, your AI assistant needs to discover what's available. Semaphor uses a semantic-first discovery approach that prioritizes business-friendly data models over raw database schemas.

The Discovery Flow

Every data question should start with semaphor_get_analysis_context, which determines the best discovery path:

1. Get Analysis Context

Call semaphor_get_analysis_context first. It returns:

  • recommendedPath — Either "semantic" or "physical"
  • recommendedNextTool — The tool to call next
  • semanticDomains — Available domains (if any)
  • fallbackConnections — Database connections (if no domains)

2a. Semantic Path (Preferred)

When semantic domains exist, the AI follows this sequence:

  1. semaphor_list_semantic_domains — Browse available domains
  2. semaphor_list_datasets — Discover datasets within a domain
  3. semaphor_get_dataset_schema — Get columns and types for a specific dataset
  4. semaphor_get_domain_relationships — Understand joins between datasets (optional, useful for multi-dataset queries)
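The sequence above could be sketched roughly as follows for a single domain whose id is already known (from step 1 or from the analysis context). The `call_tool` callable is a hypothetical stand-in for whatever your MCP client uses to invoke a tool and return parsed JSON. The `datasets` response key and the `domainId` and `datasetName` arguments follow the walkthrough on this page, while passing `domainId` to semaphor_get_domain_relationships is an assumption.

```python
from typing import Any, Callable

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed JSON response.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def discover_domain(call_tool: ToolCaller, domain_id: str,
                    include_relationships: bool = False) -> dict[str, Any]:
    """Walk the semantic path for one domain and collect what was found."""
    # Step 2: list the datasets in the domain.
    listing = call_tool("semaphor_list_datasets", {"domainId": domain_id})
    names = [d["name"] for d in listing["datasets"]]

    # Step 3: fetch each schema using the listed name, never a guessed one.
    schemas = {
        name: call_tool("semaphor_get_dataset_schema",
                        {"domainId": domain_id, "datasetName": name})
        for name in names
    }
    result: dict[str, Any] = {"datasets": names, "schemas": schemas}

    # Step 4 (optional): only needed for multi-dataset queries.
    # Assumption: the relationships tool accepts a domainId argument.
    if include_relationships:
        result["relationships"] = call_tool(
            "semaphor_get_domain_relationships", {"domainId": domain_id})
    return result
```

Taking the caller as a parameter keeps the sketch independent of any particular MCP client library and makes it easy to exercise with a stub.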

2b. Physical Path (Fallback)

When no semantic domains are configured, or the table you need is not part of any domain:

  1. semaphor_list_connections — List database connections
  2. semaphor_get_dataset_schema — Get columns using connection coordinates (connectionId, tableName, optionally databaseName and schemaName)
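To illustrate the connection coordinates, here is a minimal sketch that builds the arguments for semaphor_get_dataset_schema on the physical path. The helper name and the sample values (`conn_123`, `public`) are hypothetical. Only the four argument names come from the step above.

```python
from typing import Optional


def physical_schema_args(connection_id: str, table_name: str,
                         database_name: Optional[str] = None,
                         schema_name: Optional[str] = None) -> dict[str, str]:
    """Build semaphor_get_dataset_schema arguments from connection coordinates."""
    # connectionId and tableName are always required on this path.
    args = {"connectionId": connection_id, "tableName": table_name}
    # databaseName and schemaName are optional, so include them only when given.
    if database_name is not None:
        args["databaseName"] = database_name
    if schema_name is not None:
        args["schemaName"] = schema_name
    return args


# Hypothetical coordinates, for illustration only.
print(physical_schema_args("conn_123", "orders", schema_name="public"))
```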

Semantic vs Physical Discovery

| Aspect | Semantic Path | Physical Path |
| --- | --- | --- |
| Discovery tools | list_semantic_domains then list_datasets | list_connections |
| Schema tool | get_dataset_schema with domainId + datasetName | get_dataset_schema with connectionId + tableName |
| Join information | get_domain_relationships (automatic) | Manual (you write the JOINs) |
| Field labels | Business-friendly names and descriptions | Raw column names |
| Best for | Business questions and governed analytics | Ad-hoc SQL analysis |

When to Use Each

Use the semantic path when:

  • The project has semantic domains configured
  • You're answering business questions ("How many orders last month?")
  • You need auto-join relationships between datasets
  • You want business-friendly field names and descriptions

Use the physical path when:

  • No semantic domains exist in the project
  • You need to query a table not represented in any domain
  • You're doing advanced SQL analysis requiring direct table access

Example: Full Discovery Session

Discovery walkthrough
You: "What data can I explore?"

AI calls: semaphor_get_analysis_context
Response: {
  "recommendedPath": "semantic",
  "recommendedNextTool": "semaphor_list_semantic_domains",
  "semanticDomains": [
    { "id": "dom_abc", "name": "Sales Analytics", "datasetCount": 4 },
    { "id": "dom_def", "name": "Marketing Data", "datasetCount": 2 }
  ]
}

AI: "You have 2 data domains: Sales Analytics (4 datasets) and Marketing Data (2 datasets)."

You: "What's in Sales Analytics?"

AI calls: semaphor_list_datasets with domainId "dom_abc"
Response: {
  "datasets": [
    { "name": "orders", "label": "Orders", "description": "Customer orders" },
    { "name": "customers", "label": "Customers", "description": "Customer profiles" },
    { "name": "products", "label": "Products", "description": "Product catalog" },
    { "name": "regions", "label": "Regions", "description": "Geographic regions" }
  ]
}

AI: "Sales Analytics has 4 datasets: Orders, Customers, Products, and Regions."

You: "What fields does Orders have?"

AI calls: semaphor_get_dataset_schema with domainId "dom_abc", datasetName "orders"
Response: {
  "columns": [
    { "name": "order_id", "label": "Order ID", "type": "string", "fieldType": "dimension" },
    { "name": "order_date", "label": "Order Date", "type": "date", "fieldType": "dimension" },
    { "name": "total_amount", "label": "Total Amount", "type": "number", "fieldType": "measure" },
    { "name": "status", "label": "Status", "type": "string", "fieldType": "dimension" },
    { "name": "region", "label": "Region", "type": "string", "fieldType": "dimension" }
  ]
}

AI: "The Orders dataset has: Order ID, Order Date, Total Amount, Status, and Region."
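Once a schema response like the one above is in hand, a client may want to separate dimensions from measures before building a query. The following is a minimal sketch that assumes the response shape shown in the walkthrough. `split_fields` is a hypothetical helper, not part of Semaphor.

```python
from typing import Any

# Schema response copied from the walkthrough above.
schema = {
    "columns": [
        {"name": "order_id", "label": "Order ID", "type": "string", "fieldType": "dimension"},
        {"name": "order_date", "label": "Order Date", "type": "date", "fieldType": "dimension"},
        {"name": "total_amount", "label": "Total Amount", "type": "number", "fieldType": "measure"},
        {"name": "status", "label": "Status", "type": "string", "fieldType": "dimension"},
        {"name": "region", "label": "Region", "type": "string", "fieldType": "dimension"},
    ]
}


def split_fields(schema: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Return (dimension labels, measure labels) in schema order."""
    columns = schema["columns"]
    dimensions = [c["label"] for c in columns if c["fieldType"] == "dimension"]
    measures = [c["label"] for c in columns if c["fieldType"] == "measure"]
    return dimensions, measures


dimensions, measures = split_fields(schema)
print("Dimensions:", ", ".join(dimensions))  # Order ID, Order Date, Status, Region
print("Measures:", ", ".join(measures))      # Total Amount
```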

Tips

  • Prefer the semantic path — Semantic domains provide curated, governed data with business-friendly labels. They're the recommended starting point.
  • Always list before querying — Call semaphor_list_datasets before semaphor_get_dataset_schema to avoid guessing dataset names.
  • Use domain relationships — When joining multiple datasets, semaphor_get_domain_relationships provides the correct join conditions automatically.
  • Cross-reference with dashboards — Call semaphor_list_dashboards and semaphor_get_dashboard_details to understand how data is already being visualized.
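The "list before querying" tip can be sketched as a small guard that resolves a user-supplied name against the semaphor_list_datasets response before any schema call is made. The helper is hypothetical, and matching on either `name` or `label` without regard to case is an illustrative choice, not Semaphor behavior.

```python
from typing import Any


def resolve_dataset_name(listing: dict[str, Any], requested: str) -> str:
    """Map a user-supplied name or label to a listed dataset name."""
    wanted = requested.strip().lower()
    for dataset in listing["datasets"]:
        # Accept either the machine name or the business label.
        candidates = {dataset["name"].lower()}
        if dataset.get("label"):
            candidates.add(dataset["label"].lower())
        if wanted in candidates:
            return dataset["name"]
    known = ", ".join(d["name"] for d in listing["datasets"])
    raise ValueError(f"Unknown dataset {requested!r}; available: {known}")


listing = {
    "datasets": [
        {"name": "orders", "label": "Orders"},
        {"name": "customers", "label": "Customers"},
    ]
}
print(resolve_dataset_name(listing, "Orders"))  # orders
```

Raising an error that lists the available names gives the assistant something concrete to report instead of calling semaphor_get_dataset_schema with a guess.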

For more on semantic domains, see Semantic Domains.
