Agent Guide

Build semantic domain templates programmatically with complete schema reference and validation rules

This guide is for AI agents and automation scripts that generate domain templates as JSON. It provides the complete schema specification, construction rules, and validation checklist needed to produce valid templates programmatically.

Approach

Domain templates can be applied in two ways:

Code Editor -- Generate JSON, paste into the Domain Editor's Code Editor tab
Management API -- POST /api/management/v1/domains with the template in the request body

Both paths run the same Zod-based validation. A template that passes validation in one path will pass in the other.
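
For the Management API path, the request can be prepared as in the sketch below. Only the path /api/management/v1/domains comes from this guide. The base URL, the bearer-token header, and the choice to send the template as the whole JSON body are assumptions for illustration, so check them against your deployment before use.

```python
import json
import urllib.request

# Hypothetical values: replace with your real host and credentials.
BASE_URL = "https://example.invalid"
API_TOKEN = "replace-me"


def build_domain_request(template: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST carrying the template as the JSON body."""
    body = json.dumps(template).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/management/v1/domains",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",  # assumed auth scheme
        },
    )


request = build_domain_request({"schemaVersion": "2.0", "datasets": []})
# To actually submit: urllib.request.urlopen(request)
```

Building the request separately from sending it lets an agent log or inspect the payload before anything reaches the server.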

Template JSON Schema

Top-Level Structure

{
  "schemaVersion": "2.0",
  "datasets": [],
  "relationships": [],
  "calculatedMetrics": {},
  "calculatedDimensions": {},
  "grainMappings": []
}

Field	Type	Required	Default	Description
`schemaVersion`	`"2.0"`	No	`"2.0"`	Schema version (always "2.0")
`datasets`	`SemanticDataset[]`	Yes	`[]`	Array of dataset definitions
`relationships`	`Relationship[]`	No	`[]`	Array of join definitions
`calculatedMetrics`	`Record<string, CalculatedMetric>`	No	`{}`	Named calculated metric definitions
`calculatedDimensions`	`Record<string, CalculatedDimension>`	No	`{}`	Named calculated dimension definitions
`grainMappings`	`GrainMapping[]`	No	`[]`	Cross-dataset dimension alignment

SemanticDataset

{
  "name": "orders",
  "label": "Orders",
  "description": "Customer order transactions",
  "type": "physical",
  "connectionId": "conn_abc123",
  "connectionType": "PostgreSQL",
  "catalog": null,
  "schema": "public",
  "table": "orders",
  "sql": null,
  "datamodelId": null,
  "primaryKey": ["order_id"],
  "fields": {
    "identifiers": [],
    "dimensions": [],
    "metrics": []
  }
}

Field	Type	Required	Constraints
`name`	string	Yes	Unique within domain
`label`	string	Yes	Display name
`description`	string	No
`type`	enum	Yes	`"physical"` or `"virtual"`
`connectionId`	string	Yes	Must match an existing connection
`connectionType`	enum	Yes	See connection types below
`catalog`	string	No	For BigQuery/Snowflake
`schema`	string	No	Database schema
`table`	string	Conditional	Required if `type: "physical"`
`sql`	string	Conditional	Required if `type: "virtual"`
`datamodelId`	string	No	Reference to a DataModel
`primaryKey`	string[]	No	Defaults to `[]`. Set for fan-out detection
`fields`	object	Yes	Contains `identifiers`, `dimensions`, `metrics`

Connection types: "PostgreSQL", "MySQL", "MSSQL", "BigQuery", "Redshift", "Snowflake", "clickhouse", "S3", "S3Tables", "GoogleSheets", "FileUpload", "API", "none"

IdentifierField

{
  "name": "order_id",
  "label": "Order ID",
  "description": "Unique order identifier",
  "dataType": "number",
  "type": "primary",
  "hidden": false,
  "tags": ["key"]
}

Field	Type	Required	Constraints
`name`	string	Yes	Column name
`label`	string	No	Display name
`description`	string	No
`dataType`	string	Yes	`"string"`, `"number"`, `"boolean"`, `"date"`, `"datetime"`, `"json"`, `"geo"`
`type`	enum	Yes	`"primary"` or `"foreign"`
`hidden`	boolean	No
`tags`	string[]	No

DimensionField

{
  "name": "order_date",
  "label": "Order Date",
  "dataType": "date",
  "granularity": "day",
  "isDisplayField": false,
  "dateFormat": null,
  "customFormat": null,
  "hidden": false,
  "tags": []
}

Field	Type	Required	Constraints
`name`	string	Yes	Column name
`label`	string	No	Display name
`dataType`	string	Yes	See data types above
`granularity`	enum	No	`"day"`, `"week"`, `"month"`, `"quarter"`, `"year"`, `"hour"`, `"minute"`, `"second"`
`isDisplayField`	boolean	No	Default display label for dataset
`dateFormat`	string	No	Display format for dates
`customFormat`	string	No	Custom format string
`hidden`	boolean	No
`tags`	string[]	No

MetricField

{
  "name": "amount",
  "label": "Revenue",
  "dataType": "number",
  "sourceField": "amount",
  "expression": null,
  "format": null,
  "unit": "USD",
  "isDefault": false,
  "hidden": false,
  "tags": []
}

Field	Type	Required	Constraints
`name`	string	Yes	Metric name
`label`	string	No	Display name
`dataType`	string	Yes	Usually `"number"`
`sourceField`	string	Conditional	Physical column. Required if no `expression`
`expression`	string	Conditional	SQL expression. Required if no `sourceField`
`format`	string	No	Display format hint
`unit`	string	No	Unit label
`isDefault`	boolean	No	Mark as default metric
`hidden`	boolean	No
`filters`	array	No	Default metric filters
`tags`	string[]	No

Relationship

{
  "id": "rel_001",
  "name": "orders_to_customers",
  "sourceDataset": "orders",
  "sourceFields": ["customer_id"],
  "targetDataset": "customers",
  "targetFields": ["id"],
  "cardinality": "many_to_one",
  "defaultJoinType": "LEFT",
  "isAutoJoin": true,
  "joinPriority": 1,
  "discoveredBy": "user",
  "confidence": "high",
  "isActive": true,
  "description": "Orders belong to customers"
}

Field	Type	Required	Constraints
`id`	string	No	Auto-generated if omitted
`name`	string	Yes	Unique name
`sourceDataset`	string	Yes	Must exist in `datasets`
`sourceFields`	string[]	Yes	Min 1 field
`targetDataset`	string	Yes	Must exist in `datasets`
`targetFields`	string[]	Yes	Min 1 field, same length as `sourceFields`
`cardinality`	enum	No	`"one_to_one"` (default), `"one_to_many"`, `"many_to_one"`, `"many_to_many"`
`defaultJoinType`	enum	Yes	`"INNER"`, `"LEFT"`, `"RIGHT"`, `"FULL"`
`isAutoJoin`	boolean	Yes	Must be `true` for automatic join resolution
`joinPriority`	number	No	Higher = preferred when multiple paths exist
`discoveredBy`	enum	Yes	`"user"`, `"auto"`, `"fk_constraint"`, `"ai"`
`confidence`	enum	No	`"high"`, `"medium"`, `"low"`
`isActive`	boolean	Yes	Set `false` to disable
`description`	string	No

CalculatedMetric

{
  "label": "Average Order Value",
  "description": "Revenue per order",
  "expression": "{total_revenue} / NULLIF({order_count}, 0)",
  "inputs": {
    "total_revenue": { "dataset": "orders", "field": "amount", "aggregate": "SUM" },
    "order_count": { "dataset": "orders", "field": "order_id", "aggregate": "COUNT" }
  },
  "metricType": "derived",
  "aggregationStrategy": "default",
  "format": { "type": "currency", "decimals": 2, "currency": "USD" },
  "tags": ["kpi"]
}

Field	Type	Required	Constraints
`label`	string	Yes	Display name
`description`	string	No
`expression`	string	Yes	Min 1 character. Uses `{input_name}` tokens
`inputs`	object	Yes	Map of name to `ColumnInputReference` or `MetricInputReference`
`metricType`	enum	No	`"base"`, `"derived"` (default), `"calculated"`
`aggregationStrategy`	enum	No	`"default"`, `"symmetric_aggregate"`, `"aggregate_then_join"`, `"weighted"`
`format`	FormatSpec	No	See below
`tags`	string[]	No
`filters`	array	No	Metric-level filters

Input types:

Input Type	Shape	Use Case
Column reference	`{ "dataset": "...", "field": "...", "aggregate": "SUM" }`	Points to a physical column
Metric reference	`{ "metric": "..." }`	Points to another calculated metric

Aggregate values: "SUM", "COUNT", "AVG", "MIN", "MAX", "MEDIAN", "DISTINCT"

CalculatedDimension

{
  "label": "Order Month",
  "description": "Monthly time grouping",
  "expression": "DATE_TRUNC('month', {order_date})",
  "inputs": {
    "order_date": { "dataset": "orders", "field": "order_date" }
  },
  "dataType": "date"
}

Field	Type	Required	Constraints
`label`	string	Yes	Display name
`description`	string	No
`expression`	string	Yes	SQL expression with `{input_name}` tokens
`inputs`	object	Yes	Map of name to `ColumnInputReference` or `DimensionInputReference`
`dataType`	enum	No	`"string"`, `"number"`, `"boolean"`, `"date"`, `"datetime"`
`grainMapping`	object	No	Map of dataset name to column name

Input types:

Input Type	Shape	Use Case
Column reference	`{ "dataset": "...", "field": "..." }`	Points to a physical column (no aggregate)
Dimension reference	`{ "dimension": "..." }`	Points to another calculated dimension

FormatSpec

{
  "type": "currency",
  "decimals": 2,
  "currency": "USD",
  "prefix": "$",
  "suffix": "",
  "thousandsSeparator": true
}

Field	Type	Required	Constraints
`type`	enum	Yes	`"number"`, `"currency"`, `"percentage"`, `"date"`, `"string"`
`decimals`	integer	No	0-10
`currency`	string	No	Currency code (e.g., `"USD"`)
`prefix`	string	No
`suffix`	string	No
`thousandsSeparator`	boolean	No	Default `true`

GrainMapping

{
  "sourceDimension": "order_month",
  "targetDataset": "returns",
  "targetColumn": "return_date"
}

Field	Type	Required	Description
`sourceDimension`	string	Yes	Name of a calculated dimension
`targetDataset`	string	Yes	Dataset to map the dimension to
`targetColumn`	string	Yes	Column in the target dataset
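
A grain mapping ties together three names defined elsewhere in the template, so a cross-check before submission can catch typos. The helper below is a rough sketch and not the platform's validator. It assumes the target column is declared as a field of the target dataset, which the schema above does not strictly require.

```python
def check_grain_mapping(mapping: dict, template: dict) -> list[str]:
    """Return human-readable problems found in one grainMappings entry."""
    problems = []
    if mapping["sourceDimension"] not in template.get("calculatedDimensions", {}):
        problems.append(f"unknown calculated dimension: {mapping['sourceDimension']}")

    datasets = {d["name"]: d for d in template.get("datasets", [])}
    target = datasets.get(mapping["targetDataset"])
    if target is None:
        problems.append(f"unknown dataset: {mapping['targetDataset']}")
        return problems

    # Collect names across identifiers, dimensions, and metrics.
    declared = {
        field["name"]
        for group in target.get("fields", {}).values()
        for field in group
    }
    if mapping["targetColumn"] not in declared:
        problems.append(f"column not declared as a field: {mapping['targetColumn']}")
    return problems
```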

Construction Rules

Follow these rules to produce valid templates:

Naming

Dataset name: unique within the domain, use snake_case
Calculated metric/dimension keys: unique, use snake_case
Field name: must match the physical column name in the database
Relationship name: descriptive (e.g., orders_to_customers)

Required Field Combinations

Physical datasets must have table
Virtual datasets must have sql
Metrics must have at least one of sourceField or expression
Identifiers must have type set to "primary" or "foreign"
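
The combinations above can be pre-checked on each dataset before submission. The function below is a minimal sketch of such a check. Its name and message wording are illustrative and do not reproduce the platform's own validation output.

```python
def check_required_combinations(dataset: dict) -> list[str]:
    """Return problems for one dataset, based on the required field combinations."""
    problems = []
    name = dataset.get("name", "<unnamed>")
    kind = dataset.get("type")

    if kind == "physical" and not dataset.get("table"):
        problems.append(f"{name}: physical dataset needs 'table'")
    if kind == "virtual" and not dataset.get("sql"):
        problems.append(f"{name}: virtual dataset needs 'sql'")

    fields = dataset.get("fields", {})
    for metric in fields.get("metrics", []):
        # At least one of sourceField / expression must be set.
        if not metric.get("sourceField") and not metric.get("expression"):
            problems.append(f"{name}.{metric.get('name')}: needs sourceField or expression")
    for identifier in fields.get("identifiers", []):
        if identifier.get("type") not in ("primary", "foreign"):
            problems.append(f"{name}.{identifier.get('name')}: type must be primary or foreign")
    return problems
```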

Token Rules

Token names in expressions ({name}) must exactly match keys in the inputs map
Token names are case-sensitive
Every token in the expression must have a corresponding input
Every input should be referenced by at least one token (unused inputs produce warning E006)
No braces {} inside SQL string literals (parser limitation)
No escape sequences like {{ or }}
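
A simple way to apply the token rules is to extract the {name} tokens and compare them with the keys of inputs. The sketch below assumes a token is any run of characters between a single pair of braces. The real parser may be stricter, so treat the regular expression as an approximation.

```python
import re

# Assumed token syntax: anything between one "{" and the next "}".
TOKEN_PATTERN = re.compile(r"\{([^{}]+)\}")


def check_tokens(expression: str, inputs: dict) -> tuple[list[str], list[str]]:
    """Return (tokens without an input, inputs without a token).

    The first list corresponds to error E005, the second to warning E006.
    Comparison is case-sensitive, as the token rules require.
    """
    tokens = set(TOKEN_PATTERN.findall(expression))
    missing_inputs = sorted(tokens - set(inputs))
    unused_inputs = sorted(set(inputs) - tokens)
    return missing_inputs, unused_inputs
```

For example, check_tokens("SUM({Amount})", {"amount": {...}}) reports "Amount" as a token with no input and "amount" as an unused input, because the names differ in case.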

Aggregation Rules

If the expression wraps a token in an aggregate (SUM({amount})), the input does not need aggregate
If the expression uses a bare token ({total_revenue} / {order_count}), each input must specify aggregate
Metric references ({ "metric": "..." }) never need aggregate -- the referenced metric provides its own
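
The aggregation rules can be approximated by finding tokens that sit outside every aggregate call and checking that their inputs carry an aggregate. The sketch below is not the platform's parser. It treats the names in the guide's aggregate list as function names, tracks parentheses with a stack, and assumes well-formed expressions with balanced braces.

```python
import re

# Function names taken from the guide's list of aggregate values.
AGGREGATES = {"SUM", "COUNT", "AVG", "MIN", "MAX", "MEDIAN", "DISTINCT"}


def bare_tokens(expression: str) -> set[str]:
    """Return token names that appear outside any aggregate function call."""
    bare = set()
    stack = []  # one flag per open parenthesis: True if opened by an aggregate
    i = 0
    while i < len(expression):
        ch = expression[i]
        if ch == "(":
            word = re.search(r"(\w+)\s*$", expression[:i])
            stack.append(bool(word) and word.group(1).upper() in AGGREGATES)
        elif ch == ")" and stack:
            stack.pop()
        elif ch == "{":
            end = expression.index("}", i)
            if not any(stack):
                bare.add(expression[i + 1:end])
            i = end
        i += 1
    return bare


def missing_aggregates(expression: str, inputs: dict) -> list[str]:
    """Column inputs used as bare tokens without 'aggregate' (compare E010)."""
    return sorted(
        name
        for name in bare_tokens(expression)
        if name in inputs
        and "metric" not in inputs[name]  # metric references never need one
        and not inputs[name].get("aggregate")
    )
```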

Reference Rules

Calculated metrics can reference: column references and other metric references
Calculated dimensions can reference: column references and other dimension references
Calculated dimensions cannot reference metrics (error E007)
Maximum nesting depth: 10 levels
No circular references (error E001)
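
Circular references and nesting depth can both be checked with one recursive walk over metric references. The sketch below covers calculated metrics only. How the platform counts depth is not specified here, so counting a metric with no metric inputs as depth 1 is an assumption.

```python
MAX_DEPTH = 10  # from the reference rules above


def reference_depth(name: str, metrics: dict, _path: tuple = ()) -> int:
    """Return the nesting depth of a calculated metric, raising on a cycle."""
    if name in _path:
        chain = " -> ".join(_path + (name,))
        raise ValueError(f"E001 circular reference: {chain}")
    if name not in metrics:
        raise KeyError(f"E002 unknown metric: {name}")
    refs = [
        ref["metric"]
        for ref in metrics[name].get("inputs", {}).values()
        if "metric" in ref
    ]
    if not refs:
        return 1  # assumed convention: column-only metrics have depth 1
    return 1 + max(reference_depth(ref, metrics, _path + (name,)) for ref in refs)


def within_depth_limit(name: str, metrics: dict) -> bool:
    """True if the metric's reference chain stays within MAX_DEPTH levels."""
    return reference_depth(name, metrics) <= MAX_DEPTH
```

The same walk applies to calculated dimensions by following "dimension" references instead of "metric" references.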

Relationship Rules

sourceDataset and targetDataset must exist in datasets
sourceFields and targetFields must have the same length
Set isAutoJoin: true for fields to be auto-joinable in the explorer
Set isActive: true for the relationship to be used
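
The structural relationship rules can be checked against the set of dataset names before submission. The function below is an illustrative sketch whose messages are not the platform's own.

```python
def check_relationship(rel: dict, dataset_names: set[str]) -> list[str]:
    """Return problems for one relationship, based on the rules above."""
    problems = []
    for key in ("sourceDataset", "targetDataset"):
        if rel.get(key) not in dataset_names:
            problems.append(f"{key} '{rel.get(key)}' is not in datasets")

    source_fields = rel.get("sourceFields", [])
    target_fields = rel.get("targetFields", [])
    if not source_fields or not target_fields:
        problems.append("sourceFields and targetFields need at least one field each")
    elif len(source_fields) != len(target_fields):
        problems.append("sourceFields and targetFields differ in length")
    return problems
```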

Validation Checklist

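Before submitting a template, verify it against the construction rules above and confirm that validation reports none of the error codes below.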

Validation Error Codes

Code	Severity	Cause
E001	Error	Circular dependency in metric/dimension references
E002	Error	Input references unknown metric name
E003	Error	Input references unknown field in dataset
E004	Error	Input references unknown dataset
E005	Error	Token in expression not found in inputs
E006	Warning	Input defined but not used in expression
E007	Error	Calculated dimension references a metric
E008	Error	Invalid filter syntax on metric
E009	Error	Invalid expression (braces in string literals)
E010	Error	Input requires aggregate but none specified
E011	Error	Input references unknown calculated dimension
W001	Warning	Dataset missing primary key (fan-out detection disabled)
W002	Warning	Relationship may cause metric inflation

Generating Templates from Database Metadata

Here's a step-by-step algorithm for AI agents generating a domain from a database schema:

Step 1: Map Tables to Datasets

For each table in the target schema:

{
  "name": "<table_name_snake_case>",
  "label": "<Human Readable Name>",
  "type": "physical",
  "connectionId": "<connection_id>",
  "connectionType": "<database_type>",
  "schema": "<schema_name>",
  "table": "<table_name>",
  "primaryKey": ["<primary_key_column>"],
  "fields": { "identifiers": [], "dimensions": [], "metrics": [] }
}
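
Step 1 can be sketched as a small function that fills this skeleton from table metadata. The function name, its parameters, and the snake_case and label heuristics are assumptions for illustration. Adjust them to however your metadata source names things.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert names like 'OrderItems' or 'order-items' to 'order_items'."""
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    name = re.sub(r"[^A-Za-z0-9]+", "_", name)
    return name.strip("_").lower()


def table_to_dataset(table: str, schema: str, connection_id: str,
                     connection_type: str, primary_key: list[str]) -> dict:
    """Build a physical dataset skeleton with empty field groups."""
    dataset_name = to_snake_case(table)
    return {
        "name": dataset_name,
        "label": dataset_name.replace("_", " ").title(),
        "type": "physical",
        "connectionId": connection_id,
        "connectionType": connection_type,
        "schema": schema,
        "table": table,  # keep the physical table name unchanged
        "primaryKey": list(primary_key),
        "fields": {"identifiers": [], "dimensions": [], "metrics": []},
    }
```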

Step 2: Classify Columns into Fields

For each column in each table:

Column Type	Classification	Rule
Primary key	Identifier (`type: "primary"`)	Always
Foreign key	Identifier (`type: "foreign"`)	Always
Numeric, non-key	Metric	Set `sourceField` to column name
Date/DateTime	Dimension	Set `granularity: "day"` for dates
String/Boolean	Dimension	Default classification
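
The classification table translates into a short decision function. The sketch below assumes a hypothetical column metadata shape (name, sql_type, is_primary_key, is_foreign_key) and a small, incomplete mapping from SQL type names to dataType values, both of which you would adapt to your database driver.

```python
# Assumed SQL type names; extend for your database.
NUMERIC_TYPES = {"integer", "bigint", "smallint", "numeric", "decimal", "real", "double precision"}
DATE_TYPES = {"date": "date", "timestamp": "datetime"}


def classify_column(column: dict) -> tuple[str, dict]:
    """Return (field group, field definition) for one column."""
    sql_type = column["sql_type"].lower()
    if sql_type in NUMERIC_TYPES:
        data_type = "number"
    elif sql_type in DATE_TYPES:
        data_type = DATE_TYPES[sql_type]
    elif sql_type == "boolean":
        data_type = "boolean"
    else:
        data_type = "string"

    field = {"name": column["name"], "dataType": data_type}
    # Keys take precedence over every other rule.
    if column.get("is_primary_key"):
        return "identifiers", {**field, "type": "primary"}
    if column.get("is_foreign_key"):
        return "identifiers", {**field, "type": "foreign"}
    if data_type == "number":
        return "metrics", {**field, "sourceField": column["name"]}
    if data_type in ("date", "datetime"):
        return "dimensions", {**field, "granularity": "day"}
    return "dimensions", field
```

Each result can be appended to the matching list in the dataset's fields object, for example dataset["fields"][group].append(field).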

Step 3: Detect Relationships

For each foreign key constraint:

{
  "name": "<source_table>_to_<target_table>",
  "sourceDataset": "<source_table>",
  "sourceFields": ["<fk_column>"],
  "targetDataset": "<target_table>",
  "targetFields": ["<pk_column>"],
  "cardinality": "many_to_one",
  "defaultJoinType": "LEFT",
  "isAutoJoin": true,
  "discoveredBy": "auto",
  "isActive": true
}
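
Step 3 can be sketched as a mapping from one foreign key constraint to one relationship. The input shape (source_table, source_columns, target_table, target_columns) is hypothetical. The fixed values mirror the skeleton above.

```python
def fk_to_relationship(fk: dict) -> dict:
    """Build a relationship from one foreign key constraint description."""
    if len(fk["source_columns"]) != len(fk["target_columns"]):
        raise ValueError("foreign key column lists must have the same length")
    return {
        "name": f"{fk['source_table']}_to_{fk['target_table']}",
        "sourceDataset": fk["source_table"],
        "sourceFields": list(fk["source_columns"]),
        "targetDataset": fk["target_table"],
        "targetFields": list(fk["target_columns"]),
        "cardinality": "many_to_one",
        "defaultJoinType": "LEFT",
        "isAutoJoin": True,
        "discoveredBy": "auto",
        "isActive": True,
    }
```

If the dataset names were converted to snake_case in Step 1, pass the converted names here so that sourceDataset and targetDataset match entries in datasets.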

Step 4: Generate Common KPIs

Based on the metric fields, create standard calculated metrics:

{
  "total_<metric_name>": {
    "label": "Total <Metric Label>",
    "expression": "SUM({<metric_name>})",
    "inputs": {
      "<metric_name>": { "dataset": "<dataset>", "field": "<field>" }
    }
  }
}
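
Step 4 can be sketched as a loop over a dataset's metric fields that emits one total_<metric_name> entry per metric backed by a physical column. The function name is illustrative. Because the expression wraps the token in SUM, the input carries no aggregate, in line with the aggregation rules.

```python
def total_metrics(dataset: dict) -> dict:
    """Generate 'total_<name>' calculated metrics for column-backed metrics."""
    generated = {}
    for metric in dataset["fields"]["metrics"]:
        source_field = metric.get("sourceField")
        if not source_field:
            continue  # expression-only metrics have no column to sum
        token = metric["name"]
        generated[f"total_{token}"] = {
            "label": f"Total {metric.get('label', token)}",
            "expression": f"SUM({{{token}}})",  # renders as SUM({token})
            "inputs": {
                token: {"dataset": dataset["name"], "field": source_field}
            },
        }
    return generated
```

Keys must be unique across the whole template, so two datasets that both define a metric named amount would collide on total_amount. Prefixing the key with the dataset name is one way to avoid that.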

Step 5: Validate

Run the template through validation and fix any errors before submission.

Complete JSON Example

A complete template for an e-commerce analytics domain. Note that the returns dataset leaves primaryKey empty, which matches the condition for warning W001:

{
  "schemaVersion": "2.0",
  "datasets": [
    {
      "name": "orders",
      "label": "Orders",
      "type": "physical",
      "connectionId": "conn_1",
      "connectionType": "PostgreSQL",
      "schema": "public",
      "table": "orders",
      "primaryKey": ["order_id"],
      "fields": {
        "identifiers": [
          { "name": "order_id", "dataType": "number", "type": "primary", "label": "Order ID" },
          { "name": "customer_id", "dataType": "number", "type": "foreign", "label": "Customer ID" }
        ],
        "dimensions": [
          { "name": "order_date", "dataType": "date", "label": "Order Date", "granularity": "day" },
          { "name": "status", "dataType": "string", "label": "Status" }
        ],
        "metrics": [
          { "name": "amount", "dataType": "number", "sourceField": "amount", "label": "Revenue" },
          { "name": "cost", "dataType": "number", "sourceField": "cost", "label": "Cost" }
        ]
      }
    },
    {
      "name": "customers",
      "label": "Customers",
      "type": "physical",
      "connectionId": "conn_1",
      "connectionType": "PostgreSQL",
      "schema": "public",
      "table": "customers",
      "primaryKey": ["id"],
      "fields": {
        "identifiers": [
          { "name": "id", "dataType": "number", "type": "primary", "label": "Customer ID" }
        ],
        "dimensions": [
          { "name": "segment", "dataType": "string", "label": "Customer Segment" },
          { "name": "region", "dataType": "string", "label": "Region" }
        ],
        "metrics": []
      }
    },
    {
      "name": "returns",
      "label": "Returns",
      "type": "physical",
      "connectionId": "conn_1",
      "connectionType": "PostgreSQL",
      "schema": "public",
      "table": "returns",
      "primaryKey": [],
      "fields": {
        "identifiers": [],
        "dimensions": [
          { "name": "return_date", "dataType": "date", "label": "Return Date" }
        ],
        "metrics": [
          { "name": "amount", "dataType": "number", "sourceField": "amount", "label": "Return Amount" }
        ]
      }
    }
  ],
  "relationships": [
    {
      "name": "orders_to_customers",
      "sourceDataset": "orders",
      "sourceFields": ["customer_id"],
      "targetDataset": "customers",
      "targetFields": ["id"],
      "cardinality": "many_to_one",
      "defaultJoinType": "LEFT",
      "isAutoJoin": true,
      "discoveredBy": "user",
      "isActive": true
    }
  ],
  "calculatedMetrics": {
    "revenue": {
      "label": "Total Revenue",
      "expression": "SUM({amount})",
      "inputs": {
        "amount": { "dataset": "orders", "field": "amount" }
      }
    },
    "cost": {
      "label": "Total Cost",
      "expression": "SUM({cost_amount})",
      "inputs": {
        "cost_amount": { "dataset": "orders", "field": "cost" }
      }
    },
    "profit": {
      "label": "Profit",
      "expression": "{revenue} - {cost}",
      "inputs": {
        "revenue": { "metric": "revenue" },
        "cost": { "metric": "cost" }
      }
    },
    "profit_margin": {
      "label": "Profit Margin %",
      "expression": "{profit} / NULLIF({revenue}, 0) * 100",
      "inputs": {
        "profit": { "metric": "profit" },
        "revenue": { "metric": "revenue" }
      },
      "format": { "type": "percentage", "decimals": 1 }
    },
    "net_revenue": {
      "label": "Net Revenue",
      "description": "Revenue minus returns",
      "expression": "{total_sales} - COALESCE({total_returns}, 0)",
      "inputs": {
        "total_sales": { "dataset": "orders", "field": "amount", "aggregate": "SUM" },
        "total_returns": { "dataset": "returns", "field": "amount", "aggregate": "SUM" }
      }
    }
  },
  "calculatedDimensions": {
    "order_month": {
      "label": "Order Month",
      "expression": "DATE_TRUNC('month', {order_date})",
      "inputs": {
        "order_date": { "dataset": "orders", "field": "order_date" }
      },
      "dataType": "date"
    },
    "order_size": {
      "label": "Order Size",
      "expression": "CASE WHEN {amount} > 1000 THEN 'Large' WHEN {amount} > 100 THEN 'Medium' ELSE 'Small' END",
      "inputs": {
        "amount": { "dataset": "orders", "field": "amount" }
      },
      "dataType": "string"
    }
  },
  "grainMappings": [
    {
      "sourceDimension": "order_month",
      "targetDataset": "returns",
      "targetColumn": "return_date"
    }
  ]
}

Next Steps

Template Guide -- Step-by-step YAML/JSON tutorial with SQL translation examples
Access Control -- Control domain access via tokens
