Semaphor
Semantic DomainsTemplate Authoring

Relationships

Configure joins between datasets for automatic multi-table queries

Relationships define how datasets join together. They enable automatic join resolution -- when a user selects fields from multiple datasets, Semaphor finds the shortest join path automatically.

relationships:
  - name: orders_to_customers
    sourceDataset: orders
    sourceFields: [customer_id]
    targetDataset: customers
    targetFields: [id]
    cardinality: many_to_one
    defaultJoinType: LEFT
    isAutoJoin: true
    discoveredBy: user
    isActive: true

Relationship Field Reference

FieldTypeRequiredDescription
idstringNoUnique ID (auto-generated if omitted)
namestringYesDescriptive name (e.g., orders_to_customers)
sourceDatasetstringYesDataset name where the join starts
sourceFieldsstring[]YesColumn(s) in the source dataset
targetDatasetstringYesDataset name being joined to
targetFieldsstring[]YesColumn(s) in the target dataset
cardinalityenumNoone_to_one, one_to_many, many_to_one, many_to_many. Defaults to one_to_one
defaultJoinTypeenumYesINNER, LEFT, RIGHT, FULL
isAutoJoinbooleanYesMust be true for automatic join resolution
joinPrioritynumberNoHigher priority used first when multiple paths exist
discoveredByenumYesuser for hand-authored, ai for AI-discovered, fk_constraint for database FK
confidenceenumNohigh, medium, low
isActivebooleanYesSet to false to disable without deleting
descriptionstringNoExplanation of the relationship

Cardinality Guide

CardinalityMeaningExample
one_to_oneEach source row matches exactly one target rowusers to user_profiles
one_to_manyEach source row can match multiple target rowsorders to line_items
many_to_oneMultiple source rows match one target roworders to customers
many_to_manyBoth sides can have multiple matchesstudents to courses (via bridge table)

Setting cardinality correctly is important. Semaphor uses it for fan-out detection -- preventing metric inflation that happens when aggregating across one-to-many joins.


Star Schema Example

A classic fact-and-dimension star schema with 3 relationships:

relationships:
  - name: orders_to_customers
    sourceDataset: orders
    sourceFields: [customer_id]
    targetDataset: customers
    targetFields: [id]
    cardinality: many_to_one
    defaultJoinType: LEFT
    isAutoJoin: true
    discoveredBy: user
    isActive: true
 
  - name: orders_to_products
    sourceDataset: orders
    sourceFields: [product_id]
    targetDataset: products
    targetFields: [id]
    cardinality: many_to_one
    defaultJoinType: LEFT
    isAutoJoin: true
    discoveredBy: user
    isActive: true
 
  - name: orders_to_stores
    sourceDataset: orders
    sourceFields: [store_id]
    targetDataset: stores
    targetFields: [id]
    cardinality: many_to_one
    defaultJoinType: LEFT
    isAutoJoin: true
    discoveredBy: user
    isActive: true

Composite Key Joins

For joins on multiple columns, list all fields:

relationships:
  - name: sales_to_targets
    sourceDataset: sales
    sourceFields: [region, product_id, quarter]
    targetDataset: targets
    targetFields: [region, product_id, quarter]
    cardinality: many_to_one
    defaultJoinType: LEFT
    isAutoJoin: true
    discoveredBy: user
    isActive: true

Multi-Hop Joins

Semaphor supports up to 2-hop joins. For example, if you define:

  • orders -> stores (1 hop)
  • stores -> countries (1 hop)

Then users can query orders.amount grouped by countries.name (2 hops). Semaphor's BFS resolver will find the path: orders -> stores -> countries.

On this page