Best Practices
Essential guidelines for structuring efficient and maintainable queries in Semaphor
Introduction
Semaphor provides powerful free-form query capabilities, offering flexibility beyond drag-and-drop tools. While this flexibility gives you more control, it also requires some attention to query structure to ensure your dashboards are:
- Reliable and performant
- Properly isolated between tenants
- Easy to maintain and debug
This guide covers essential best practices that will help you write better Semaphor queries, avoid common pitfalls, and make the most of the platform's capabilities.
1. Use Fully Qualified Table Names
Referencing tables without specifying the schema can lead to ambiguity. In multi-schema environments, the database might resolve table names unexpectedly, causing errors or unintended behavior.
🚫 Example of an ambiguous query
Even though this query is functionally correct, and speeds up ad-hoc analysis, but may not be ideal if the card subscribes to dashboard filters.
⚠️ Issues with this approach
- In multi-schema environments, it's unclear which schema the users table belongs to.
- Semaphor may not be able to resolve the correct tenant table for multi-tenant dashboards.
- Less context for AI assistant to provide refinements and helpful suggestions.
✅ Better
Why This Matters:
- Prevents ambiguity. Ensures the correct table is referenced.
- Supports multi-tenancy. Avoids issues with schema-based data separation.
- Improves maintainability. Makes the query more readable and predictable.
- Optimizes performance. Helps Semaphor's query parser work more efficiently.
2. Use Table Aliases
When no alias is used, Semaphor applies the fully qualified column name in the WHERE clause. While this works for most databases, some—like BigQuery—do not allow fully qualified column names in WHERE conditions.
🚫 Without Table Alias
If you apply a filter (country = 'US'), Semaphor generates a query with the fully qualified column name sales.sales_data.country = 'US'
in the WHERE clause.
This works fine for most databases, but some databases, for example like BigQuery will not allow you to use the fully-qualified-column-name in the WHERE clause.
✅ Better (With Alias)
Use table aliases to simplify queries and ensure compatibility across different databases.
Now, when you apply the filter, Semaphor generates a query with the alias s
in the WHERE clause s.country = 'US'
.
Why This Works:
- The alias(s) replaces the fully qualified column name in
WHERE
clause, ensuring compatibility across different databases. - Prevents errors in databases that don’t support fully qualified column names in
WHERE
clause.
3. Use Data Models to Simplify Queries
If you find yourself making the same joins over and over again, you can create a data model that pre-joins the tables you need.
🚫 Example (Complex Query with Repeated Joins)
⚠️ Issues with this approach: The query is long and repetitive. Every time you need similar data, you must rewrite the joins. This leads to complex, repetitive logic that is harder to read and maintain.
✅ Better (Using a CTE for a Data Model)
Why this is better:
- The user_orders CTE (Common Table Expression) pre-joins the required tables.
- Queries become simpler and easier to maintain.
- The same logic is reusable without repeating joins.
Using Data Models in Semaphor
You can create a data model in the semaphor console. Once created, you can reference the data model under the dm
namespace:
Benefits of Data Models:
- Improves efficiency by avoiding redundant joins.
- Enhances maintainability—update the data model once, and all queries using it benefit.
- Enables cleaner SQL, making queries more readable.