CREATE SEMANTIC VIEW¶
Creates a semantic view definition, registering dimensions, metrics, relationships, facts, and materializations for on-demand query expansion.
Syntax¶
Keyword body (AS):
CREATE [ OR REPLACE ] SEMANTIC VIEW [ IF NOT EXISTS ] <name> AS
TABLES (
<alias> AS <table_name>
[ PRIMARY KEY ( <column> [, <column> ...] ) ]
[ UNIQUE ( <column> [, <column> ...] ) ]
[ COMMENT = '<text>' ]
[ WITH SYNONYMS = ( '<synonym>' [, '<synonym>' ...] ) ]
[, ... ]
)
[ RELATIONSHIPS (
<rel_name> AS <from_alias>( <fk_column> [, <fk_column> ...] )
REFERENCES <to_alias> [( <ref_column> [, <ref_column> ...] )]
[, ... ]
) ]
[ FACTS (
[ PRIVATE ] <alias>.<fact_name> AS <row_level_expression>
[ COMMENT = '<text>' ]
[ WITH SYNONYMS = ( '<synonym>' [, '<synonym>' ...] ) ]
[, ... ]
) ]
[ DIMENSIONS (
<alias>.<dim_name> AS <expression>
[ COMMENT = '<text>' ]
[ WITH SYNONYMS = ( '<synonym>' [, '<synonym>' ...] ) ]
[, ... ]
) ]
[ METRICS (
[ PRIVATE ] <alias>.<metric_name>
[ USING ( <rel_name> [, <rel_name> ...] ) ]
[ NON ADDITIVE BY ( <dim_name> [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]
[, <dim_name> ... ] ) ]
AS <aggregate_expression>
[ COMMENT = '<text>' ]
[ WITH SYNONYMS = ( '<synonym>' [, '<synonym>' ...] ) ]
[, [ PRIVATE ] <metric_name>
[ NON ADDITIVE BY ( ... ) ]
AS <expression> ... ]
[, [ PRIVATE ] <alias>.<metric_name> AS
<window_func>( <inner_metric> [, <extra_arg> ...] )
OVER ( [ PARTITION BY EXCLUDING <dim_name> [, <dim_name> ...] ]
| [ PARTITION BY <dim_name> [, <dim_name> ...] ]
[ ORDER BY <dim_name> [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]
[, <dim_name> ...] ]
[ <frame_clause> ] )
[ COMMENT = '<text>' ]
[ WITH SYNONYMS = ( '<synonym>' [, '<synonym>' ...] ) ] ]
) ]
[ MATERIALIZATIONS (
<mat_name> AS (
TABLE <table_name>,
[ DIMENSIONS ( <dim_name> [, <dim_name> ...] ) , ]
[ METRICS ( <metric_name> [, <metric_name> ...] ) ]
)
[, ... ]
) ]
YAML body (FROM YAML):
Added in version 0.7.0.
CREATE [ OR REPLACE ] SEMANTIC VIEW [ IF NOT EXISTS ] <name>
FROM YAML $$ <yaml_content> $$
CREATE [ OR REPLACE ] SEMANTIC VIEW [ IF NOT EXISTS ] <name>
FROM YAML FILE '<file_path>'
The FROM YAML variant accepts a YAML definition in a dollar-quoted string ($$...$$ or $tag$...$tag$). The FROM YAML FILE variant reads the YAML definition from a file at the given path.
Statement Variants¶
CREATE SEMANTIC VIEW <name> AS ...Creates a new semantic view. Returns an error if a view with the same name already exists.
CREATE OR REPLACE SEMANTIC VIEW <name> AS ...Creates or replaces an existing semantic view with the same name. If the view does not exist, creates it. If it does, replaces the definition.
CREATE SEMANTIC VIEW IF NOT EXISTS <name> AS ...Creates a new semantic view only if no view with the same name exists. If a view with the name already exists, the statement succeeds silently without modifying it.
All three variants work with both the AS keyword body and the FROM YAML / FROM YAML FILE body.
Note
All four CREATE body variants participate in your surrounding transaction. BEGIN ... ROLLBACK discards an uncommitted CREATE. See Transactional DDL and Known Limitations.
Note
CREATE SEMANTIC VIEW IF NOT EXISTS reliably absorbs duplicates within a single process or transaction (re-running a setup script, repeated statements in the same BEGIN block). It cannot absorb a race between two separate processes that both run CREATE IF NOT EXISTS against the same database at the same time – one will succeed and the other will see a constraint error. Concurrent DDL across processes is unusual for typical DuckDB workloads, so most users will never hit this. See CREATE IF NOT EXISTS Across Multiple Connections for the workaround.
Note
Requires a writable database. On a read-only database this statement fails with DuckDB’s standard Cannot execute statement of type "..." which is attached in read-only mode! error. See Read-Only Databases.
Clauses¶
Clauses must appear in the following order: TABLES, RELATIONSHIPS, FACTS, DIMENSIONS, METRICS, MATERIALIZATIONS. TABLES is required. At least one of DIMENSIONS or METRICS is required. All other clauses are optional.
TABLES¶
Declares the physical tables available to the semantic view. Each entry assigns an alias and maps it to a physical table. A primary key declaration is optional but recommended for multi-table views (it drives JOIN synthesis and cardinality inference).
TABLES (
o AS orders PRIMARY KEY (id) COMMENT = 'Order transactions',
c AS customers PRIMARY KEY (id) WITH SYNONYMS = ('clients', 'buyers')
)
Parameters:
<alias>, a short name used to reference this table in all other clauses.<table_name>, the physical table name. Supports catalog-qualified names (catalog.schema.table).PRIMARY KEY (<column>, ...), optional. One or more columns forming the table’s primary key. Used for JOIN synthesis and cardinality inference. This is semantic metadata, not a DuckDB constraint. Omit for fact tables that do not need to be join targets.COMMENT = '<text>', optional. A human-readable description of the table.WITH SYNONYMS = ('<synonym>', ...), optional. Alternative names for discoverability. Must come after COMMENT if both are present.
Optional: UNIQUE constraints:
TABLES (
o AS orders PRIMARY KEY (id) UNIQUE (email)
)
UNIQUE (<column>, ...) declares additional unique constraints. Used for cardinality inference: if a relationship’s FK columns match a UNIQUE constraint, the relationship is inferred as one-to-one.
The first table in the TABLES clause is the base table (the root of the relationship graph). All other tables must be reachable from the base table through declared relationships.
RELATIONSHIPS¶
Declares FK/PK join paths between tables. Each entry names a relationship, specifies the FK columns on the “from” side, and references the target table alias.
RELATIONSHIPS (
order_customer AS o(customer_id) REFERENCES c,
order_product AS o(product_id) REFERENCES p
)
Parameters:
<rel_name>, a unique name identifying this relationship.<from_alias>, the table alias containing the FK columns.(<fk_column>, ...), one or more FK column names on the “from” table.REFERENCES <to_alias> [(<ref_column>, ...)], the target table alias. Optionally specify which columns on the target table to join against. If omitted, the target’sPRIMARY KEYcolumns are used. The JOIN ON clause is synthesized asfrom_alias.fk_column = to_alias.ref_column.
Cardinality inference:
The extension infers cardinality from the “from” table’s constraints:
If the FK columns match a
PRIMARY KEYorUNIQUEconstraint on the “from” table, the relationship is one-to-one.Otherwise, the relationship is many-to-one (the default).
Cardinality is used for fan trap detection.
Validation rules:
The relationship graph must form a tree rooted at the base table.
Cycles are rejected.
Diamond patterns (multiple paths to the same table) are rejected unless all paths use named relationships (role-playing pattern).
Self-references (
from_aliasequalsto_alias) are rejected.Orphan tables (declared in
TABLESbut not reachable via relationships) are rejected in multi-table views.
FACTS¶
Declares named row-level expressions. Facts are inlined into metric expressions at expansion time.
FACTS (
li.net_price AS li.extended_price * (1 - li.discount)
COMMENT = 'Price after discount',
PRIVATE li.internal_cost AS li.unit_cost * li.quantity,
li.tax_amount AS li.net_price * li.tax_rate
WITH SYNONYMS = ('tax', 'vat')
)
Parameters:
PRIVATE, optional. When present, the fact cannot be queried directly viafacts := [...]but can still be referenced in metric expressions.<alias>.<fact_name>, the table alias and fact name, appearing beforeAS. Facts are scoped to a single table. The fact name is what you query (facts := ['net_price']) and whatDESCRIBEreturns as the output column. As in Snowflake, the entry readsname AS expression— the reverse of a plain SQLexpression AS alias.<row_level_expression>, the SQL expression afterAS: any expression that operates on individual rows. Must not contain aggregate functions. A fact may be named after its own backing column (s.unit_price AS s.unit_price), giving a passthrough fact.COMMENT = '<text>', optional. A human-readable description.WITH SYNONYMS = ('<synonym>', ...), optional. Alternative names for discoverability.
Fact chaining:
Facts can reference other facts by name. The extension resolves dependencies in topological order and inlines them recursively. Circular references between distinct facts are rejected at define time. A fact whose expression contains its own name (s.unit_price AS s.unit_price) is treated as a reference to the physical column, not a self-cycle.
Validation rules:
Aggregate functions (
SUM,COUNT,AVG,MIN,MAX, etc.) in fact expressions are rejected.Circular references between distinct facts are rejected (a fact referencing its own name resolves to the physical column).
DIMENSIONS¶
Declares named grouping expressions available for queries.
DIMENSIONS (
o.region AS o.region COMMENT = 'Sales region',
o.category AS o.category WITH SYNONYMS = ('product_category'),
o.month AS date_trunc('month', o.ordered_at)
)
Parameters:
<alias>.<dim_name>, the table alias and dimension name. The alias indicates which table the dimension comes from (used for join dependency resolution).<expression>, any SQL expression. Can be a simple column reference (o.region) or a computed expression (date_trunc('month', o.ordered_at)).COMMENT = '<text>', optional. A human-readable description.WITH SYNONYMS = ('<synonym>', ...), optional. Alternative names for discoverability.
Note
Column qualification in expressions. The <alias>.<dim_name> on the left of AS (the logical name) must always carry its table-alias prefix. Column references inside the expression on the right of AS may be either qualified (o.region) or unqualified (region). Unqualified references resolve fine in single-table views, but in multi-table views they can raise an ambiguity error when more than one joined table exposes the same column name. Qualifying expression columns (alias.column) is recommended — it is unambiguous in every case, which is why all examples in these docs use it. The same rule applies to METRICS and FACTS expressions.
Validation rules:
Dimension names must be unique within the view (case-insensitive). For example,
regionandRegioncannot both appear in the sameDIMENSIONSclause. See Duplicate dimension name.A dimension name cannot collide with any metric name (case-insensitive). See Duplicate dimension name.
Type inference:
When creating a semantic view on a file-backed database, the extension infers the DATA_TYPE for each dimension at define time. It executes a LIMIT 0 query against the underlying tables to detect the output type. Most scalar types are inferred, including VARCHAR, INTEGER, BIGINT, DOUBLE, DATE, TIMESTAMP, BOOLEAN, FLOAT, UUID, and others. Types where inference would be lossy, such as DECIMAL, LIST, and ARRAY, show an empty data type.
The inferred type is visible in SHOW SEMANTIC DIMENSIONS (data_type column) and DESCRIBE SEMANTIC VIEW (output_type field in the JSON).
Note
Type inference requires a file-backed database. In-memory databases skip inference, leaving DATA_TYPE empty.
METRICS¶
Declares named aggregation expressions, derived metrics, semi-additive metrics, and window metrics.
Base metrics (with table alias, containing aggregate functions):
METRICS (
o.revenue AS SUM(o.amount) COMMENT = 'Total revenue',
o.order_count AS COUNT(*) WITH SYNONYMS = ('num_orders')
)
Derived metrics (no table alias, referencing other metric names):
METRICS (
li.revenue AS SUM(li.net_price),
li.cost AS SUM(li.unit_cost),
profit AS revenue - cost,
margin AS profit / revenue * 100
)
USING clause (for role-playing dimensions):
METRICS (
f.departures USING (dep_airport) AS COUNT(*),
f.arrivals USING (arr_airport) AS COUNT(*)
)
PRIVATE metrics:
METRICS (
PRIVATE o.raw_total AS SUM(o.amount),
net_total AS raw_total * 0.9
)
Semi-additive metrics (with NON ADDITIVE BY):
METRICS (
a.total_balance AS SUM(a.balance)
NON ADDITIVE BY (report_date DESC NULLS FIRST)
)
When a query does not include the non-additive dimension, the extension generates a CTE with ROW_NUMBER to select one snapshot row per group before aggregating. When the non-additive dimension is included in the query, the metric behaves as a standard additive metric.
See How to Use Semi-Additive Metrics for details.
Window metrics (with OVER clause):
Window metrics wrap another metric in a SQL window function. The OVER clause supports two mutually exclusive partitioning modes:
PARTITION BY EXCLUDING computes the partition set dynamically at query time as “all queried dimensions minus the excluded ones.” The partition changes depending on which dimensions are requested:
METRICS (
s.total_qty AS SUM(s.quantity),
s.rolling_avg AS
AVG(total_qty) OVER (PARTITION BY EXCLUDING date
ORDER BY date NULLS LAST
RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW)
)
PARTITION BY (without EXCLUDING) specifies an explicit, fixed set of partition dimensions. The partition set is always exactly the listed dimensions, regardless of what other dimensions are queried:
METRICS (
s.total_qty AS SUM(s.quantity),
s.store_avg AS
AVG(total_qty) OVER (PARTITION BY store ORDER BY date NULLS LAST)
)
See How to Use Window Function Metrics for details on both modes.
Warning
NON ADDITIVE BY and OVER cannot be combined on the same metric. A metric is either semi-additive or a window metric, not both.
Parameters:
PRIVATE, optional. When present, the metric cannot be queried directly but can be referenced by derived metric expressions.<alias>.<metric_name>, table alias and metric name for base metrics.<metric_name>, name only (no alias) for derived metrics.USING (<rel_name>, ...), optional. Specifies which named relationship(s) this metric traverses. Used to disambiguate when a dimension comes from a role-playing table.NON ADDITIVE BY (<dim_name> [ASC|DESC] [NULLS FIRST|NULLS LAST], ...), optional. Declares the metric as semi-additive, specifying which dimensions trigger snapshot selection.<aggregate_expression>, for base metrics: any expression containing aggregate functions.<expression>, for derived metrics: an expression referencing other metric names (no aggregate functions).<window_func>(<inner_metric>, ...) OVER (...), for window metrics: wraps another metric in a window function.COMMENT = '<text>', optional. A human-readable description.WITH SYNONYMS = ('<synonym>', ...), optional. Alternative names for discoverability.
Validation rules:
Metric names must be unique within the view (case-insensitive), across both base and derived metrics. See Duplicate dimension name.
A metric name cannot collide with any dimension name (case-insensitive). See Duplicate dimension name.
Derived metrics must not contain aggregate functions.
Circular derived metric references are rejected.
USINGrelationship names must match declared relationships.NON ADDITIVE BYdimension names must match declared dimensions.Window metric inner metric name must match a declared metric.
Window metric
EXCLUDINGdimension names must match declared dimensions.Window metric
PARTITION BYdimension names must match declared dimensions.Window metric
ORDER BYdimension names must match declared dimensions.NON ADDITIVE BYandOVERcannot both appear on the same metric.OVERcannot appear on a derived metric (one without a table alias). Only qualified metrics (alias.name) can useOVER.
MATERIALIZATIONS¶
Added in version 0.7.0.
Declares named materializations that map pre-aggregated tables to the dimensions and metrics they cover. When a query’s requested dimensions and metrics exactly match a materialization, the extension routes to the pre-aggregated table instead of expanding raw sources.
MATERIALIZATIONS (
region_agg AS (
TABLE daily_revenue_by_region,
DIMENSIONS (region),
METRICS (revenue, order_count)
),
global_agg AS (
TABLE global_totals,
METRICS (revenue)
)
)
Parameters:
<mat_name>, a unique name identifying this materialization.TABLE <table_name>, the physical table containing pre-aggregated data. Supports catalog-qualified names (catalog.schema.table). The table is not validated for existence at define time (it may be created later by external tools like dbt).DIMENSIONS (<dim_name>, ...), optional. Dimension names from the view’sDIMENSIONSclause that the materialization table covers.METRICS (<metric_name>, ...), optional. Metric names from the view’sMETRICSclause that the materialization table covers.
At least one of DIMENSIONS or METRICS must be specified in each materialization entry.
Routing behavior:
Routing uses exact match: both the dimension set and metric set must exactly equal the query’s requested sets (case-insensitive comparison).
Materializations are scanned in definition order; the first match wins.
Semi-additive metrics (
NON ADDITIVE BY) and window metrics (OVER) are always excluded from routing.When no match is found, the extension falls back to standard expansion.
See How to Use Materializations for a detailed guide.
Validation rules:
Materialization names must be unique within a view.
Dimension names must match declared dimensions in the view’s
DIMENSIONSclause.Metric names must match declared metrics in the view’s
METRICSclause.Each materialization must specify at least one of
DIMENSIONSorMETRICS.
FROM YAML¶
Added in version 0.7.0.
Creates a semantic view from a YAML definition instead of the keyword-based AS body. Two forms are supported:
Inline YAML (dollar-quoted):
CREATE SEMANTIC VIEW order_metrics FROM YAML $$
tables:
- alias: o
table: orders
pk_columns:
- id
dimensions:
- name: region
expr: o.region
source_table: o
metrics:
- name: revenue
expr: SUM(o.amount)
source_table: o
$$
The YAML content is enclosed in dollar-quote delimiters. Tagged dollar-quoting ($yaml$...$yaml$) is also supported.
YAML from file:
CREATE SEMANTIC VIEW order_metrics FROM YAML FILE '/path/to/definition.yaml'
The file path must be single-quoted. DuckDB reads the file contents and parses as YAML.
See How to Import and Export YAML Definitions for a detailed workflow guide.
YAML size limit: YAML definitions are capped at 1 MiB. Definitions exceeding this limit are rejected with an error.
Type inference:
As with dimensions, the extension infers DATA_TYPE for each metric at define time on file-backed databases. COUNT(*) infers as BIGINT, SUM on integer columns infers as BIGINT, SUM on float/double columns infers as DOUBLE, and so on. The inferred type is visible in SHOW SEMANTIC METRICS and DESCRIBE SEMANTIC VIEW.
Examples¶
Single-table view:
CREATE SEMANTIC VIEW order_metrics AS
TABLES (
o AS orders PRIMARY KEY (id)
)
DIMENSIONS (
o.region AS o.region
)
METRICS (
o.revenue AS SUM(o.amount)
);
Multi-table star schema:
CREATE SEMANTIC VIEW analytics AS
TABLES (
o AS orders PRIMARY KEY (id),
c AS customers PRIMARY KEY (id),
p AS products PRIMARY KEY (id)
)
RELATIONSHIPS (
order_customer AS o(customer_id) REFERENCES c,
order_product AS o(product_id) REFERENCES p
)
DIMENSIONS (
c.customer AS c.name,
p.product AS p.name,
o.region AS o.region
)
METRICS (
o.revenue AS SUM(o.amount),
o.order_count AS COUNT(*)
);
Full feature set (facts, derived metrics, USING):
CREATE SEMANTIC VIEW flight_analytics AS
TABLES (
f AS flights PRIMARY KEY (flight_id),
a AS airports PRIMARY KEY (airport_code)
)
RELATIONSHIPS (
dep_airport AS f(departure_code) REFERENCES a,
arr_airport AS f(arrival_code) REFERENCES a
)
FACTS (
f.is_international AS CASE WHEN f.departure_country != f.arrival_country
THEN 1 ELSE 0 END
)
DIMENSIONS (
a.city AS a.city,
f.carrier AS f.carrier
)
METRICS (
f.departures USING (dep_airport) AS COUNT(*),
f.arrivals USING (arr_airport) AS COUNT(*),
total_flights AS departures + arrivals
);
With metadata annotations:
CREATE SEMANTIC VIEW sales AS
TABLES (
li AS line_items PRIMARY KEY (id) COMMENT = 'Transaction line items'
)
FACTS (
li.net_price AS li.extended_price * (1 - li.discount)
COMMENT = 'Price after discount' WITH SYNONYMS = ('discounted_price')
)
DIMENSIONS (
li.region AS li.region COMMENT = 'Sales territory'
)
METRICS (
li.total_net AS SUM(li.net_price) COMMENT = 'Net revenue'
);
Semi-additive metric:
CREATE SEMANTIC VIEW account_metrics AS
TABLES (
a AS accounts PRIMARY KEY (id)
)
DIMENSIONS (
a.customer_id AS a.customer_id,
a.report_date AS a.report_date
)
METRICS (
a.total_balance AS SUM(a.balance)
NON ADDITIVE BY (report_date DESC NULLS FIRST)
);
Window metric with PARTITION BY EXCLUDING:
CREATE SEMANTIC VIEW store_analytics AS
TABLES (
s AS sales PRIMARY KEY (id)
)
DIMENSIONS (
s.store AS s.store,
s.date AS s.sale_date
)
METRICS (
s.total_qty AS SUM(s.quantity),
s.rolling_avg AS
AVG(total_qty) OVER (PARTITION BY EXCLUDING date
ORDER BY date NULLS LAST)
);
Window metric with explicit PARTITION BY:
CREATE SEMANTIC VIEW store_analytics AS
TABLES (
s AS sales PRIMARY KEY (id)
)
DIMENSIONS (
s.store AS s.store,
s.date AS s.sale_date,
s.region AS s.region
)
METRICS (
s.total_qty AS SUM(s.quantity),
s.store_avg AS
AVG(total_qty) OVER (PARTITION BY store
ORDER BY date NULLS LAST)
);
With materializations:
Added in version 0.7.0.
CREATE SEMANTIC VIEW order_metrics AS
TABLES (
o AS orders PRIMARY KEY (id)
)
DIMENSIONS (
o.region AS o.region,
o.status AS o.status
)
METRICS (
o.revenue AS SUM(o.amount),
o.order_count AS COUNT(*)
)
MATERIALIZATIONS (
region_agg AS (
TABLE daily_revenue_by_region,
DIMENSIONS (region),
METRICS (revenue, order_count)
),
region_status_agg AS (
TABLE revenue_by_region_status,
DIMENSIONS (region, status),
METRICS (revenue)
)
);
From inline YAML:
Added in version 0.7.0.
CREATE SEMANTIC VIEW order_metrics FROM YAML $$
tables:
- alias: o
table: orders
pk_columns:
- id
dimensions:
- name: region
expr: o.region
source_table: o
metrics:
- name: revenue
expr: SUM(o.amount)
source_table: o
$$
From YAML file:
Added in version 0.7.0.
CREATE SEMANTIC VIEW order_metrics FROM YAML FILE '/path/to/definition.yaml'