How to generate Semolina model classes from warehouse views¶
Already have a Snowflake semantic view or Databricks metric view set up? `semolina codegen`
introspects it and prints a Python model class to stdout. You can drop that output straight
into your codebase.
Run codegen¶
```shell
semolina codegen my_schema.sales_view --backend snowflake
```
That connects to your warehouse, reads the view’s column metadata, and prints a ready-to-use
SemanticView subclass.
Introspect multiple views at once¶
Pass multiple view names in a single call:
```shell
semolina codegen schema.sales_view schema.orders_view --backend databricks
```
All classes appear in one output block with a single shared imports section.
Pipe output to a file¶
```shell
semolina codegen my_schema.sales_view --backend snowflake > models.py
```
There is no --output flag; redirect stdout as you would with any CLI tool.
Choose a backend¶
Use --backend (or -b):
| Value | Warehouse |
|---|---|
| `snowflake` | Snowflake semantic views |
| `databricks` | Databricks metric views |
Credentials come from environment variables
(for example, SNOWFLAKE_ACCOUNT for Snowflake).
See How to configure codegen credentials for the full list of
environment variables, .env file setup, and config
file fallback.
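The pattern is plain environment lookup. As a minimal sketch — only `SNOWFLAKE_ACCOUNT` is named in these docs, and the helper below is illustrative, not part of semolina's API:

```python
import os

def require_env(*names: str) -> dict[str, str]:
    """Return the named environment variables, failing loudly if any
    are unset. Variable names other than SNOWFLAKE_ACCOUNT are
    placeholders for whatever your backend requires."""
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {n: os.environ[n] for n in names}
```

For example, `require_env("SNOWFLAKE_ACCOUNT")` either returns the value or raises before any warehouse connection is attempted.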
Understand the generated output¶
Given this semantic view in your warehouse:
```sql
CREATE OR REPLACE SEMANTIC VIEW analytics.sales_view
  TABLES (
    s AS source_table PRIMARY KEY (id)
  )
  DIMENSIONS (
    s.country AS country,
    s.unit_price AS unit_price
  )
  METRICS (
    s.revenue AS SUM(revenue)
  );
```
Running:
```shell
semolina codegen analytics.sales_view --backend snowflake
```
Produces:
```python
from semolina import SemanticView, Metric, Dimension, Fact

class SalesView(SemanticView, view="analytics.sales_view"):
    revenue = Metric[int]()
    country = Dimension[str]()
    unit_price = Fact[float]()
```
Given this metric view in your warehouse:
```sql
CREATE OR REPLACE VIEW main.analytics.orders_view
WITH METRICS
LANGUAGE YAML
AS $$
version: 1.1
source: source_table
dimensions:
  - name: region
    expr: region
measures:
  - name: total_orders
    expr: COUNT(*)
$$;
```
Running:
```shell
semolina codegen main.analytics.orders_view --backend databricks
```
Produces:
```python
from semolina import SemanticView, Metric, Dimension, Fact

class OrdersView(
    SemanticView, view="main.analytics.orders_view"
):
    total_orders = Metric[int]()
    region = Dimension[str]()
```
Databricks has no native `Fact` type, so all non-measure fields map to `Dimension()`.
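In both examples the class name is derived from the last segment of the view name (`sales_view` becomes `SalesView`, `orders_view` becomes `OrdersView`). A minimal sketch of that naming rule — the helper name here is hypothetical, not part of semolina's API:

```python
def class_name_for(view_name: str) -> str:
    # "main.analytics.orders_view" -> "OrdersView": take the last
    # dotted segment and CamelCase its snake_case parts.
    last_segment = view_name.split(".")[-1]
    return "".join(part.capitalize() for part in last_segment.split("_") if part)
```

If the generated name collides with something in your codebase, you can rename the class freely — only the `view=` argument ties it to the warehouse object.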
Understand field type mapping¶
| Warehouse classification | Generated field type |
|---|---|
| Metric / Measure | `Metric` |
| Dimension | `Dimension` |
| Fact (Snowflake only) | `Fact` |
Handle TODO comments¶
When a field’s SQL type has no clean Python equivalent (GEOGRAPHY, VARIANT, ARRAY, MAP, STRUCT), codegen emits a TODO comment rather than guessing:
```python
# TODO: no clean Python type for GEOGRAPHY field "territory"
territory = Dimension()
```
Review these after generation and handle them manually.
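The behaviour can be pictured as a partial mapping from SQL types to Python types, where anything outside the mapping gets a TODO instead of a guess. Both the mapping table and the `render_field` helper below are illustrative assumptions, not semolina's actual implementation:

```python
# Illustrative SQL-to-Python type mapping; semolina's real table
# may differ. Types absent from the dict produce a TODO comment.
PY_TYPE_FOR_SQL = {
    "NUMBER": "int",
    "FLOAT": "float",
    "VARCHAR": "str",
    "BOOLEAN": "bool",
}

def render_field(name: str, sql_type: str, field_cls: str = "Dimension") -> str:
    py_type = PY_TYPE_FOR_SQL.get(sql_type.upper())
    if py_type is None:
        # No clean equivalent: emit an untyped field plus a TODO.
        return (
            f'# TODO: no clean Python type for {sql_type} field "{name}"\n'
            f"{name} = {field_cls}()"
        )
    return f"{name} = {field_cls}[{py_type}]()"
```

Under this sketch, `render_field("country", "VARCHAR")` yields a typed field, while `GEOGRAPHY`, `VARIANT`, `ARRAY`, `MAP`, and `STRUCT` all fall through to the TODO branch.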
Exit codes¶
semolina codegen uses distinct exit codes so scripts can handle each failure mode separately:
| Exit code | Meaning |
|---|---|
| `0` | Success – model class written to stdout |
| | Unexpected error (see stderr for details) |
| `2` | Invalid `--backend` value |
| | View not found – the warehouse has no semantic view with that name |
| | Connection failure – credentials missing or authentication rejected |
Tip
Exit code 2 is also emitted by the CLI argument parser when --backend is
omitted entirely. Both cases mean “the backend could not be resolved.”
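In a wrapper script, the dispatch can be sketched like this; only `0` (success) and `2` (backend not resolved) are pinned down above, so the sketch treats everything else as a generic failure:

```shell
handle_codegen_exit() {
  # Map a semolina codegen exit code to a log message.
  case "$1" in
    0) echo "model class generated" ;;
    2) echo "backend could not be resolved" >&2 ;;
    *) echo "codegen failed (exit $1); see stderr" >&2 ;;
  esac
}

# Typical use, assuming semolina is on PATH:
#   semolina codegen my_schema.sales_view --backend snowflake > models.py
#   handle_codegen_exit "$?"
```

Routing the non-zero messages to stderr keeps stdout clean for the redirected model file.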
Override the SQL column name with source=¶
By default, Semolina maps Python field names to SQL column names using each dialect’s
identifier casing rules (Snowflake uppercases unquoted identifiers; Databricks lowercases them).
For a field order_id, Snowflake resolves ORDER_ID automatically.
If your warehouse stores a column with non-default casing, for example a quoted
lowercase column "order_id" in Snowflake, you can override the SQL column name
with source=:
```python
class Orders(SemanticView, view="orders"):
    order_id = Metric[int](
        source="order_id"
    )  # maps to quoted "order_id", not "ORDER_ID"
```
semolina codegen emits source= automatically when introspection detects that a column
uses non-default casing.
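The default casing rule itself is simple enough to sketch. The helper below is hypothetical (not semolina API), but it implements exactly the dialect behaviour described above:

```python
def default_sql_name(field_name: str, backend: str) -> str:
    # Snowflake resolves unquoted identifiers to UPPER CASE;
    # Databricks resolves them to lower case. source= exists to
    # bypass this rule for non-default casing.
    if backend == "snowflake":
        return field_name.upper()
    if backend == "databricks":
        return field_name.lower()
    raise ValueError(f"unknown backend: {backend!r}")
```

So a field named `order_id` resolves to `ORDER_ID` on Snowflake and `order_id` on Databricks unless `source=` says otherwise.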
See also¶
How to configure codegen credentials – environment variables, .env files, and config file fallback
How to define models – model class structure and field types
How to connect to Snowflake – Snowflake pool configuration
How to connect to Databricks – Databricks pool configuration