A couple of weeks ago, dbt Labs made a huge splash at their annual conference by announcing the new dbt Semantic Layer. This was a big deal, spawning excited tweets, in-depth thinkpieces, and celebration from partners like us.
The term “semantic layer” (also known as a “metrics layer”) has been around for decades. dbt didn’t invent the concept, nor the phrase, though their version is certainly worth paying attention to.
“But Austin, what’s a semantic layer then?” So glad you asked.
In this article, I’ll break down what a semantic layer is in simple terms and why you should care about dbt’s Semantic Layer.
What’s a semantic layer, and where did it come from?
Semantic layer is a very literal term: it’s the “layer” in a data architecture that uses “semantics” (words) that the business user will actually understand. Sometimes it’s called the “business layer” or the “metrics layer”.
Instead of raw tables with column names like “A000_CUST_ID_PROD”, data teams build a semantic layer and rename that column “Customer”. Semantic layers help to hide complex code from business users. This code can get pretty complicated as data teams try to capture the business logic for key metrics, dimensions, and schemas.
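To make the renaming idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not how any particular BI tool or dbt implements a semantic layer. The only name taken from this article is the mapping of “A000_CUST_ID_PROD” to “Customer”; the second column, the sample row, and the `to_business_view` helper are hypothetical.

```python
# Hypothetical mapping from raw warehouse column names to business-friendly names.
# Only "A000_CUST_ID_PROD" -> "Customer" comes from the article; the rest is assumed.
SEMANTIC_NAMES = {
    "A000_CUST_ID_PROD": "Customer",
    "A000_ORD_AMT_USD": "Order Amount",  # assumed example column
}


def to_business_view(rows):
    """Rename raw columns in each row; columns without a mapping pass through unchanged."""
    return [
        {SEMANTIC_NAMES.get(column, column): value for column, value in row.items()}
        for row in rows
    ]


# Assumed sample data standing in for a raw warehouse table.
raw_rows = [{"A000_CUST_ID_PROD": "C-1001", "A000_ORD_AMT_USD": 250.0}]
business_rows = to_business_view(raw_rows)
print(business_rows)
```

The business user only ever sees the friendly names, while the mapping itself lives in one place that the data team maintains.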
So where did this idea come from? Back in the day (I’m talking about the ’90s and early 2000s), we had pretty basic data tech. It was very slow and very hard to use if you didn’t have a deep IT background.
Big companies like IBM, SAP, and Oracle built Business Intelligence (BI) tools like Cognos, Business Objects, and Oracle BI, which would take smaller chunks of data from a clunky data warehouse and let IT people build these semantic layers for business users. Essentially, they were more human-readable data layers for business users.
The problem with early semantic layers
This business-friendly layer sounds like a “nice to have” improvement, but it was really a necessity because trying to run even a basic report across an entire data warehouse could take hours or even days. (Yes, days.)
Enter the first problem: old-school semantic layers took wayyyyy too long to build, since people depended on IT to set up and modify them. To make matters worse, they were cumbersome to maintain since business needs were always changing.
The business users’ solution… export to Excel!
Enter fancy new BI tools like Tableau, Qlik, and Power BI. The theory was that if we empower the business users to “self-serve” with low-code or no-code BI tools, the IT bottleneck will go away and analytics will officially be democratized! At least, that was the idea.
Enter the second problem: we abandoned the semantic layer concept for years, in favor of agility.
Unlike old IT tools, more personas could buy and use these new BI tools. Instead of 1 BI tool using 1 semantic layer, built by 1 team from 1 data warehouse, we had multiple BI tools, being used by all sorts of teams with no real semantic layer.
Just picture this scenario, which probably seems all too real to most data people. I bring my Tableau dashboard to a meeting, someone else brings their Excel workbook, and someone else brings a Power BI dashboard. We all then show different numbers for “total revenue last quarter”. Uh oh!
After years of alternately ignoring and chasing the self-service BI dream, this topic blew up in the data world again. (We even flagged this as one of the six big ideas from 2022 in our Future of the Modern Data Stack report.)
This started in January, when Base Case proposed “Headless Business Intelligence”, a new approach to solving problems with business metrics and terms. A couple of months later, Benn Stancil talked about the “missing metrics layer” in today’s data stack.
That’s when things really took off. Airbnb announced that it had been building a home-grown metrics platform called Minerva to solve this issue. Other prominent tech companies soon followed suit, including LinkedIn, Uber, and Spotify. Then dbt opened a PR hinting at a metrics or semantics layer, which included links to those foundational blogs by Benn and Base Case.
This was such a hot topic that one of our Great Data Debates was all about the metrics layer, with a fiery discussion between Drew Banin from dbt Labs and Nick Handel from Transform.
The result has been a big open question in the data and analytics world: how do we bring back all the great things that IT loved about semantic layers (consistency, clear governance, and trusted, reliable data) without compromising the agility that analysts and business users demand?
Now, less than two years after this debate kicked off, it seems that the future of the semantic layer has finally become a reality.
The dbt Semantic Layer
Enter dbt Labs and its new Semantic Layer!
The dbt Semantic Layer is the interface between your data and your analyses: A platform for compiling and accessing dbt assets in downstream tools.
Data practitioners can define metrics in their dbt projects, then data consumers can query consistently defined metrics in downstream tools.
Cameron Afzal, Product Manager for the dbt Semantic Layer
The core concept behind dbt’s Semantic Layer is: define things once, use them anywhere.
Why does that make people happy? This brings the concept of a semantic layer and its universal metrics into dbt’s transformation layer. As dbt Labs put it, “Data practitioners can define metrics in their dbt projects, then data consumers can query consistently defined metrics in downstream tools.”
Data teams can build these models and metrics in dbt, and then tie them into their other developer tools like version control and release management with the Semantic Layer.
Regardless of what BI tool they use, analysts and business users can then grab data and go into that meeting, confident that their answer will be the same because they pulled the metric from a centralized place.
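The “define once, use anywhere” idea can be sketched in a few lines of plain Python. To be clear, this is not dbt’s metric syntax or API. The `Metric` class, the `METRICS` registry, the `define_metric` and `query_metric` helpers, the refund rule, and the sample orders are all hypothetical, chosen only to show why a single shared definition gives every consumer the same number.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class Metric:
    """A hypothetical metric definition: a name, a description, and one calculation."""
    name: str
    description: str
    compute: Callable[[List[Row]], float]


# Hypothetical central registry standing in for "a centralized place".
METRICS: Dict[str, Metric] = {}


def define_metric(metric: Metric) -> None:
    """Register a metric once; a second definition with the same name is rejected."""
    if metric.name in METRICS:
        raise ValueError(f"metric {metric.name!r} is already defined")
    METRICS[metric.name] = metric


def query_metric(name: str, rows: List[Row]) -> float:
    """Every downstream consumer calls this instead of re-implementing the logic."""
    return METRICS[name].compute(rows)


# The data team defines the business logic in exactly one place.
# The refund rule here is an assumed example of such logic.
define_metric(Metric(
    name="total_revenue",
    description="Sum of order amounts, excluding refunded orders",
    compute=lambda rows: sum(r["amount"] for r in rows if not r["refunded"]),
))

# Assumed sample data.
orders = [
    {"amount": 100.0, "refunded": False},
    {"amount": 40.0, "refunded": True},
    {"amount": 60.0, "refunded": False},
]

# Two different "tools" ask for the same metric and get the same answer.
dashboard_value = query_metric("total_revenue", orders)
spreadsheet_value = query_metric("total_revenue", orders)
print(dashboard_value, spreadsheet_value)
```

Because neither consumer re-implements the refund rule, there is no way for the dashboard and the spreadsheet to disagree on “total revenue”.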
Learn more and get started with the dbt Semantic Layer here.
dbt + Atlan
The dbt Semantic Layer is great in its own right, but what makes it even more exciting is how it ties in with key tools across the modern data stack… and we’re one of them!
Alongside the dbt keynote, we announced our partnership with dbt Labs and our integration with the Semantic Layer. With this, joint customers will have access to an end-to-end governance framework for data models and metrics in the modern data stack.
The dbt Semantic Layer created a standard way to define metrics across your transformations and models. Now our integration brings these rich metrics into the rest of the data stack.
With this integration, dbt metrics and models are first-class assets in Atlan. This means that they’re searchable and discoverable through our platform and part of auto-generated, column-level lineage, just like any Snowflake table, Fivetran pipeline, or Looker dashboard.
Our native dbt Cloud integration ingests all dbt metrics and metadata about dbt models, merges it with metadata from all other tools in the data stack, creates column-level lineage from source to BI, and sends that unified context back into tools like Snowflake and the BI tools where people work daily.
With powerful impact and root cause analysis, modern data teams finally have the tools they need for end-to-end data governance and change management at every stage of the data lifecycle.
Learn more about our integration with the dbt Semantic Layer here.