Supply chain security is at the fore of the industry’s collective consciousness. We’ve recently seen a significant rise in software supply chain attacks, a Log4j vulnerability of catastrophic severity and breadth, and even an Executive Order on Cybersecurity.
It is against this background that Google is seeking contributors to a new open source project called GUAC (pronounced like the dip). GUAC, or Graph for Understanding Artifact Composition, is in its early stages yet is poised to change how the industry understands software supply chains. GUAC addresses a need created by the burgeoning efforts across the ecosystem to generate software build, security, and dependency metadata. True to Google’s mission to organize and make the world’s information universally accessible and useful, GUAC is meant to democratize the availability of this security information by making it freely accessible and useful for every organization, not just those with enterprise-scale security and IT funding.
Thanks to community collaboration in groups such as OpenSSF, SLSA, SPDX, CycloneDX, and others, organizations increasingly have ready access to software security metadata such as SBOMs, SLSA provenance, and vulnerability databases like OSV.
These data are useful on their own, but it’s difficult to combine and synthesize the information for a more comprehensive view. The documents are scattered across different databases and producers, are attached to different ecosystem entities, and cannot be easily aggregated to answer higher-level questions about an organization’s software assets.
To help address this issue we’ve teamed up with Kusari, Purdue University, and Citi to create GUAC, a free tool to bring together many different sources of software security metadata. We’re excited to share the project’s proof of concept, which lets you query a small dataset of software metadata including SLSA provenance, SBOMs, and OpenSSF Scorecards.
Graph for Understanding Artifact Composition (GUAC) aggregates software security metadata into a high-fidelity graph database, normalizing entity identities and mapping standard relationships between them. Querying this graph can drive higher-level organizational outcomes such as audit, policy, risk management, and even developer assistance.
Conceptually, GUAC occupies the “aggregation and synthesis” layer of the software supply chain transparency logical model.
GUAC has four major areas of functionality:
- Collection
  GUAC can be configured to connect to a variety of sources of software security metadata. Some sources may be open and public (e.g., OSV); some may be first-party (e.g., an organization’s internal repositories); some may be proprietary third-party (e.g., from data vendors).
- Ingestion
  From its upstream data sources GUAC imports data on artifacts, projects, resources, vulnerabilities, repositories, and even developers.
- Collation
  Having ingested raw metadata from disparate upstream sources, GUAC assembles it into a coherent graph by normalizing entity identifiers, traversing the dependency tree, and reifying implicit entity relationships, e.g., project → developer; vulnerability → software version; artifact → source repo, and so on.
- Query
  Against an assembled graph one may query for metadata attached to, or related to, entities within the graph. Querying for a given artifact may return its SBOM, provenance, build chain, project scorecard, vulnerabilities, and recent lifecycle events, and those for its transitive dependencies.

A CISO or compliance officer in an organization wants to be able to reason about the risk of their organization. An open source organization like the Open Source Security Foundation wants to identify critical libraries to maintain and secure. Developers need richer and more trustworthy intelligence about the dependencies in their projects.
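As a rough illustration of the collation step listed above, the sketch below merges a few hand-written records into one small graph. It is not GUAC’s implementation or API: the record layout, the `normalize` and `collate` helpers, and the package names are all hypothetical, and the normalization rule (lowercasing and dropping a `pkg:` prefix) is a deliberate oversimplification.

```python
from collections import defaultdict


def normalize(identifier: str) -> str:
    """Map spelling variants of a package identifier to one canonical key.

    Assumption: lowercasing and stripping a leading "pkg:" is enough for this
    toy data; real identifier normalization is far more involved.
    """
    ident = identifier.strip().lower()
    if ident.startswith("pkg:"):
        ident = ident[len("pkg:"):]
    return ident


# Hypothetical records standing in for ingested SBOM, Scorecard,
# and provenance documents. Note the inconsistent subject spellings.
documents = [
    {"kind": "sbom", "subject": "pkg:pypi/Example-App@1.0",
     "depends_on": ["pkg:pypi/LibFoo@2.1"]},
    {"kind": "scorecard", "subject": "pypi/libfoo@2.1", "score": 7.5},
    {"kind": "provenance", "subject": "PyPI/example-app@1.0",
     "source_repo": "git+https://example.com/org/example-app"},
]


def collate(docs):
    """Merge documents into per-entity nodes and typed (src, relation, dst) edges."""
    nodes = defaultdict(dict)
    edges = set()
    for doc in docs:
        subject = normalize(doc["subject"])
        nodes[subject].setdefault("documents", []).append(doc["kind"])
        for dep in doc.get("depends_on", []):
            dep_key = normalize(dep)
            nodes[dep_key]  # touch the key so the dependency exists as a node
            edges.add((subject, "depends_on", dep_key))
        if "source_repo" in doc:
            edges.add((subject, "built_from", doc["source_repo"]))
        if "score" in doc:
            nodes[subject]["score"] = doc["score"]
    return dict(nodes), edges


nodes, edges = collate(documents)
```

Here three documents that spell their subjects differently collapse into two nodes, and the implicit relationships (dependency, source repository) become explicit edges that a later query can follow.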
The good news is that, increasingly, one finds the upstream supply chain already enriched with attestations and metadata to power higher-level reasoning and insights. The bad news is that it is difficult or impossible today for software consumers, operators, and administrators to gather this data into a unified view across their software assets.
To understand something complex like the blast radius of a vulnerability, one needs to trace the relationship between a component and everything else in the portfolio, a task that could span thousands of metadata documents across hundreds of sources. In the open source ecosystem, the number of documents could reach into the millions.
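Once the relationships are in a graph, the blast-radius question reduces to a traversal over reverse dependency edges. The following is a minimal sketch under that assumption, using a hypothetical in-memory edge list and made-up artifact names rather than anything GUAC exposes.

```python
from collections import defaultdict, deque

# Hypothetical dependency edges: (dependent, dependency).
DEPENDS_ON = [
    ("web-frontend", "http-lib"),
    ("billing-service", "http-lib"),
    ("http-lib", "log-lib"),
    ("batch-job", "log-lib"),
    ("docs-site", "markdown-lib"),
]


def blast_radius(edges, vulnerable):
    """Return every artifact that directly or transitively depends on `vulnerable`."""
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = defaultdict(set)
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)

    affected = set()
    queue = deque([vulnerable])
    while queue:  # breadth-first walk up the dependency graph
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


affected = blast_radius(DEPENDS_ON, "log-lib")
print(sorted(affected))
```

For a vulnerability in `log-lib`, the walk reaches `http-lib` and `batch-job` directly and `web-frontend` and `billing-service` transitively, while `docs-site` is untouched. The hard part in practice is not the traversal but assembling trustworthy edges from scattered documents in the first place.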
GUAC aggregates and synthesizes software security metadata at scale and makes it meaningful and actionable. With GUAC in hand, we will be able to answer questions at three important stages of software supply chain security:
- Proactive, e.g.,
  - What are the most used critical components in my software supply chain ecosystem?
  - Where are the weak points in my overall security posture?
  - How do I prevent supply chain compromises before they happen?
  - Where am I exposed to risky dependencies?
- Operational, e.g.,
  - Is there evidence that the application I’m about to deploy meets organization policy?
  - Do all binaries in production trace back to a securely managed repository?
- Reactive, e.g.,
  - Which parts of my organization’s inventory are affected by new vulnerability X?
  - A suspicious project lifecycle event has occurred. Where is risk introduced to my organization?
  - An open source project is being deprecated. How am I affected?
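To make one of the operational questions concrete, here is a small sketch of checking whether every production binary traces back to a trusted repository. It assumes provenance has already been reduced to a simple artifact-to-repository mapping. The mapping, the trusted prefix, and the `untraceable` helper are hypothetical and do not reflect GUAC’s query interface.

```python
# Hypothetical provenance facts: artifact -> source repository it was built from.
BUILT_FROM = {
    "billing-service:1.4": "https://git.example.com/payments/billing",
    "web-frontend:2.0": "https://git.example.com/web/frontend",
    "batch-job:0.9": "https://personal.example.net/someone/batch-job",
}

# Assumption: "securely managed" means hosted under one of these prefixes.
TRUSTED_REPO_PREFIXES = ("https://git.example.com/",)


def untraceable(production_artifacts, built_from, trusted_prefixes):
    """Return artifacts with no provenance, or provenance outside trusted repos."""
    failures = {}
    for artifact in production_artifacts:
        repo = built_from.get(artifact)
        if repo is None:
            failures[artifact] = "no provenance found"
        elif not repo.startswith(trusted_prefixes):
            failures[artifact] = f"built from untrusted repo {repo}"
    return failures


production = [
    "billing-service:1.4",
    "web-frontend:2.0",
    "batch-job:0.9",
    "legacy-tool:3.1",
]
failures = untraceable(production, BUILT_FROM, TRUSTED_REPO_PREFIXES)
for artifact, reason in failures.items():
    print(f"{artifact}: {reason}")
```

The check flags both kinds of gap: an artifact built somewhere unexpected and an artifact with no provenance at all. Treating missing metadata as a failure rather than a pass is a design choice in this sketch, and a real policy might handle it differently.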
GUAC is an open source project on GitHub, and we are excited to get more folks involved and contributing (read the contributor guide to get started)! The project is still in its early stages, with a proof of concept that can ingest SLSA, SBOM, and Scorecard documents and support simple queries and exploration of software metadata. The next efforts will focus on scaling the current capabilities and adding new document types for ingestion. We welcome help and contributions of code or documentation.
Since the project will be consuming documents from many different sources and formats, we have put together a group of “Technical Advisory Members” to help advise the project. These members include representation from companies and groups such as SPDX, CycloneDX, Anchore, Aquasec, IBM, Intel, and many more. If you’re interested in participating as a contributor or advisor representing end users’ needs, or the sources of metadata GUAC consumes, you can register your interest in the relevant GitHub issue.
The GUAC team will be showcasing the project at KubeCon NA 2022 next week. Come by our session if you’ll be there and have a chat with us. We’d be happy to talk in person or virtually!