The software development process is quite complex and involves a number of independent actors. Developers check source code into a version control system, the code is compiled into software at a build farm, and CI/CD systems run multiple tests to ensure the software's quality, among a myriad of other operations. Finally, the software is packaged for distribution into a delivered product, to be consumed by end users. An attacker who is able to compromise any single step in the process can maliciously modify the software and harm any of the software's users.
To address these issues, we designed in-toto, a framework that cryptographically ensures the integrity of the software supply chain. in-toto grants the end user the ability to verify the software's supply chain from the project's inception to its deployment. We demonstrate in-toto's effectiveness on 30 software supply chain compromises that affected hundreds of millions of users, and showcase in-toto's usage over cloud-native, hybrid-cloud, and cloud-agnostic applications. in-toto is integrated into products and open source projects that are used by millions of people daily.
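The core idea can be illustrated with a toy sketch. This is not the in-toto API: the `record_step`/`verify_chain` helpers are hypothetical, and the HMAC "signature" is a stand-in for in-toto's asymmetric signatures and signed layout. Each step emits signed link metadata recording hashes of the artifacts it consumed (materials) and produced (products), and verification checks both the signatures and the artifact flow between consecutive steps:

```python
import hashlib
import hmac
import json

def record_step(name, key, materials, products):
    """Produce toy 'link' metadata for one supply-chain step: SHA-256
    hashes of the artifacts the step consumed and produced, signed with
    the step's key (HMAC here; real in-toto uses asymmetric keys)."""
    link = {
        "step": name,
        "materials": {p: hashlib.sha256(b).hexdigest() for p, b in materials.items()},
        "products": {p: hashlib.sha256(b).hexdigest() for p, b in products.items()},
    }
    payload = json.dumps(link, sort_keys=True).encode()
    link["sig"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return link

def verify_chain(links, keys):
    """Check every link's signature, then check artifact-flow continuity:
    each material of step n+1 must be a product of step n."""
    for link in links:
        payload = json.dumps(
            {k: v for k, v in link.items() if k != "sig"}, sort_keys=True
        ).encode()
        expected = hmac.new(keys[link["step"]], payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, link["sig"]):
            return False
    for prev, nxt in zip(links, links[1:]):
        for path, digest in nxt["materials"].items():
            if prev["products"].get(path) != digest:
                return False
    return True
```

Under this sketch, an attacker who swaps an artifact between the build and packaging steps breaks the hash continuity check, and one who rewrites a link's recorded hashes invalidates its signature.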
A (build) target specifies the information needed to automatically build a software artifact. This paper focuses on underutilized targets, an important dependency problem that we identified at Google. An underutilized target is one with files not needed by some of its dependents. Underutilized targets result in less modular code, overly large artifacts, slow builds, and unnecessary build and test triggers. To mitigate these problems, programmers decompose underutilized targets into smaller targets. However, manually decomposing a target is tedious and error-prone. Although we prove that finding the best target decomposition is NP-hard, we introduce a greedy algorithm that proposes a decomposition through iterative unification of the strongly connected components of the target. Our tool found that 19,994 of 40,000 Java library targets at Google can be decomposed into at least two targets. The results show that our tool is (1) efficient, because it analyzes a target in two minutes on average, and (2) effective, because for each of 1,010 targets it would save at least 50% of the total execution time of the tests triggered by the target.
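The SCC-based greedy idea can be sketched as follows. This is an illustrative reconstruction, not the paper's tool: the helper names and the toy grouping heuristic (unify SCCs that are consumed by the same set of dependents) are assumptions. Files in a dependency cycle must stay in one target, so strongly connected components are the atomic units of any decomposition:

```python
from collections import defaultdict

def strongly_connected_components(graph):
    """Kosaraju's algorithm. `graph` maps every node to a list of its
    successors (every node must appear as a key). Returns a list of
    SCCs, each a set of nodes."""
    visited, order = set(), []

    def dfs(start, g, post):
        # Iterative DFS that appends nodes to `post` in postorder.
        stack = [(start, iter(g[start]))]
        visited.add(start)
        while stack:
            node, it = stack[-1]
            advanced = False
            for w in it:
                if w not in visited:
                    visited.add(w)
                    stack.append((w, iter(g[w])))
                    advanced = True
                    break
            if not advanced:
                stack.pop()
                post.append(node)

    for v in list(graph):
        if v not in visited:
            dfs(v, graph, order)

    reverse = defaultdict(list)
    for v, ws in graph.items():
        reverse.setdefault(v, [])
        for w in ws:
            reverse[w].append(v)

    visited.clear()
    components = []
    for v in reversed(order):  # decreasing finish time
        if v not in visited:
            comp = []
            dfs(v, reverse, comp)
            components.append(set(comp))
    return components

def propose_decomposition(file_deps, dependent_uses):
    """Toy greedy step: treat each SCC of the target's file-dependency
    graph as atomic, then unify SCCs that are used by the same set of
    dependents into one proposed sub-target."""
    by_consumers = defaultdict(set)
    for unit in strongly_connected_components(file_deps):
        consumers = frozenset(
            d for d, files in dependent_uses.items() if files & unit
        )
        by_consumers[consumers] |= unit
    return list(by_consumers.values())
```

For example, if files `a` and `b` form a cycle while `c` stands alone, and one dependent uses only `a` while another uses only `c`, the sketch proposes splitting the target into `{a, b}` and `{c}`, so each dependent pulls in only the files it needs.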