Why Google Stores Billions of Lines of Code in a Single Repository | July 2016 | Communications of the ACM

Really cool overview of the tools Google uses to keep ~2 billion lines of code up to date at a ridiculous churn rate – all made possible by usable tooling and heavy automation.

Source: Why Google Stores Billions of Lines of Code in a Single Repository | July 2016 | Communications of the ACM

Key Points:

Monolithic repo: all devs can see all code, don't have to set arbitrary API and library boundaries, can refactor globally, and take responsibility for the impact their changes have on the entire repo. The repo is backed by Spanner and replicated across 10 data centers for excellent availability.

Workspaces: Developers access the repo through cloud-based storage exposed as a FUSE filesystem, making access to the gigantic repo transparent to the end user even on a bad connection. Workspaces are snapshotted, so devs can move back and forth between versions, even between commits.
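
The article doesn't detail CitC's internals, so here's a minimal sketch of the underlying idea using the third-party fusepy package: a read-only FUSE mount that serves file content from an in-memory dict standing in for the cloud storage backend. `RepoFS` and `REPO` are invented names for illustration, not Google's implementation.

```python
import errno
import stat
import sys

from fuse import FUSE, FuseOSError, Operations  # pip install fusepy

# Stand-in for the remote repo backend: path -> file content. In the real
# system this would be cloud storage holding the full repo plus the
# workspace's own edits as an overlay.
REPO = {
    "/README.md": b"monorepo workspace demo\n",
    "/src/main.py": b"print('hello from the repo')\n",
}

class RepoFS(Operations):
    """Read-only view of REPO, mounted as a local directory."""

    def getattr(self, path, fh=None):
        if path == "/" or any(p.startswith(path + "/") for p in REPO):
            return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
        if path in REPO:
            return {"st_mode": stat.S_IFREG | 0o444,
                    "st_size": len(REPO[path]), "st_nlink": 1}
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        prefix = "" if path == "/" else path
        children = {p[len(prefix) + 1:].split("/")[0]
                    for p in REPO if p.startswith(prefix + "/")}
        return [".", ".."] + sorted(children)

    def read(self, path, size, offset, fh):
        # The real system fetches content lazily over the network, which is
        # what makes the multi-terabyte repo feel local; here we just slice
        # the in-memory bytes.
        return REPO[path][offset:offset + size]

if __name__ == "__main__":
    # Usage: python repofs.py /tmp/repo-mount
    FUSE(RepoFS(), sys.argv[1], foreground=True, ro=True)
```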

Branches: Development happens against HEAD (trunk-based); old and new code paths live side by side behind conditional flags, and the old path is deleted once the new one is proven.
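
As a rough illustration of that pattern (the flag name and plumbing below are made up, not Google's actual flags framework), both paths ship at HEAD and a flag selects between them:

```python
# Sketch of flag-gated code paths instead of long-lived branches.
# --use_new_search is a hypothetical flag for illustration only.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--use_new_search", action="store_true",
                    help="Route queries through the rewritten search path.")

def search_old(query):
    return f"old search results for {query!r}"

def search_new(query):
    return f"new search results for {query!r}"

def search(query, flags):
    # Both paths exist at HEAD; the flag flips first in tests, then for a
    # slice of production traffic. Once the new path is proven, the flag
    # and search_old() are deleted.
    if flags.use_new_search:
        return search_new(query)
    return search_old(query)

if __name__ == "__main__":
    flags = parser.parse_args()
    print(search("monorepo", flags))
```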

Review: Automated testing, both global and directory-specific, plus mandatory human review by directory owners. Bots perform large-scale find/replace or more significant refactorings across the entire codebase; a toy version of such a bot is sketched below.
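
Rosie itself isn't public, but the core mechanic — walk the tree, apply a mechanical rewrite, and batch the edits into per-directory changes for the owners to review — can be sketched with the standard library alone. Everything here (the symbol names, the grouping policy) is hypothetical, not Rosie's actual design:

```python
# Toy large-scale-change bot: apply a regex rewrite across a source tree
# and group touched files by top-level directory, mimicking how a giant
# change gets split into per-owner reviewable pieces.
import pathlib
import re
from collections import defaultdict

OLD_API = re.compile(r"\blegacy_rpc_call\(")   # hypothetical old symbol
NEW_API = "rpc_call("                          # hypothetical replacement

def rewrite_tree(root):
    """Rewrite files under root; return {top_level_dir: [changed files]}."""
    changes = defaultdict(list)
    for path in pathlib.Path(root).rglob("*.py"):
        src = path.read_text()
        new_src = OLD_API.sub(NEW_API, src)
        if new_src != src:
            path.write_text(new_src)
            top = path.relative_to(root).parts[0]
            changes[top].append(str(path))
    return changes

if __name__ == "__main__":
    # Each per-directory batch would become its own change, sent to that
    # directory's owners for review before submission.
    for owner_dir, files in rewrite_tree(".").items():
        print(f"change for //{owner_dir}: {len(files)} file(s)")
```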

This required extensive investment in tooling, including a distributed build system (Blaze), a test framework, automated refactoring systems (Rosie), static analysis (Tricorder), code review (Critique), and the code indexing and search infrastructure that powers all of the above (CodeSearch). Open source counterparts for some of these are on GitHub:

Code indexing: https://github.com/google/kythe

Static analysis: https://github.com/google/shipshape

Build system: https://github.com/bazelbuild/bazel

Code search: https://github.com/google/zoekt

