DevOps and containerization at scale

Since the split from eBay in 2015 PayPal has turbo charged DevOps. A key example of this in action is adopting Docker as containerization technology to enhance our developer experience, reduce drift in different environments like test and production and to bring higher efficiency of resource utilization in our data centers. This session will discuss PayPal’s journey to docker-ize 2,500 apps and hundreds of thousands of container instances.

Making “Push On Green” a Reality: Issues & Actions Involved in Maintaining a Production Service

Updating production software is a process that may require dozens, if not hundreds, of steps. These include creating and testing new code, building new binaries and packages, associating the packages with a versioned release, updating the jobs in production datacenters, possibly modifying database schemata, and testing and verifying the results. There are boxes to check and approvals to seek, and the more automated the process, the easier it becomes. When releases can be made faster, it is possible to release more often, and, organizationally, one becomes less afraid to “release early, release often” [6, 7]. And that’s what we describe in this article—making rollouts as easy and as automated as possible. When a “green” condition is detected, we can more quickly perform a new rollout. Humans are still needed somewhere in the loop, but we strive to reduce the purely mechanical toil they need to perform.

Source: https://www.usenix.org/system/files/login/issues/login_1410_online.pdf

Borg, Omega, and Kubernetes – Lessons learned from three containermanagement systems over a decade

Though widespread interest in software containers is a relatively recent phenomenon, at Google we have been managing Linux containers at scale for more than ten years and built three different container management systems in that time. Each system was heavily influenced by its predecessors, even though they were developed for different reasons. This article describes the lessons we’ve learned from developing and operating them.

Source: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44843.pdf

Design patterns for container-based distributed systems

In the late 1980s and early 1990s, object-oriented programming
revolutionized software development, popularizing
the approach of building of applications as collections
of modular components. Today we are seeing
a similar revolution in distributed system development,
with the increasing popularity of microservice architectures
built from containerized software components.
Containers [15] [22] [1] [2] are particularly well-suited
as the fundamental “object” in distributed systems by
virtue of the walls they erect at the container boundary.
As this architectural style matures, we are seeing the
emergence of design patterns, much as we did for objectoriented
programs, and for the same reason – thinking in
terms of objects (or containers) abstracts away the lowlevel
details of code, eventually revealing higher-level
patterns that are common to a variety of applications and
algorithms.
This paper describes three types of design patterns
that we have observed emerging in container-based distributed
systems: single-container patterns for container
management, single-node patterns of closely cooperating
containers, and multi-node patterns for distributed
algorithms. Like object-oriented patterns before them,
these patterns for distributed computation encode best
practices, simplify development, and make the systems
where they are used more reliable.

Source: https://www.usenix.org/system/files/conference/hotcloud16/hotcloud16_burns.pdf

Continuous Pipelines at Google

This article focuses on the real life challenges of managing data processing pipelines of depth and complexity. It considers the frequency continuum between periodic pipelines that run very infrequently through continuous pipelines that never stop running, and discusses the discontinuities that can produce significant operational problems. A fresh take on the master-­slave model is presented as a more reliable and better scaling alternative to the periodic pipeline for processing Big Data.

Source: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43790.pdf