Zanzibar: Google’s Consistent, Global Authorization System

Determining whether online users are authorized to access
digital objects is central to preserving privacy. This paper presents the design, implementation, and deployment
of Zanzibar, a global system for storing and evaluating access control lists. Zanzibar provides a uniform data model
and configuration language for expressing a wide range of
access control policies from hundreds of client services at
Google, including Calendar, Cloud, Drive, Maps, Photos,
and YouTube. Its authorization decisions respect causal ordering of user actions and thus provide external consistency
amid changes to access control lists and object contents.
Zanzibar scales to trillions of access control lists and millions
of authorization requests per second to support services used
by billions of people. It has maintained 95th-percentile latency of less than 10 milliseconds and availability of greater
than 99.999% over 3 years of production use

Source: https://storage.googleapis.com/pub-tools-public-publication-data/pdf/0749e1e54ded70f54e1f646cd440a5a523c69164.pdf

PROCHLO: Strong Privacy for Analytics in the Crowd

The large-scale monitoring of computer users’ software
activities has become commonplace, e.g., for application
telemetry, error reporting, or demographic profiling. This
paper describes a principled systems architecture—Encode,
Shuffle, Analyze (ESA)—for performing such monitoring
with high utility while also protecting user privacy. The ESA
design, and its PROCHLO implementation, are informed by
our practical experiences with an existing, large deployment
of privacy-preserving software monitoring.
With ESA, the privacy of monitored users’ data is guaranteed
by its processing in a three-step pipeline. First, the data
is encoded to control scope, granularity, and randomness.
Second, the encoded data is collected in batches subject to
a randomized threshold, and blindly shuffled, to break linkability
and to ensure that individual data items get “lost in the
crowd” of the batch. Third, the anonymous, shuffled data is
analyzed by a specific analysis engine that further prevents
statistical inference attacks on analysis results.
ESA extends existing best-practice methods for sensitivedata
analytics, by using cryptography and statistical techniques
to make explicit how data is elided and reduced in
precision, how only common-enough, anonymous data is analyzed,
and how this is done for only specific, permitted purposes.
As a result, ESA remains compatible with the established
workflows of traditional database analysis.
Strong privacy guarantees, including differential privacy,
can be established at each processing step to defend
against malice or compromise at one or more of those steps.
PROCHLO develops new techniques to harden those steps,
including the Stash Shuffle, a novel scalable and efficient
oblivious-shuffling algorithm based on Intel’s SGX, and new
applications of cryptographic secret sharing and blinding.
We describe ESA and PROCHLO, as well as experiments
that validate their ability to balance utility and privacy.

Source: https://storage.googleapis.com/pub-tools-public-publication-data/pdf/9161f08531933ca6e401834898b4e33362c55430.pdf

Ubiq: A Scalable and Fault-tolerant Log Processing Infrastructure

Abstract. Most of today’s Internet applications generate vast amounts
of data (typically, in the form of event logs) that needs to be processed
and analyzed for detailed reporting, enhancing user experience and increasing
monetization. In this paper, we describe the architecture of
Ubiq, a geographically distributed framework for processing continuously
growing log files in real time with high scalability, high availability and
low latency. The Ubiq framework fully tolerates infrastructure degradation
and data center-level outages without any manual intervention. It
also guarantees exactly-once semantics for application pipelines to process
logs as a collection of multiple events. Ubiq has been in production
for Google’s advertising system for many years and has served as a critical
log processing framework for several dozen pipelines. Our production
deployment demonstrates linear scalability with machine resources, extremely
high availability even with underlying infrastructure failures, and
an end-to-end latency of under a minute.

Source: https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45805.pdf

Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization

This paper presents our design and experience with Andromeda,
Google Cloud Platform’s network virtualization
stack. Our production deployment poses several challenging
requirements, including performance isolation among
customer virtual networks, scalability, rapid provisioning
of large numbers of virtual hosts, bandwidth and latency
largely indistinguishable from the underlying hardware,
and high feature velocity combined with high availability.
Andromeda is designed around a flexible hierarchy of
flow processing paths. Flows are mapped to a programming
path dynamically based on feature and performance
requirements. We introduce the Hoverboard programming
model, which uses gateways for the long tail of low bandwidth
flows, and enables the control plane to program
network connectivity for tens of thousands of VMs in
seconds. The on-host dataplane is based around a highperformance
OS bypass software packet processing path.
CPU-intensive per packet operations with higher latency
targets are executed on coprocessor threads. This architecture
allows Andromeda to decouple feature growth from
fast path performance, as many features can be implemented
solely on the coprocessor path. We demonstrate
that the Andromeda datapath achieves performance that is
competitive with hardware while maintaining the flexibility
and velocity of a software-based architecture.

Source: https://www.usenix.org/system/files/conference/nsdi18/nsdi18-dalton.pdf

Beyond Corp 1 – A New Approach to Enterprise Security

Virtually every company today uses firewalls to enforce perimeter
security. However, this security model is problematic because, when
that perimeter is breached, an attacker has relatively easy access to a
company’s privileged intranet. As companies adopt mobile and cloud technologies,
the perimeter is becoming increasingly difficult to enforce. Google
is taking a different approach to network security. We are removing the
requirement for a privileged intranet and moving our corporate applications
to the Internet.

Source: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43231.pdf