Taking the Edge off with Espresso: Scale, Reliability and Programmability for Global Internet Peering

We present the design of Espresso, Google’s SDN-based Internet
peering edge routing infrastructure. This architecture grew out of a
need to exponentially scale the Internet edge cost-effectively and to
enable application-aware routing at Internet-peering scale. Espresso
utilizes commodity switches and host-based routing/packet processing
to implement a novel fine-grained traffic engineering capability.
Overall, Espresso provides Google a scalable peering edge that is
programmable, reliable, and integrated with global traffic systems.
Espresso also greatly accelerated deployment of new networking
features at our peering edge. Espresso has been in production for
two years and serves over 22% of Google’s total traffic to the Internet.

Source: http://delivery.acm.org/10.1145/3100000/3098854/p432-Yap.pdf?ip=71.127.43.118&id=3098854&acc=OA&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E5945DC2EABF3343C&CFID=822652791&CFTOKEN=84906944&__acm__=1508966541_3307496633bb16bac9c9b6dbf3cd6d11

Espresso makes Google cloud faster, more available and cost effective by extending SDN to the public internet

Today, we gave a keynote presentation at the Open Networking Summit, where we shared details about Espresso, Google’s peering edge architecture—the latest offering in our Software Defined Networking (SDN) strategy. Espresso has been in production for over two years and routes 20 percent of our total traffic to the internet—and growing. It’s changing the way traffic is directed at the peering edge, delivering unprecedented scale, flexibility and efficiency.

Source: https://www.blog.google/topics/google-cloud/making-google-cloud-faster-more-available-and-cost-effective-extending-sdn-public-internet-espresso/

Cutting the Cord: a Robust Wireless Facilities Network for Data Centers

Today’s network control and management traffic are limited by
their reliance on existing data networks. Fate sharing in this context
is highly undesirable, since control traffic has very different availability
and traffic delivery requirements. In this paper, we explore
the feasibility of building a dedicated wireless facilities network for
data centers. We propose Angora, a low-latency facilities network
using low-cost, 60GHz beamforming radios that provides robust
paths decoupled from the wired network, and flexibility to adapt to
workloads and network dynamics. We describe our solutions to address
challenges in link coordination, link interference and network
failures. Our testbed measurements and simulation results show
that Angora enables a large number of low-latency control paths to
run concurrently, while providing low latency end-to-end message
delivery with high tolerance for radio and rack failures.

Source: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43860.pdf

Lessons learned from B4, Google’s SDN WAN

Google’s B4 wide area network was first revealed several years ago. The outside observer might have thought, “Google’s B4 is finished. I wonder what they’re going to do next.” Turns out, once any network is in production @scale, there’s a continued need to make it better. Subhasree Mandal covered the reality of how Google iterated multiple times on different parts of B4 to improve its performance, availability, and scalability. Several of the challenges and solutions that Subhasree detailed were definitely at the intersection of networking and distributed systems. B4 was covered in a SIGCOMM 2013 paper from Google.

Onix: A Distributed Control Platform for Large-scale Production Networks

Computer networks lack a general control paradigm,
as traditional networks do not provide any network-wide
management abstractions. As a result, each new
function (such as routing) must provide its own state
distribution, element discovery, and failure recovery
mechanisms. We believe this lack of a common control
platform has significantly hindered the development of
flexible, reliable and feature-rich network control planes.
To address this, we present Onix, a platform on top of
which a network control plane can be implemented as a
distributed system. Control planes written within Onix
operate on a global view of the network, and use basic
state distribution primitives provided by the platform.
Thus Onix provides a general API for control plane
implementations, while allowing them to make their own
trade-offs among consistency, durability, and scalability.

Source: https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Koponen.pdf

FAUCET: Deploying SDN in the Enterprise

Since the publication of "OpenFlow: Enabling Innovation in Campus Networks" in 2008, there has been a lot of published work and experience with SDN and OpenFlow in large networks and in datacenters, including at Google. In this article we will discuss an open source SDN controller, FAUCET. FAUCET was created to bring the benefits of SDN to a typical enterprise network and has been deployed in various settings, including the Open Networking Foundation, which runs an instance of FAUCET as their office network. FAUCET delivers high forwarding performance using switch hardware, while enabling operators to add features to their networks and deploy them quickly, in many cases without needing to change (or even reboot) hardware – and interoperates with neighboring non-SDN network devices.

Source: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45641.pdf