SREcon17 Asia/Australia: SRE Your gRPC—Building Reliable Distributed Systems Illustrated with gRPC
Grainne Sheerin and Gabe Krabbe, Google
Distributed systems have sharp edges, and we have a wealth of experience cutting ourselves on them. We want to share our experience with SREs elsewhere, so they can skip making the same mistakes and join us making exciting new ones instead!
We will share practical suggestions from 14 years of failing gracefully:
– In a distributed service, every component is a frontend to another one down the stack. How can it deal with backend failures so that the service as a whole does not go down?
– In a distributed service, every component is a backend for another one up the stack. How can it be scaled and managed, avoiding overload and under-use?
– In a distributed service, latency is often the biggest uncertainty. How can it be kept predictable?
– In a distributed service, contributions to availability, processing, and latency costs are hard to assign. When things (inevitably) go wrong, which components are to blame? When they work, where are the biggest opportunities for improvement?
We will cover best and worst practices, using specific gRPC examples for illustration.
Sign up to find out more about SREcon at https://srecon.usenix.org
via YouTube https://youtu.be/eoy9z0UlaII
Darren Bilby: A Decade of Lessons in Incident Response
A 10-year veteran at Google, Bilby was the tech lead for Google’s Global Incident Response Team for six years, managed Google’s European detection team in Zürich for two years and has also worked as a software engineer building out Google’s security tools. He was also the founder and a core developer of the open source GRR Incident Response project.
During his lecture, Bilby discussed the key lessons he learned in incident response at Google over the past 10 years, particularly those learned the hard way, and what other security experts can take from them.
Read the full summary at https://www.first.org/blog/20170613-DarrenBilby_keynote
via YouTube https://youtu.be/6qssVEHrpWo
If you’re familiar with the articles about Google’s BeyondCorp network security model published in ;login: [1-3] over the past two years, you may be thinking, “That all sounds good, but how does my organization move from where we are today to a similar model? What do I need to do? And what’s the potential impact on my company and my employees?” This article discusses how we moved from our legacy network to the BeyondCorp model, changing the fundamentals of network access, without reducing the company’s productivity.
from mikecb’s Activity on GitHub: https://github.com/ksonnet/kubecfg