Omega: flexible, scalable schedulers for large compute clusters

Increasing scale and the need for rapid response to changing
requirements are hard to meet with current monolithic cluster
scheduler architectures. This restricts the rate at which
new features can be deployed, decreases efficiency and utilization,
and will eventually limit cluster growth. We present
a novel approach to address these needs using parallelism,
shared state, and lock-free optimistic concurrency control.
We compare this approach to existing cluster scheduler
designs, evaluate how much interference between schedulers
occurs and how much it matters in practice, present some
techniques to alleviate it, and finally discuss a use case
highlighting the advantages of our approach – all driven by
real-life Google production workloads.
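The shared-state, optimistic-concurrency approach the abstract describes can be sketched roughly as follows: each scheduler takes a private snapshot of the shared cluster state, makes placement decisions against it, and then tries to commit its claims atomically, succeeding only if no other scheduler has modified the affected machines in the meantime. This is an illustrative sketch, not Omega's actual implementation; all names (`CellState`, `schedule`, the version counters) are hypothetical.

```python
# Hypothetical sketch of shared-state scheduling with optimistic
# concurrency control, in the spirit of the approach described above.
# All class/function names are illustrative, not from the paper.

class CellState:
    """Shared cluster state: machine -> {free CPUs, version counter}."""
    def __init__(self, machines):
        self.state = {m: {"free": c, "version": 0} for m, c in machines.items()}

    def snapshot(self):
        # Each scheduler works against its own private copy of the state.
        return {m: dict(v) for m, v in self.state.items()}

    def commit(self, claims):
        """Atomically apply claims [(machine, cpus, seen_version)].
        Succeeds only if every claimed machine is unchanged since the
        snapshot (its version still matches) and still has capacity."""
        for machine, cpus, seen in claims:
            cell = self.state[machine]
            if cell["version"] != seen or cell["free"] < cpus:
                return False  # conflict: another scheduler won; caller retries
        for machine, cpus, _ in claims:
            self.state[machine]["free"] -= cpus
            self.state[machine]["version"] += 1
        return True


def schedule(cell, job_cpus, max_retries=5):
    """Try to place a job, retrying when an optimistic commit conflicts."""
    for _ in range(max_retries):
        snap = cell.snapshot()
        for m, v in snap.items():
            if v["free"] >= job_cpus:
                if cell.commit([(m, job_cpus, v["version"])]):
                    return m
                break  # conflict detected: re-snapshot and retry
        else:
            return None  # no machine has enough free capacity
    return None
```

Under low contention most commits succeed on the first try; when two schedulers race for the same machine, one commit fails and that scheduler retries against a fresh snapshot, which is the interference the paper measures and mitigates.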


