Redundancy Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions to Single Errors and Corruptions

We analyze how modern distributed storage systems behave
in the presence of file-system faults such as data
corruption and read and write errors. We characterize
eight popular distributed storage systems and uncover
numerous bugs related to file-system fault tolerance. We
find that modern distributed systems do not consistently
use redundancy to recover from file-system faults: a
single file-system fault can cause catastrophic outcomes
such as data loss, corruption, and unavailability. Our results
have implications for the design of next generation
fault-tolerant distributed and cloud storage systems.

Source: https://www.usenix.org/system/files/conference/fast17/fast17-ganesan.pdf
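
The paper's central finding is that systems often fail to consult a redundant copy when a local read returns corrupt data. A minimal sketch (all names hypothetical, not the paper's code) of the discipline the paper argues for: verify a checksum on every read, fall back to a replica on mismatch, and repair the bad copy.

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class Replica:
    """One node's copy of a block, stored alongside its checksum."""
    def __init__(self, data: bytes):
        self.data = data
        self.stored_checksum = checksum(data)

    def corrupt(self, garbage: bytes):
        # Simulate a file-system fault: the data changes, the checksum does not.
        self.data = garbage

def read_with_recovery(replicas):
    """Return the first copy whose checksum verifies, repairing bad copies.

    A single local fault should be masked by redundancy, not propagated
    or turned into data loss, corruption, or unavailability.
    """
    good = None
    for r in replicas:
        if checksum(r.data) == r.stored_checksum:
            good = r
            break
    if good is None:
        raise IOError("all replicas corrupt: data loss")
    for r in replicas:  # repair corrupted copies from the good one
        if checksum(r.data) != r.stored_checksum:
            r.data = good.data
            r.stored_checksum = good.stored_checksum
    return good.data

replicas = [Replica(b"block-0"), Replica(b"block-0"), Replica(b"block-0")]
replicas[0].corrupt(b"XXXXXXX")          # a single file-system fault
assert read_with_recovery(replicas) == b"block-0"
assert replicas[0].data == b"block-0"    # corrupted copy repaired
```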


Flexible Network Bandwidth and Latency Provisioning in the Datacenter

Abstract
Predictably sharing the network is critical to achieving
high utilization in the datacenter. Past work has focused
on providing bandwidth to endpoints, but often
we want to allocate resources among multi-node services.
In this paper, we present Parley, which provides
service-centric minimum bandwidth guarantees, which
can be composed hierarchically. Parley also supports
service-centric weighted sharing of bandwidth in excess
of these guarantees. Further, we show how to configure
these policies so services can get low latencies even at
high network load. We evaluate Parley on a multi-tiered
oversubscribed network connecting 90 machines, each
with a 10Gb/s network interface, and demonstrate that
Parley is able to meet its goals.

Source: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43871.pdf
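
The allocation policy the abstract describes — a minimum guarantee per service, with excess capacity shared by weight — can be sketched as water-filling on a single link. This is a flat, illustrative simplification (Parley composes such guarantees hierarchically across a real network; all names and numbers here are hypothetical), assuming the guarantees fit within link capacity.

```python
def allocate(capacity, services):
    """services: list of (name, guarantee, weight, demand) on one link.

    Each service first receives min(guarantee, demand); leftover capacity
    is then shared in proportion to weights, capped by demand, using
    water-filling over the still-unsatisfied services.  Assumes the
    guarantees together fit within `capacity`.
    """
    alloc = {name: min(g, d) for name, g, w, d in services}
    remaining = capacity - sum(alloc.values())
    active = [(n, w, d) for n, g, w, d in services if alloc[n] < d]
    while remaining > 1e-9 and active:
        total_w = sum(w for _, w, _ in active)
        share = remaining / total_w
        still_unsatisfied = []
        for n, w, d in active:
            give = min(w * share, d - alloc[n])
            alloc[n] += give
            remaining -= give
            if alloc[n] < d - 1e-9:
                still_unsatisfied.append((n, w, d))
        active = still_unsatisfied
    return alloc

# Two services on a 10 Gb/s link: equal 2 Gb/s guarantees, but B carries
# weight 3 vs A's weight 1 for the 6 Gb/s of excess capacity.
alloc = allocate(10.0, [("A", 2.0, 1.0, 10.0), ("B", 2.0, 3.0, 10.0)])
assert abs(alloc["A"] - 3.5) < 1e-6   # 2 + 1/4 of the excess
assert abs(alloc["B"] - 6.5) < 1e-6   # 2 + 3/4 of the excess
```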

Trumpet: Timely and Precise Triggers in Data Centers

As data centers grow larger and strive to provide tight performance
and availability SLAs, their monitoring infrastructure
must move from passive systems that provide aggregated
inputs to human operators, to active systems that enable programmed
control. In this paper, we propose Trumpet, an
event monitoring system that leverages CPU resources and
end-host programmability to monitor every packet and report
events at millisecond timescales. Trumpet users can express
many network-wide events, and the system efficiently detects
these events using triggers at end-hosts. Using careful design,
Trumpet can evaluate triggers by inspecting every packet at
full line rate even on future generations of NICs, scale to
thousands of triggers per end-host while bounding packet
processing delay to a few microseconds, and report events
to a controller within 10 milliseconds, even in the presence
of attacks. We demonstrate these properties using an implementation
of Trumpet, and also show that it allows operators
to describe new network events such as detecting correlated
bursts and loss, identifying the root cause of transient congestion,
and detecting short-term anomalies at the scale of a data
center tenant.

Source: http://www.cs.yale.edu/homes/yu-minlan/writeup/sigcomm16.pdf
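
An end-host trigger of the kind the abstract describes pairs a per-packet predicate with a threshold over a short time window. A toy single-trigger sketch (a drastic simplification of Trumpet's two-phase design; all names are hypothetical):

```python
from collections import defaultdict

class Trigger:
    """Fire an event if packets matching `predicate` exceed `threshold`
    within one `window`-second interval (hypothetical simplification of
    per-packet matching plus periodic evaluation at an end-host)."""
    def __init__(self, name, predicate, threshold, window):
        self.name, self.predicate = name, predicate
        self.threshold, self.window = threshold, window
        self.counts = defaultdict(int)   # window index -> matched packets

    def observe(self, pkt, now):
        # Per-packet fast path: just a predicate check and a counter bump.
        if self.predicate(pkt):
            self.counts[int(now // self.window)] += 1

    def sweep(self, now):
        """Periodic check; returns events for completed windows that
        crossed the threshold, then forgets their state."""
        current = int(now // self.window)
        fired = [w for w, c in self.counts.items()
                 if w < current and c >= self.threshold]
        for w in fired:
            del self.counts[w]
        return [(self.name, w) for w in fired]

burst = Trigger("burst-to-10.0.0.1",
                lambda p: p["dst"] == "10.0.0.1",
                threshold=3, window=0.01)
t = 0.0
for _ in range(5):                       # 5 packets inside one 10 ms window
    burst.observe({"dst": "10.0.0.1"}, t)
    t += 0.001
events = burst.sweep(now=0.02)           # the next window has begun
assert events == [("burst-to-10.0.0.1", 0)]
```

Keeping the per-packet path to a predicate and a counter increment is what makes it plausible to inspect every packet at line rate; the heavier work happens in the periodic sweep.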

Incremental, Iterative Data Processing with Timely Dataflow

We describe the timely dataflow model for distributed computation and its implementation in the Naiad system. The model supports stateful iterative and incremental computations. It enables both low-latency stream processing and high-throughput batch processing, using a new approach to coordination that combines asynchronous and fine-grained synchronous execution. We describe two of the programming frameworks built on Naiad: GraphLINQ for parallel graph processing, and differential dataflow for nested iterative and incremental computations. We show that a general-purpose system can achieve performance that matches, and sometimes exceeds, that of specialized systems.

Source: http://delivery.acm.org/10.1145/2990000/2983551/p75-murray.pdf?ip=73.135.106.24&id=2983551&acc=OA&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E5945DC2EABF3343C&CFID=887680414&CFTOKEN=71994990&__acm__=1484169807_1cb0e26f9cab8e683902618d19f62d56
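
The incremental flavor of the model can be illustrated in miniature: instead of recomputing an aggregate from scratch each epoch, an operator consumes (record, diff) input changes and emits only output changes. This is a toy, single-process sketch, not Naiad's distributed, timestamped implementation; all names are illustrative.

```python
from collections import defaultdict

def incremental_count(epochs):
    """Each epoch is a list of (key, diff) input changes; emit only the
    output *changes* per epoch rather than recomputing counts wholesale."""
    counts = defaultdict(int)
    output = []
    for changes in epochs:
        delta = defaultdict(int)
        for key, diff in changes:        # consolidate this epoch's changes
            delta[key] += diff
        epoch_out = {}
        for key, d in delta.items():
            if d != 0:
                counts[key] += d
                epoch_out[key] = (counts[key] - d, counts[key])
        output.append(epoch_out)         # key -> (old count, new count)
    return output

out = incremental_count([
    [("a", +1), ("b", +1), ("a", +1)],   # epoch 0: insert a, b, a
    [("a", -1)],                         # epoch 1: retract one a
])
assert out[0]["a"] == (0, 2) and out[0]["b"] == (0, 1)
assert out[1]["a"] == (2, 1)             # epoch 1 touches only "a"
```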

Federated Optimization: Distributed Machine Learning for On-Device Intelligence

We introduce a new and increasingly relevant setting for distributed optimization in machine learning, where the data defining the optimization are unevenly distributed over an extremely large number of nodes. The goal is to train a high-quality centralized model. We refer to this setting as Federated Optimization. In this setting, communication efficiency is of the utmost importance and minimizing the number of rounds of communication is the principal goal. A motivating example arises when we keep the training data locally on users’ mobile devices instead of logging it to a data center for training. In federated optimization, the devices are used as compute nodes performing computation on their local data in order to update a global model. We suppose that we have an extremely large number of devices in the network — as many as the number of users of a given service, each of which has only a tiny fraction of the total data available. In particular, we expect the number of data points available locally to be much smaller than the number of devices. Additionally, since different users generate data with different patterns, it is reasonable to assume that no device has a representative sample of the overall distribution. We show that existing algorithms are not suitable for this setting, and propose a new algorithm which shows encouraging experimental results for sparse convex problems. This work also sets a path for future research needed in the context of federated optimization.

Source: https://arxiv.org/pdf/1610.02527v1.pdf
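
The setting itself is easy to sketch: in each round the server sends the current model to devices, each device trains on data that never leaves it, and the server aggregates the results weighted by local data size. This is a generic illustration of one communication round on a toy 1-D least-squares problem, not the paper's specific algorithm.

```python
import random

def local_update(model, data, lr=0.1, steps=5):
    """One device's update: a few SGD steps on a 1-D least-squares
    objective (model is a single weight w; data are (x, y) pairs)."""
    w = model
    for _ in range(steps):
        x, y = random.choice(data)
        w -= lr * 2 * (w * x - y) * x   # gradient of (w*x - y)^2
    return w

def federated_round(model, devices):
    """One communication round: every device trains locally; the server
    averages the resulting models, weighted by local data size."""
    updates = [(local_update(model, d), len(d)) for d in devices]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total

random.seed(0)
# Two devices, unevenly sized local datasets, both drawn from y = 2x.
devices = [[(1.0, 2.0)] * 3, [(2.0, 4.0)] * 5]
w = 0.0
for _ in range(10):                      # rounds are the scarce resource
    w = federated_round(w, devices)
assert abs(w - 2.0) < 0.05               # converges to the true weight
```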

Distributed Authorization in Vanadium

Abstract. In this tutorial, we present an authorization model for distributed
systems that operate with limited internet connectivity. Reliable
internet access remains a luxury for a majority of the world’s population.
Even for those who can afford it, a dependence on internet connectivity
may lead to sub-optimal user experiences. With a focus on decentralized
deployment, we present an authorization model that is suitable for
scenarios where devices right next to each other (such as a sensor or
a friend’s phone) should be able to communicate securely in a peer-to-peer
manner. The model has been deployed as part of an open-source
distributed application framework called Vanadium. As part of this tutorial,
we survey some of the key ideas and techniques used in distributed
authorization, and explain how they are combined in the design of our
model.

Source: http://theory.stanford.edu/~ataly/Papers/fosad16.pdf
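
Vanadium's authorization model is built on chained, caveated delegations ("blessings") that can be checked locally, without contacting a server — which is what makes it work with limited connectivity. A toy sketch of checking such a delegation chain, with no cryptography and all names hypothetical (real blessings are certificate chains bound to public keys):

```python
import time

class Caveat:
    """A restriction attached to a delegation, e.g. an expiry time."""
    def __init__(self, validate):
        self.validate = validate   # context dict -> bool

def expires_at(t):
    return Caveat(lambda ctx: ctx["time"] <= t)

class Blessing:
    """A delegation chain of (name-extension, caveats) links,
    e.g. alice -> alice:phone."""
    def __init__(self, links):
        self.links = links

    def name(self):
        return ":".join(ext for ext, _ in self.links)

    def valid(self, ctx):
        # Every caveat on every link in the chain must hold.
        return all(c.validate(ctx)
                   for _, caveats in self.links for c in caveats)

def authorized(blessing, acl_patterns, ctx):
    """Grant access if all caveats hold and the blessing name matches an
    ACL pattern; a pattern like 'alice' also matches extensions of alice."""
    if not blessing.valid(ctx):
        return False
    name = blessing.name()
    return any(name == p or name.startswith(p + ":")
               for p in acl_patterns)

# Alice delegates to her phone, with an expiry caveat on that link.
phone = Blessing([("alice", []),
                  ("phone", [expires_at(t=2_000_000_000)])])
ctx = {"time": time.time()}   # checked locally: no internet access needed
assert authorized(phone, ["alice"], ctx)
assert not authorized(phone, ["bob"], ctx)
```

The design point worth noting is that both the chain and its caveats are validated by the relying device itself, so two devices right next to each other can authorize one another peer-to-peer.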