How we use Colossus to improve storage efficiency
Determining whether online users are authorized to access
digital objects is central to preserving privacy. This paper presents the design, implementation, and deployment
of Zanzibar, a global system for storing and evaluating access control lists. Zanzibar provides a uniform data model
and configuration language for expressing a wide range of
access control policies from hundreds of client services at
Google, including Calendar, Cloud, Drive, Maps, Photos,
and YouTube. Its authorization decisions respect causal ordering of user actions and thus provide external consistency
amid changes to access control lists and object contents.
Zanzibar scales to trillions of access control lists and millions
of authorization requests per second to support services used
by billions of people. It has maintained 95th-percentile latency of less than 10 milliseconds and availability of greater
than 99.999% over 3 years of production use
References and links
It is very difficult to find accurate information about the correctness and isolation levels offered by modern distributed databases, and the operational conditions required to achieve them. Developers use different terms for the same thing, the meaning of terms varies or is ambiguous, and sometimes vendors themselves do not actually know.
At Fauna, we care a lot about accurately describing which guarantees different systems actually provide. This is our effort to centralize a description of which database does what, based on publicly available information (documentation, source code, third-party analyses, and developers’ comments). For consistency’s sake, we will use the terminology from Kyle Kingsbury’s explanation on the Jepsen site. The chart is ranked by the maximum multi-partition isolation level offered.
The data is based on statements about isolation levels from vendor documentation, white papers, and developer commentary, exclusive of aspirational marketing statements. We have tried to be neutral in the characterization of the various systems’ architectural properties. Whether the system implementations uphold these guarantees is addressed elsewhere. If you haven’t already, please see FaunaDB’s own Jepsen results for confirmation that FaunaDB upholds its guarantees.
Given the recent explosion of interest in streaming data and online algorithms, clustering of time series
subsequences, extracted via a sliding window, has received much attention. In this work we make a
surprising claim. Clustering of time series subsequences is meaningless. More concretely, clusters extracted
from these time series are forced to obey a certain constraint that is pathologically unlikely to be satisfied by
any dataset, and because of this, the clusters extracted by any clustering algorithm are essentially random.
While this constraint can be intuitively demonstrated with a simple illustration and is simple to prove, it has
never appeared in the literature. We can justify calling our claim surprising, since it invalidates the
contribution of dozens of previously published papers. We will justify our claim with a theorem, illustrative
examples, and a comprehensive set of experiments on reimplementations of previous work. Although the
primary contribution of our work is to draw attention to the fact that an apparent solution to an important
problem is incorrect and should no longer be used, we also introduce a novel method which, based on the
concept of time series motifs, is able to meaningfully cluster subsequences on some time series datasets
Abstract—Fallback authentication, the process of recovering
access to an account if the primary authenticator is forgotten
or lost, is of significant importance in real-world applications.
A variety of mechanisms are deployed, ranging from secondary
channels (such as email and SMS), over personal knowledge
questions (such as the “mother’s maiden name”) to social authentication (such as vouching-based approaches). One central
difference with primary authentication is that the elapsed time
between enrollment and authentication can be much longer,
typically in the range of years. However, few of the mechanisms
used today have been studied over such long time-spans, making
claims about their usability difficult to generalize to real-world
applications. Additionally, most past studies have considered one
or two mechanisms only, and deriving a meaningful comparison
of a relevant number of mechanisms from the individual datapoints is not easy. In this work in progress paper, we report on the
design of a usability study that we will use to study the usability
of authentication mechanisms over a more realistic time-frame of
up to 18 months, and will provide a fair comparison of the four
most widely used fallback authentication schemes. We present
results of a pre-study with 74 participants that ran over 4 weeks
and indicates that schemes based on email and SMS are more
usable. Mechanisms based on designated trustees and personal
knowledge questions, on the other hand, fall short, both in terms
of convenience and efficiency.