Today’s graphs used in domains such as machine learning or
social network analysis may contain hundreds of billions of
edges. Yet, they are not necessarily stored efficiently, and standard
graph representations such as adjacency lists waste a
significant number of bits while graph compression schemes
such as WebGraph often require time-consuming decompression.
To address this, we propose Log(Graph): a graph representation
that combines high compression ratios with very
low-overhead decompression to enable cheaper and faster
graph processing. The key idea is to encode a graph so that
the parts of the representation approach or match the respective
storage lower bounds. We call our approach “graph
logarithmization” because these bounds are usually logarithmic.
Our high-performance Log(Graph) implementation
based on modern bitwise operations and state-of-the-art succinct
data structures achieves high compression ratios as well
as performance. For example, compared to the tuned Graph
Algorithm Processing Benchmark Suite (GAPBS), it reduces
graph sizes by 20-35% while matching GAPBS’ performance
or even delivering speedups due to reducing amounts of
transferred data. It approaches the compression ratio of the
established WebGraph compression library while enabling
speedups of more than 2×. Log(Graph) can improve
the design of various graph processing engines and libraries on
single NUMA nodes as well as on distributed-memory systems.
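To make the "logarithmization" idea concrete, here is a rough sketch (our own illustration, not the paper's implementation, which relies on succinct data structures and tuned bitwise operations): in a graph with n vertices, each neighbor ID needs only ceil(log2 n) bits, so an adjacency array can be bit-packed far below one 64-bit word per edge. All function names below are ours.

```python
import math

def pack_ids(neighbors, n):
    """Pack vertex IDs into ceil(log2 n) bits each (n = vertex count)."""
    width = max(1, math.ceil(math.log2(n)))
    bits = 0
    for i, v in enumerate(neighbors):
        bits |= v << (i * width)
    nbytes = (len(neighbors) * width + 7) // 8
    return bits.to_bytes(nbytes, "little"), width

def unpack_ids(data, width, count):
    """Recover the ID list from the packed byte string."""
    bits = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(bits >> (i * width)) & mask for i in range(count)]

n = 1000                       # 1000 vertices -> 10 bits per ID
adj = [3, 17, 999, 42]
packed, width = pack_ids(adj, n)
assert unpack_ids(packed, width, len(adj)) == adj
# 4 IDs * 10 bits = 40 bits = 5 bytes, vs. 32 bytes with 64-bit words
```

Decoding a packed ID is a shift and a mask, which is the kind of low-overhead access the abstract contrasts with WebGraph-style decompression.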
We show that the performance of existing
fault localization algorithms differs markedly across
networks, and that no algorithm simultaneously provides
high localization accuracy and low computational overhead.
We develop a framework to explain these behaviors
by anatomizing the algorithms with respect to six
important characteristics of real networks, such as uncertain
dependencies, noise, and covering relationships. We
use this analysis to develop Gestalt, a new algorithm that
combines the best elements of existing ones and includes
a new technique to explore the space of fault hypotheses.
We run experiments on three real, diverse networks. For
each, Gestalt has either significantly higher localization
accuracy or an order of magnitude lower running time.
For example, when applied to the Lync messaging system
that is used widely within corporations, Gestalt localizes
faults with the same accuracy as Sherlock, while
reducing fault localization time from days to 23 seconds.
In this work we present Picasso: a lightweight device class fingerprinting protocol that allows a server to verify the software and hardware stack of a mobile or desktop client. As an example, Picasso can distinguish traffic sent by an authentic iPhone running Safari on iOS from an emulator or desktop client spoofing the same configuration. Our fingerprinting scheme builds on unpredictable yet stable noise introduced by a client’s browser, operating system, and graphical stack when rendering HTML5 canvases. Our algorithm is resistant to replay and includes a hardware-bound proof of work that forces a client to expend a configurable amount of CPU and memory to solve challenges. We demonstrate that Picasso can distinguish 52 million Android, iOS, Windows, and OS X clients running a diversity of browsers with 100% accuracy. We discuss applications of Picasso in abuse fighting, including protecting the Play Store or other mobile app marketplaces from inorganic interactions; or identifying login attempts to user accounts from previously unseen device classes.
At IEEE Security & Privacy 2015, Bos, Costello, Naehrig, and Stebila proposed an instantiation of Peikert’s ring-learning-with-errors–based (Ring-LWE) key-exchange protocol (PQCrypto 2014), together with an implementation integrated into OpenSSL, with the affirmed goal of providing post-quantum security for TLS. In this work we revisit their instantiation and stand-alone implementation. Specifically, we propose new parameters and a better suited error distribution, analyze the scheme’s hardness against attacks by quantum computers in a conservative way, introduce a new and more efficient error-reconciliation mechanism, and propose a defense against backdoors and all-for-the-price-of-one attacks. By these measures and for the same lattice dimension, we more than double the security parameter, halve the communication overhead, and speed up computation by more than a factor of 8 in a portable C implementation and by more than a factor of 27 in an optimized implementation targeting current Intel CPUs. These speedups are achieved with comprehensive protection against timing attacks.
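To illustrate the protocol being instantiated, here is a toy Ring-LWE key agreement in the ring Z_q[x]/(x^n + 1). This is our own sketch: the parameters (n = 8, ternary noise) are far too small to be secure, and it omits the error-reconciliation step that the paper improves; it only shows why the two parties end up with approximately equal values.

```python
import random

# Toy Ring-LWE key agreement in Z_q[x]/(x^n + 1).
# Parameters are illustrative only; real schemes use n >= 512
# and carefully chosen error distributions.
N, Q = 8, 12289

def poly_mul(a, b):
    """Multiply in Z_q[x]/(x^n + 1) (negacyclic convolution)."""
    res = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                res[k] = (res[k] + a[i] * b[j]) % Q
            else:
                res[k - N] = (res[k - N] - a[i] * b[j]) % Q
    return res

def small():
    """Small 'error' polynomial with coefficients in {-1, 0, 1} mod Q."""
    return [random.randint(-1, 1) % Q for _ in range(N)]

random.seed(1)
a = [random.randrange(Q) for _ in range(N)]    # public parameter

s, e = small(), small()                        # Alice's secret and noise
b = [(x + y) % Q for x, y in zip(poly_mul(a, s), e)]

s2, e2 = small(), small()                      # Bob's secret and noise
u = [(x + y) % Q for x, y in zip(poly_mul(a, s2), e2)]

v_alice = poly_mul(u, s)     # approximately a*s*s2
v_bob = poly_mul(b, s2)      # approximately a*s*s2

# The two views differ only by the small term e*s2 - e2*s;
# reconciliation rounds both sides to identical shared bits.
diff = []
for x, y in zip(v_alice, v_bob):
    d = (x - y) % Q
    diff.append(min(d, Q - d))
assert max(diff) < Q // 4
```

The gap between the two views is what the paper's new reconciliation mechanism must absorb; making that mechanism more efficient is one of the contributions listed above.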
Machine learning (ML) models may be deemed confidential
due to their sensitive training data, commercial
value, or use in security applications. Increasingly often,
confidential ML models are being deployed with publicly
accessible query interfaces. ML-as-a-service (“predictive
analytics”) systems are an example: Some allow
users to train models on potentially sensitive data and
charge others for access on a pay-per-query basis.
The tension between model confidentiality and public
access motivates our investigation of model extraction
attacks. In such attacks, an adversary with black-box access,
but no prior knowledge of an ML model’s parameters
or training data, aims to duplicate the functionality
of (i.e., “steal”) the model. Unlike in classical learning
theory settings, ML-as-a-service offerings may accept
partial feature vectors as inputs and include confidence
values with predictions. Given these practices, we show
simple, efficient attacks that extract target ML models
with near-perfect fidelity for popular model classes including
logistic regression, neural networks, and decision
trees. We demonstrate these attacks against the online
services of BigML and Amazon Machine Learning.
We further show that the natural countermeasure of omitting
confidence values from model outputs still admits
potentially harmful model extraction attacks. Our results
highlight the need for careful ML model deployment and new model extraction countermeasures.
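For logistic regression specifically, returning a confidence value reveals the logit w·x + b exactly, so extraction reduces to solving a linear system from d + 1 queries. The sketch below illustrates this equation-solving idea against a synthetic model; the model, weights, and query function are ours, not any real service's API.

```python
import numpy as np

# Equation-solving extraction sketch for logistic regression:
# each returned confidence p reveals the logit w.x + b via
# log(p / (1 - p)), so d + 1 queries give a solvable linear system.
rng = np.random.default_rng(0)
d = 5
w_true, b_true = rng.normal(size=d), 0.7       # hidden target model

def query(x):
    """Black-box API returning a confidence value (sigmoid of logit)."""
    return 1.0 / (1.0 + np.exp(-(w_true @ x + b_true)))

X = rng.normal(size=(d + 1, d))                # d + 1 probe inputs
logits = np.array([np.log(p / (1 - p)) for p in (query(x) for x in X)])
A = np.hstack([X, np.ones((d + 1, 1))])        # unknowns: w (d) and b (1)
params = np.linalg.solve(A, logits)
w_hat, b_hat = params[:-1], params[-1]
assert np.allclose(w_hat, w_true) and np.isclose(b_hat, b_true)
```

This is why omitting confidence values is the natural countermeasure discussed above, and why the abstract's point that even then attacks remain possible is significant.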
We introduce a new and increasingly relevant setting for distributed optimization in machine learning,
where the data defining the optimization are distributed (unevenly) over an extremely large
number of nodes, but the goal remains to train a high-quality centralized model. We refer to this setting
as Federated Optimization. In this setting, communication efficiency is of utmost importance.
A motivating example for federated optimization arises when we keep the training data locally on
users’ mobile devices rather than logging it to a data center for training. Instead, the mobile devices
are used as nodes performing computation on their local data in order to update a global model. We
suppose that we have an extremely large number of devices in our network, each of which has only
a tiny fraction of the total data available; in particular, we expect the number of data points available
locally to be much smaller than the number of devices. Additionally, since different users generate
data with different patterns, we assume that no device has a representative sample of the overall
distribution. We show that existing algorithms are not suitable for this setting, and propose a new algorithm which
shows encouraging experimental results. This work also sets a path for future research needed in the
context of federated optimization.
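A minimal simulation of the setting described above (our own sketch, not the paper's algorithm): many clients hold tiny, uneven, non-identically distributed shards, each runs a few local gradient steps on a shared linear model, and a server averages the resulting models weighted by shard size.

```python
import numpy as np

# Federated-optimization toy: uneven non-IID shards, local SGD,
# size-weighted server averaging. All parameters are illustrative.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])

clients = []
for _ in range(50):                        # many clients, tiny shards
    n = int(rng.integers(2, 6))            # uneven data per client
    loc = rng.normal(0, 0.5)               # client-specific feature shift
    X = rng.normal(loc, 1.0, size=(n, 2))
    y = X @ w_true + 0.01 * rng.normal(size=n)
    clients.append((X, y))

w = np.zeros(2)                            # shared global model
for _ in range(40):                        # communication rounds
    updates, sizes = [], []
    for X, y in clients:
        w_local = w.copy()
        for _ in range(5):                 # local gradient steps
            grad = 2 * X.T @ (X @ w_local - y) / len(y)
            w_local -= 0.02 * grad
        updates.append(w_local)
        sizes.append(len(y))
    # server: average models weighted by local data size
    w = np.average(updates, axis=0, weights=np.array(sizes, float))

assert np.linalg.norm(w - w_true) < 0.1
```

Each round costs one model upload per client, which is why the abstract stresses that communication efficiency, rather than local computation, dominates in this setting.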