Anomaly detection is a critical step towards building a secure and
trustworthy system. e primary purpose of a system log is to
record system states and signicant events at various critical points
to help debug system failures and perform root cause analysis. Such
log data is universally available in nearly all computer systems.
Log data is an important and valuable resource for understanding
system status and performance issues; therefore, the various system
logs are naturally excellent source of information for online
monitoring and anomaly detection. We propose DeepLog, a deep
neural network model utilizing Long Short-Term Memory (LSTM),
to model a system log as a natural language sequence. is allows
DeepLog to automatically learn log paerns from normal execution,
and detect anomalies when log paerns deviate from the model
trained from log data under normal execution. In addition, we
demonstrate how to incrementally update the DeepLog model in
an online fashion so that it can adapt to new log paerns over time.
Furthermore, DeepLog constructs workows from the underlying
system log so that once an anomaly is detected, users can diagnose
the detected anomaly and perform root cause analysis eectively.
Extensive experimental evaluations over large log data have shown
that DeepLog has outperformed other existing log-based anomaly
detection methods based on traditional data mining methodologies
The large-scale monitoring of computer users’ software
activities has become commonplace, e.g., for application
telemetry, error reporting, or demographic profiling. This
paper describes a principled systems architecture—Encode,
Shuffle, Analyze (ESA)—for performing such monitoring
with high utility while also protecting user privacy. The ESA
design, and its PROCHLO implementation, are informed by
our practical experiences with an existing, large deployment
of privacy-preserving software monitoring.
With ESA, the privacy of monitored users’ data is guaranteed
by its processing in a three-step pipeline. First, the data
is encoded to control scope, granularity, and randomness.
Second, the encoded data is collected in batches subject to
a randomized threshold, and blindly shuffled, to break linkability
and to ensure that individual data items get “lost in the
crowd” of the batch. Third, the anonymous, shuffled data is
analyzed by a specific analysis engine that further prevents
statistical inference attacks on analysis results.
ESA extends existing best-practice methods for sensitivedata
analytics, by using cryptography and statistical techniques
to make explicit how data is elided and reduced in
precision, how only common-enough, anonymous data is analyzed,
and how this is done for only specific, permitted purposes.
As a result, ESA remains compatible with the established
workflows of traditional database analysis.
Strong privacy guarantees, including differential privacy,
can be established at each processing step to defend
against malice or compromise at one or more of those steps.
PROCHLO develops new techniques to harden those steps,
including the Stash Shuffle, a novel scalable and efficient
oblivious-shuffling algorithm based on Intel’s SGX, and new
applications of cryptographic secret sharing and blinding.
We describe ESA and PROCHLO, as well as experiments
that validate their ability to balance utility and privacy.
We introduce TensorFlow Agents, an efficient infrastructure paradigm for
building parallel reinforcement learning algorithms in TensorFlow. We simu-
late multiple environments in parallel, and group them to perform the neural
network computation on a batch rather than individual observations. This
allows the TensorFlow execution engine to parallelize computation, without
the need for manual synchronization. Environments are stepped in separate
Python processes to progress them in parallel without interference of the global
interpreter lock. As part of this project, we introduce BatchPPO, an efficient
implementation of the proximal policy optimization algorithm. By open sourc-
ing TensorFlow Agents, we hope to provide a flexible starting point for future
projects that accelerates future research in the field.
Standard machine learning approaches require centralizing the training data on one machine or in a datacenter. And Google has built one of the most secure and robust cloud infrastructures for processing this data to make our services better. Now for models trained from user interaction with mobile devices, we’re introducing an additional approach: Federated Learning.
Federated Learning enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud. This goes beyond the use of local models that make predictions on mobile devices (like the Mobile Vision API and On-Device Smart Reply) by bringing model training to the device as well.
Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC–called a Tensor Processing Unit (TPU)–deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS), and a large (28MiB) software-managed on-chip memory. The TPU’s deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs (caches, out-of-order execution, multithreading, multiprocessing, prefetching, …) that help average throughput more than guaranteed latency. The lack of of such features helps explain why, despite having myriad MACs and a big memory, the TPU is relatively small and low power. We compare the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters. Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters’ NN inference demand. Despite low utilization for some applications, the TPU is on average about 15X – 30X faster than its contemporary GPU or CPU with TOPS/Watt about 30X – 80X higher. Moreover, using the GPUs GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.
Take a look at the image below. It’s a collection of bugs and creepy-crawlies of different shapes and sizes. Take a moment to categorize them by similarity into a number of groups.
This isn’t a trick question. Start with grouping the spiders together.
Done? While there’s not necessarily a “correct” answer here, it’s most likely you split the bugs into four clusters. The spiders in one cluster, the pair of snails in another, the butterflies and moth into one, and the trio of wasps and bees into one more.
That wasn’t too bad, was it? You could probably do the same with twice as many bugs, right? If you had a bit of time to spare — or a passion for entomology — you could probably even do the same with a hundred bugs.
For a machine though, grouping ten objects into however many meaningful clusters is no small task, thanks to a mind-bending branch of maths called combinatorics, which tells us that are 115,975 different possible ways you could have grouped those ten insects together. Had there been twenty bugs, there would have been over fifty trillion possible ways of clustering them.
Recent results indicate that the generic descriptors extracted
from the convolutional neural networks are very
powerful. This paper adds to the mounting evidence that
this is indeed the case. We report on a series of experiments
conducted for different recognition tasks using the
publicly available code and model of the OverFeat network
which was trained to perform object classification on
ILSVRC13. We use features extracted from the OverFeat
network as a generic image representation to tackle the diverse
range of recognition tasks of object image classification,
scene recognition, fine grained recognition, attribute
detection and image retrieval applied to a diverse set of
datasets. We selected these tasks and datasets as they gradually
move further away from the original task and data the
OverFeat network was trained to solve. Astonishingly,
we report consistent superior results compared to the highly
tuned state-of-the-art systems in all the visual classification
tasks on various datasets. For instance retrieval it consistently
outperforms low memory footprint methods except for
sculptures dataset. The results are achieved using a linear
SVM classifier (or L2 distance in case of retrieval) applied
to a feature representation of size 4096 extracted from a
layer in the net. The representations are further modified
using simple augmentation techniques e.g. jittering. The
results strongly suggest that features obtained from deep
learning with convolutional nets should be the primary candidate
in most visual recognition tasks