Automatically Inferring Malware Signatures for Anti-Virus Assisted Attacks

Although anti-virus software has significantly evolved over
the last decade, classic signature matching based on byte
patterns is still a prevalent concept for identifying security
threats. Anti-virus signatures are a simple and fast detection
mechanism that can complement more sophisticated analysis
strategies. However, if signatures are not designed with care,
they can turn from a defensive mechanism into an instrument
of attack. In this paper, we present a novel method for
automatically deriving signatures from anti-virus software
and discuss how the extracted signatures can be used to
attack sensitive data with the aid of the virus scanner itself.
To this end, we study the practicability of our approach
using four commercial products and demonstrate, as examples,
anti-virus assisted attacks in three different scenarios.
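
To make the idea of byte-pattern signatures concrete, here is a minimal Python sketch. It is not the paper's inference method; the signature names and patterns are made up for illustration, and real scanners use far richer signature formats. It only shows why a known signature can be turned against benign data: anything containing the pattern gets flagged.

```python
def scan(data: bytes, signatures: dict[str, bytes]) -> list[str]:
    """Return the names of all signatures whose byte pattern occurs in data."""
    return [name for name, pattern in signatures.items() if pattern in data]


# Hypothetical signature database (name -> byte pattern); not taken from any product.
SIGNATURES = {
    "Demo.Test.A": b"\xde\xad\xbe\xef",
    "Demo.Test.B": b"EVIL-MARKER",
}

# A harmless log line stays clean ...
benign = b"log entry: user logged in"
# ... until someone who knows the pattern gets it written into the file.
flagged = b"log entry: EVIL-MARKER appended by a remote user"

print(scan(benign, SIGNATURES))
print(scan(flagged, SIGNATURES))
```

If the scanner quarantines or deletes every file that matches, whoever can inject the pattern controls which files disappear.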

Source: https://www.sec.cs.tu-bs.de/pubs/2017-asiaccs.pdf

Blocking-resistant communication through domain fronting

Abstract: We describe “domain fronting,” a versatile
censorship circumvention technique that hides the remote
endpoint of a communication. Domain fronting
works at the application layer, using HTTPS, to communicate
with a forbidden host while appearing to communicate
with some other host, permitted by the censor.
The key idea is the use of different domain names at
different layers of communication. One domain appears
on the “outside” of an HTTPS request—in the DNS request
and TLS Server Name Indication—while another
domain appears on the “inside”—in the HTTP Host
header, invisible to the censor under HTTPS encryption.
A censor, unable to distinguish fronted and nonfronted
traffic to a domain, must choose between allowing
circumvention traffic and blocking the domain entirely,
which results in expensive collateral damage. Domain
fronting is easy to deploy and use and does not require
special cooperation by network intermediaries. We
identify a number of hard-to-block web services, such as
content delivery networks, that support domain-fronted
connections and are useful for censorship circumvention.
Domain fronting, in various forms, is now a circumvention
workhorse. We describe several months of deployment
experience in the Tor, Lantern, and Psiphon circumvention
systems, whose domain-fronting transports
now connect thousands of users daily and transfer many
terabytes per month.
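
The split between the "outside" and "inside" domain names can be sketched in a few lines of Python. This is an illustration only: the helper name, the dataclass, and the example domains are assumptions, and no network connection is made. It just shows which name ends up in which layer.

```python
from dataclasses import dataclass


@dataclass
class FrontedRequest:
    dns_name: str        # looked up in DNS, visible to the censor
    sni: str             # sent in the TLS Server Name Indication, visible to the censor
    http_request: bytes  # sent inside the TLS tunnel, hidden from the censor


def build_fronted_request(front_domain: str, hidden_domain: str,
                          path: str = "/") -> FrontedRequest:
    """Assemble the pieces of a domain-fronted HTTPS request (hypothetical helper)."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hidden_domain}\r\n"   # the real destination, only seen after decryption
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return FrontedRequest(dns_name=front_domain, sni=front_domain, http_request=request)


req = build_fronted_request("allowed.example", "forbidden.example")
print(req.sni)
print(req.http_request.decode("ascii"))
```

The censor observes only `dns_name` and `sni`; the service behind the front domain decrypts the request and routes it according to the Host header.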

Source: https://www.bamsoftware.com/papers/fronting.pdf

CAMP: Content-Agnostic Malware Protection

In spite of recent advances, the world wide web remains
an important vector for malware installation. Approaches
to evaluating potentially malicious code before execution in a
browser, such as blacklisting or content-based detection, are
hindered by an attacker’s ability to easily change hosting domains
or mutate malware binaries. On the other hand, whitelist-based
approaches are challenged by the large, dynamic, and
heterogeneous space of benign binaries that they must track. In
practice these approaches continue to provide value for popular
binaries at either extreme of maliciousness (e.g., the current large
outbreak of malware, the benign binaries shipped with an OS),
but bridging the gap between whitelist and blacklist detection
for web malware remains a significant challenge.
This paper presents CAMP, a content-agnostic malware protection
system based on binary reputation that is designed to
address these shortcomings. CAMP is built into the browser and
determines the reputation of most downloads locally, relying on
server-side reputation data only when a local decision cannot be
made. This paper gives a detailed overview of CAMP and its
architecture and provides an evaluation of the system through
a six-month deployment in which 200 million users of Google
Chrome issued between eight and ten million reputation requests
a day. Our evaluation shows that CAMP exhibits accuracy
close to 99% relative to proprietary VM-based dynamic analysis,
is able to process requests in less than 130 ms on average, and
was able to detect approximately five million intentional malware
downloads per month that were not detected by existing solutions.
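
The "decide locally, ask the server only when needed" flow can be sketched as follows. This is a rough Python illustration, not CAMP's actual logic: the function names, the host lists, and the server callback are all hypothetical, and the abstract does not say which features the real system uses.

```python
from typing import Callable, Optional


def local_verdict(host: str, trusted: set[str], blocked: set[str]) -> Optional[str]:
    """Return a verdict if local lists are enough, otherwise None."""
    if host in blocked:
        return "malicious"
    if host in trusted:
        return "benign"
    return None  # no local decision possible


def classify_download(host: str, trusted: set[str], blocked: set[str],
                      ask_server: Callable[[str], str]) -> str:
    """Use the local verdict when available; fall back to the reputation server."""
    verdict = local_verdict(host, trusted, blocked)
    return verdict if verdict is not None else ask_server(host)


server_calls = []


def stub_server(host: str) -> str:
    # Stand-in for a server-side reputation lookup (assumption for this sketch).
    server_calls.append(host)
    return "unknown"


trusted, blocked = {"downloads.example"}, {"bad.example"}
print(classify_download("downloads.example", trusted, blocked, stub_server))
print(classify_download("new-site.example", trusted, blocked, stub_server))
print(server_calls)
```

Only the second download reaches the stub server, which mirrors the design goal of keeping most decisions on the client.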

Source: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41137.pdf

Learning detector of malicious network traffic from weak labels

Abstract. We address the problem of learning a detector of malicious
behavior in network traffic. The malicious behavior is detected based on
the analysis of network proxy logs that capture malware communication
between client and server computers. The conceptual problem in using
the standard supervised learning methods is the lack of a sufficiently representative
training set containing examples of malicious and legitimate
communication. Annotation of individual proxy logs is an expensive process
involving security experts and does not scale with constantly evolving
malware. However, weak supervision can be achieved on the level of
properly defined bags of proxy logs by leveraging internet domain black
lists, security reports, and sandboxing analysis. We demonstrate that
an accurate detector can be obtained from the collected security intelligence
data by using a Multiple Instance Learning algorithm tailored
to the Neyman-Pearson problem. We provide a thorough experimental
evaluation on a large corpus of network communications collected from
various company network environments.
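
The Neyman-Pearson criterion asks for the highest detection rate subject to a cap on the false positive rate. The Python sketch below illustrates only that criterion on plain detector scores; it is not the paper's Multiple Instance Learning algorithm, and the scores and the helper names are made up for the example.

```python
def np_threshold(negative_scores: list[float], max_fpr: float) -> float:
    """Smallest threshold t such that at most max_fpr of negatives score above t."""
    ordered = sorted(negative_scores, reverse=True)
    allowed = int(max_fpr * len(ordered))  # negatives we may flag by mistake
    # Samples are flagged when score > threshold, so the (allowed+1)-th largest
    # negative score is the lowest threshold that respects the cap.
    return ordered[allowed] if allowed < len(ordered) else float("-inf")


def rate_above(scores: list[float], threshold: float) -> float:
    """Fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# Hypothetical detector scores for legitimate and malicious traffic.
legitimate = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
malicious = [0.3, 0.65, 0.9]

t = np_threshold(legitimate, max_fpr=0.25)
print(t, rate_above(legitimate, t), rate_above(malicious, t))
```

Lowering the threshold any further would flag a third legitimate sample and exceed the 25% cap used here.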

Source: http://cmp.felk.cvut.cz/ftp/articles/franc/Franc-Malware-ECML2015.pdf

Deep Learning for Classification of Malware System Call Sequences

Abstract. The increase in number and variety of malware samples amplifies
the need for improvement in automatic detection and classification
of the malware variants. Machine learning is a natural choice to cope
with this increase, because it addresses the need of discovering underlying
patterns in large-scale datasets. Nowadays, neural network methodology
has matured to the point where it can surpass the limitations of previous
machine learning methods, such as Hidden Markov Models and Support
Vector Machines. As a consequence, neural networks can now offer superior
classification accuracy in many domains, such as computer vision or
natural language processing. This improvement comes from the possibility
of constructing neural networks with a higher number of potentially
diverse layers and is known as Deep Learning.
In this paper, we attempt to transfer these performance improvements
to model the malware system call sequences for the purpose of malware
classification. We construct a neural network based on convolutional and
recurrent network layers in order to obtain the best features for classification.
This way we get a hierarchical feature extraction architecture
that combines convolution of n-grams with full sequential modeling. Our
evaluation results demonstrate that our approach outperforms previously
used methods in malware classification, achieving an average
precision of 85.6% and an average recall of 89.4% using this combined neural
network architecture.
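
For readers unfamiliar with n-grams over system calls, the following Python sketch shows the sliding-window view that a convolution of width n operates on. It does not reproduce the paper's network; the trace and the function name are invented for illustration.

```python
from collections import Counter


def ngrams(calls: list[str], n: int) -> list[tuple[str, ...]]:
    """Return all contiguous windows of n system calls, in order."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


# Hypothetical system call trace of one sample.
trace = ["open", "read", "write", "open", "read", "close"]

bigrams = ngrams(trace, 2)
counts = Counter(bigrams)
print(bigrams)
print(counts.most_common(1))
```

Counting n-grams discards their position in the trace; a recurrent layer stacked on top of the windows is one way to retain the full sequence order, which is the combination the abstract describes.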

Source: https://www.sec.in.tum.de/assets/Uploads/deeplearning.pdf

A Comprehensive Measurement Study of Domain Generating Malware

Recent years have seen extensive adoption of domain generation algorithms (DGA) by modern botnets. The main goal is to generate a large number of domain names and then use a small subset for actual C&C communication. This makes DGAs very compelling for botmasters to harden the infrastructure of their botnets and make it resilient to blacklisting and attacks such as takedown efforts. While early DGAs were used as a backup communication mechanism, several new botnets use them as their primary communication method, making it extremely important to study DGAs in detail.

In this paper, we perform a comprehensive measurement study of the DGA landscape by analyzing 43 DGA-based malware families and variants. We also present a taxonomy for DGAs and use it to characterize and compare the properties of the studied families. By reimplementing the algorithms, we pre-compute all possible domains they generate, covering the majority of known and active DGAs. Then, we study the registration status of over 18 million DGA domains and show that corresponding malware families and related campaigns can be reliably identified by pre-computing future DGA domains. We also give insights into botmasters’ strategies regarding domain registration and identify several pitfalls in previous takedown efforts of DGA-based botnets. We will share the dataset for future research and will also provide a web service to check domains for potential DGA identity.
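
Pre-computation works because a DGA is deterministic: given the same seed and date, anyone can regenerate the same domains. The toy generator below is a Python sketch invented for this note. It does not correspond to any of the studied malware families, and the seed, hash construction, and `.example` suffix are assumptions chosen for illustration.

```python
import hashlib
from datetime import date, timedelta


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".example") -> list[str]:
    """Derive `count` pseudo-random domain names from a seed and a date (toy scheme)."""
    domains = []
    for i in range(count):
        digest = hashlib.sha256(f"{seed}-{day.isoformat()}-{i}".encode()).hexdigest()
        # Map the first 12 hex digits to the letters a..p to form the label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


def precompute(seed: str, start: date, days: int) -> dict[date, list[str]]:
    """Generate the domain lists for a range of future days, as a defender could."""
    return {start + timedelta(days=d): toy_dga(seed, start + timedelta(days=d))
            for d in range(days)}


schedule = precompute("demo-seed", date(2016, 8, 10), days=3)
for day, names in schedule.items():
    print(day, names[:2])
```

A lookup table built this way lets a defender match observed DNS queries against the expected domains for a given day.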

Source: https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_plohmann.pdf

UNVEIL: A Large-Scale, Automated Approach to Detecting Ransomware

Although the concept of ransomware is not new (i.e.,
such attacks date back at least as far as the 1980s), this
type of malware has recently experienced a resurgence
in popularity. In fact, in the last few years, a number
of high-profile ransomware attacks were reported, such
as the large-scale attack against Sony that prompted the
company to delay the release of the film “The Interview.”
Ransomware typically operates by locking the desktop
of the victim to render the system inaccessible to the
user, or by encrypting, overwriting, or deleting the user’s
files. However, while many generic malware detection
systems have been proposed, none of these systems have
attempted to specifically address the ransomware detection
problem.
In this paper, we present a novel dynamic analysis system
called UNVEIL that is specifically designed to detect
ransomware. The key insight of the analysis is that
in order to mount a successful attack, ransomware must
tamper with a user’s files or desktop. UNVEIL automatically
generates an artificial user environment, and detects
when ransomware interacts with user data. In parallel,
the approach tracks changes to the system’s desktop
that indicate ransomware-like behavior. Our evaluation
shows that UNVEIL significantly improves the state
of the art, and is able to identify previously unknown
evasive ransomware that was not detected by the antimalware
industry.

Source: http://www.ccs.neu.edu/home/mkharraz/publications/unveil-USENIX.pdf