Gestalt: Fast, unified fault localization for networked systems

Abstract— We show that the performance of existing
fault localization algorithms differs markedly for different
networks; and no algorithm simultaneously provides
high localization accuracy and low computational overhead.
We develop a framework to explain these behaviors
by anatomizing the algorithms with respect to six
important characteristics of real networks, such as uncertain
dependencies, noise, and covering relationships. We
use this analysis to develop Gestalt, a new algorithm that
combines the best elements of existing ones and includes
a new technique to explore the space of fault hypotheses.
We run experiments on three real, diverse networks. For
each, Gestalt has either significantly higher localization
accuracy or an order of magnitude lower running time.
For example, when applied to the Lync messaging system
that is used widely within corporations, Gestalt localizes
faults with the same accuracy as Sherlock, while
reducing fault localization time from days to 23 seconds.

Source: https://www.usenix.org/system/files/conference/atc14/atc14-paper-mysore.pdf

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s