Pregel: A System for Large-Scale Graph Processing

Many practical computing problems concern large graphs.
Standard examples include the Web graph and various social
networks. The scale of these graphs—in some cases billions
of vertices, trillions of edges—poses challenges to their
efficient processing. In this paper we present a computational
model suitable for this task. Programs are expressed
as a sequence of iterations, in each of which a vertex can
receive messages sent in the previous iteration, send messages
to other vertices, and modify its own state and that of
its outgoing edges or mutate graph topology. This vertexcentric
approach is flexible enough to express a broad set of
algorithms. The model has been designed for efficient, scalable
and fault-tolerant implementation on clusters of thousands
of commodity computers, and its implied synchronicity
makes reasoning about programs easier. Distribution related
details are hidden behind an abstract API. The result
is a framework for processing large graphs that is expressive
and easy to program.



Graph Cube: On Warehousing and OLAP Multidimensional Networks

We consider extending decision support facilities toward large
sophisticated networks, upon which multidimensional attributes
are associated with network entities, thereby forming
the so-called multidimensional networks. Data warehouses
and OLAP (Online Analytical Processing) technology
have proven to be effective tools for decision support on
relational data. However, they are not well-equipped to handle
the new yet important multidimensional networks. In
this paper, we introduce Graph Cube, a new data warehousing
model that supports OLAP queries effectively on large
multidimensional networks. By taking account of both attribute
aggregation and structure summarization of the networks,
Graph Cube goes beyond the traditional data cube
model involved solely with numeric value based group-by’s,
thus resulting in a more insightful and structure-enriched
aggregate network within every possible multidimensional
space. Besides traditional cuboid queries, a new class of
OLAP queries, crossboid, is introduced that is uniquely useful
in multidimensional networks and has not been studied
before. We implement Graph Cube by combining special
characteristics of multidimensional networks with the existing
well-studied data cube techniques. We perform extensive
experimental studies on a series of real world data sets and
Graph Cube is shown to be a powerful and efficient tool for
decision support on large multidimensional networks.