JUNG (Java Universal Network/Graph) Framework Manual

Table of Contents

  1. Introduction

    1. Why JUNG Exists

    2. What JUNG Is And Is Not

    3. Using This Manual

      1. Nomenclature
    4. How To Contact The JUNG Project

  2. Getting Started

  3. Graphs, Vertices, and Edges

    1. Basic Properties and Operations

    2. Types

    3. Creating and Adding

    4. Copying and Equivalency

    5. Removing Vertices and Edges

  4. User Data

    1. Class Extension

    2. JUNG Annotation

    3. Copying User Data

    4. Decorators, Indexers, and Labellers

  5. Input and Output

  6. Algorithms

    1. Graph/Matrix Operations

    2. Clustering

    3. Topology, Paths, and Flows

    4. Importance

    5. Optimization

    6. Statistics

    7. Permutations

  7. Filtering

  8. Visualization

  9. Utilities

  10. Sample Code


Introduction

Overview

JUNG is an open-source software library that provides a common and extendible language for the modeling, analysis, and visualization of data that can be represented as a graph or network. It is written in Java, which allows JUNG-based applications to make use of the extensive built-in capabilities of the Java API, as well as those of other existing third-party Java libraries.

The JUNG architecture is designed to support a variety of representations of entities and their relations, such as directed and undirected graphs, multi-modal graphs, graphs with parallel edges, and hypergraphs. It provides a mechanism for annotating graphs, entities, and relations with metadata. This facilitates the creation of analytic tools for complex data sets that can examine the relations between entities as well as the metadata attached to each entity and relation.

The current distribution of JUNG includes implementations of a number of algorithms from graph theory, data mining, and social network analysis, such as routines for clustering, decomposition, optimization, random graph generation, statistical analysis, and calculation of network distances, flows, and importance measures (centrality, PageRank, HITS, etc.).

JUNG also provides a visualization framework that makes it easy to construct tools for the interactive exploration of network data. Users can use one of the layout algorithms provided, or use the framework to create their own custom layouts. In addition, filtering mechanisms are provided which allow users to focus their attention, or their algorithms, on specific portions of the graph.

We hope that JUNG will make it easier for those who work with graphs, networks, and relational data to make use of one another's development efforts.

Why JUNG Exists

The JUNG Project was created because we perceived a need for a general, flexible, and powerful API for manipulating, analyzing, and visualizing graphs and networks in Java--and because none of the other tools/APIs that we found fit our specific requirements for the projects that we were working on at the time. However, there are other tools available for manipulating networks, which may be more appropriate for you, depending on your specific needs and abilities. The JUNG FAQ summarizes some of the basic distinctions between JUNG and other tools for network analysis and visualization.

What JUNG Is And Is Not

JUNG is a framework on which applications and tools for handling and visualizing graph and network data can be built. It can be used in simple snippets of code to test ideas, or in the context of a sophisticated tool with a graphic user interface.

JUNG is not itself a finished tool (nor is it intended to be one). You can build a tool that uses JUNG, but this will require some Java programming. The JUNG distribution does provide some samples of small applications that use JUNG to accomplish certain tasks, but they are intended to be examples of how one might use JUNG, not tools in their own right.

If you are not an experienced Java programmer, but would like to learn to use Java, we recommend that you visit the website for The Java Tutorial.

Using This Manual

This manual is intended for people who intend to write programs using JUNG. You will get the most out of this manual (and out of JUNG) if you have some knowledge of Java, programming, and basic graph theoretic concepts and algorithms (graphs/networks, nodes/vertices, arcs/edges). This manual will not tell you how to write programs in Java, or tell you the details of how the algorithms work.

You may find it useful to have the Javadoc documentation handy when reading this manual, as it will provide specific details for how to use the methods that this manual will refer to.

This document is sometimes updated less frequently than the release notes and the Javadoc documentation. If there is a conflict between this manual and the release notes or source code documentation, the release notes/Javadoc are generally considered to be definitive.

Nomenclature

The JUNG Framework is organized in a number of different packages, all of which (except where otherwise noted) share the prefix edu.uci.ics.jung. To simplify the presentation, this prefix will be omitted except where necessary for clarity; thus, for example, the package edu.uci.ics.jung.graph.impl will be referred to as graph.impl.

We have attempted to use a consistent terminology in this manual, but the Javadoc documentation is more heterogenous in this regard. Thus, the JUNG Project considers the following pairs of terms to be essentially synonymous (although the first of each is generally preferred):

  • graph and network
  • vertex and node
  • edge and arc

How To Contact The JUNG Project

The JUNG Project can be contacted via information posted on the JUNG website. Please feel free to contact us with bug reports, feature requests, and information on related projects. Suggestions for how the manual and other documentation could be made more clear would be especially appreciated.


Getting Started

In order to run a JUNG-based application, you will need all the packages listed on the JUNG download page. (As of this writing, this includes .jar files for JUNG, version 1.3 (or later) of a Java runtime environment (JRE), Apache Jakarta commons-collections, the CERN Colt scientific library, and Xerces.

Building JUNG-based applications will require all of the above, plus the Java SDK (version 1.3 or later).

For instructions on how to set up your computer to run (or develop) Java applications, (including downloading and installing the JRE or Java SDK), we recommend Sun's website: New-To-Java Programming Center: Getting Started, and in particular the article called "Getting Started" which is linked to from that page. You may also find some of the other articles in the New-to-Java Programming Center to be of use.

Graphs, Vertices, and Edges

Basic Properties and Operations

Graphs, vertices, and edges each have several properties that can be extracted from them, and operations that they can perform (or have performed upon them). The operations listed below are all guaranteed to be defined and to behave as specified for all JUNG graphs, vertices, and edges. Depending on the specific type of graph, vertex, or edge, and on the implementation used, a given graph, vertex, or edge object may have other available properties and/or operations. For specific details on the use and behavior of these operations, see the Javadoc documentation for the appropriate class(es).

Graphs: Vertices: Edges:

Types

The graph package contains specifications (in the form of Java interfaces), at various levels of abstraction, for graphs, vertices, and edges.

ArchetypeGraph, ArchetypeVertex, and ArchetypeEdge

The Archetype interfaces specify the behavior of generalized graphs, vertices, and edges; they are designed to encompass all types of graphs, including directed and undirected graphs, graphs with attached data (e.g., weighted edges), hypergraphs, and graphs with parallel edges. All graph, vertex, and edge implementations should implement the appropriate one of these interfaces (or an interface which inherits from these interfaces). The methods listed above are those available to objects which implement one of these interfaces.

Graph, Vertex, and Edge

These interfaces inherit from the Archetype interfaces, and specify the behavior for (binary) graphs in which each edge connects exactly two vertices; this limitation allows a number of new methods to be defined.

DirectedGraph and DirectedEdge

A DirectedEdge is a type of Edge which imposes an ordering on its incident vertices. DirectedGraph is a tagging interface for implementations of Graph whose edge set consists of implementations of DirectedEdge. Thus, for example, the author of a method which is designed to operate only on directed graphs should specify that the graph argument to the method could be DirectedGraph.

UndirectedGraph and UndirectedEdge

These interfaces are exactly analogous to the corresponding interfaces for directed graphs and edges (mentioned above), but for undirected graphs and edges (which do not impose an ordering on their incident vertices), instead.

The graph.impl package contains several implementations of these specifications in the graph package. Currently all the code in this package is designed to support binary graphs.

AbstractSparseGraph, AbstractSparseVertex, and AbstractSparseEdge

These classes are skeletal implementations of the Graph, Vertex, and Edge interfaces. Their intent is to minimize the effort required to implement these interfaces, while not committing to details such as whether the graph is directed or not.

These classes, and those that inherit from them, are designed for sparse graphs (ones in which the number of edges is only a few times as large as the number of vertices). They may not be the best implementations for representing and manipulating dense graphs (ones in which most vertices are connected to most other vertices).

Since these are abstract classes, users cannot create instances of them.

DirectedSparseGraph, DirectedSparseVertex, and DirectedSparseEdge

These classes extend the Abstract graph, vertex, and edge classes for directed graphs; the graph and edge classes implement the DirectedGraph and DirectedEdge interfaces, respectively.

UndirectedSparseGraph, UndirectedSparseVertex, and UndirectedSparseEdge

These classes extend the Abstract graph, vertex, and edge classes for undirected graphs; the graph and edge classes implement the UndirectedGraph and UndirectedEdge interfaces, respectively.

Creating and Adding

Creating a graph is straightforward: just call the constructor for the type of graph that you want. For example:

	Graph g = new DirectedSparseGraph();

creates a new directed sparse graph and assigns it to a variable of type Graph. (See the Javadoc documentation for details on the Graph type and the DirectedSparseGraph implementation.)

You can also create a graph by reading it in from a file (covered in the Input and Output section) or by generating a random graph (covered in the Algorithms section).

Once you have created a graph, you can create vertices and add them to this graph:

	Vertex v1 = (Vertex) g.addVertex(new DirectedSparseVertex());
	Vertex v2 = (Vertex) g.addVertex(new DirectedSparseVertex());

and once you have vertices, you can connect them with edges:

	DirectedEdge e = (DirectedEdge) g.addEdge(new DirectedSparseEdge(v1, v2));

Note that creating vertices/edges and adding them to a graph are actually two different operations, which we combine here into a single line of code. The two-stage nature of this process makes it possible to create "orphaned" vertices/edges that are not part of a graph. This was done as a compromise between common practices in Java APIs regarding the side effects of constructors, and the semantics of graphs. However, the behavior of the JUNG edge and vertex methods, with the exception of getGraph(), is unspecified on orphaned vertices/edges. The JUNG Project implementations will never create orphaned vertices/edges, and we strongly recommend that users follow this practice by nesting the call to the vertex/edge constructor inside the call to the graph method that adds its argument to the graph (as in the examples above).

Some constraints to keep in mind:

If any of these constraints are violated, the error will be caught at runtime, and a FatalExcecption will be thrown. Note that while most of these constraints are fail-fast (that is, they will be caught as soon as the error is encountered), some of them will not be so indicative, and will fail more subtly.

Copying and Equivalency

You can make a copy of a graph, or copy a vertex or edge from one graph (the original graph) to another graph (the target graph).

Copying a vertex or edge does three things:

Copying a graph does three things:

The following code creates a graph, creates two vertices and an edge and adds them to this graph, then copies each vertex and edge from the original graph to a new target graph.

	Graph original = new DirectedSparseGraph();
	Vertex v1_orig = original.addVertex(new DirectedSparseVertex());
	Vertex v2_orig = original.addVertex(new DirectedSparseVertex());
	DirectedEdge e_orig = original.addEdge(new DirectedSparseEdge(v1, v2));

	Graph target = new DirectedSparseGraph();
	Vertex v1_copy = v1.copy(target);
	Vertex v2_copy = v2.copy(target);
	DirectedEdge e_copy = e_orig.copy(target);

The vertices v1_copy and v2_copy are equivalent to the vertices v1_orig and v2_orig, respectively, and the edge e_copy is equivalent to the edge e_orig. That is, each of the following statements evaluates to true:

	v1_orig == v1_copy.getEquivalentVertex(original);
	v2_orig == v2_copy.getEquivalentVertex(original);
	v1_copy == v1_orig.getEquivalentVertex(target);
	v2_copy == v2_orig.getEquivalentVertex(target);
	e_orig == e_copy.getEquivalentEdge(original);
	e_copy == e_orig.getEquivalentEdge(target);

As a convenience, the Java equals method has been implemented to respect this equivalence relation, so the following statements also each evaluate to true:

	v1_orig.equals(v1_copy);
	v1_copy.equals(v1_orig);
	v2_orig.equals(v2_copy);
	v2_copy.equals(v2_orig);
	e_orig.equals(e_copy);
	e_copy.equals(e_orig);

There are some restrictions that govern when and where vertices and edges may be copied:

Removing Vertices and Edges

To remove a vertex or edge from a graph, call the appropriate removal method:

	g.removeEdge(e);
	g.removeVertex(v1);

Removing an edge from a graph will not affect any other part of the graph. Removing a vertex from a graph may cause the edges that are incident to that vertex to be removed if these edges would otherwise become ill-formed. (An ill-formed edge is one that is incident to the wrong number of vertices. In graphs where edges are defined to connect exactly two vertices, removing a vertex will result in the removal of all of its incident edges.)

Removing an element from a graph does not free the memory used by that object. (In fact, you can remove an element from a graph and then re-insert it in that graph or in a different graph). As with all Java programs, the Java garbage collector is responsible for freeing the memory for an object once it is no longer being used. Removing an element from a graph also does not remove it from any user data structures (discussed in the section entitled "User Data"); users are responsible for updating the user data as necessary.


User Data

Users can associate data with graphs, edges, or vertices in two ways: class extension and the built-in JUNG annotation mechanism.

Class Extension

Users can extend the classes provided so that they include the variables/properties (and methods for manipulating those fields) that the user desires. This mechanism is most appropriate for applications which are designed to operate on a specific data set, each of whose elements have known properties. For instance, a network representing a highway system might store, for each segment of highway between interchanges (i.e., edge), the length of that segment.

The ability to extend the JUNG classes is a feature of the Java language, and is not specific to JUNG. However, class extenders should note that the AbstractSparse classes use the Object.clone() method to copy Vertices, Edges, and Graphs; therefore, copies of objects will be "shallow" copies. (For more information on cloning objects, and on "deep" and "shallow" copies, see the Java Tutorial).

This sample code creates a class that extends DirectedSparseVertex and carries with it some of its own sample data. Note that the routine looks exactly like a standard Java extension of the class.

class PersonVertex extends DirectedSparseVertex {
    private String name;
    private List publications;     

    public PersonVertex( String name, List publications ) {
       this.name = name;
       this.publications = publications;
    }

    public List getPublications() { return publications; }
}

There may be circumstances under which it is better to store information through the annotation mechansims discussed below, or the class extension system. In particular, the annotation mechanisms are substantially more flexible than class extension.

JUNG Annotation

The JUNG utils package provides a built-in mechanism, the UserData class, for annotating graph elements with data. This mechanism is most appropriate for handling data which is either temporary or idiosyncratic (i.e., data which not every graph element of that type will have or need).

Each of the JUNG graph, vertex, and edge implementations extends UserData, which provides the following operations:

(The purpose and semantics of copy actions are discussed in the section below entitled Copying User Data.)

Here is a simple example of how this user data may be stored, accessed, modified, and removed:

	Vertex v = (Vertex) g.addVertex(new DirectedSparseVertex());
	Vertex w = (Vertex) g.addVertex(new DirectedSparseVertex());
	String name_key = "name";
	String current_address_key = "address";
	String current_student_key = "student";
	v.addUserDatum(name_key, "Carl Jung", UserData.SHARED);
	w.addUserDatum(name_key, "Sigmund Freud", UserData.SHARED);
	v.addUserDatum(current_address_key, "Vienna, Austria", UserData.SHARED);
	v.addUserDatum(current_student_key, w, UserData.REMOVE);  // Freud is a student of Jung!
	...
	String v_name = v.getUserDatum(namekey);
	v.setUserDatum(current_address_key, "Basel, Switzerland", UserData.SHARED);
	v.removeUserDatum(current_student_key);

This example shows that userdata can contain any java object, including other vertices.

Copying User Data

When a graph element a is copied (with the copy method), the newly created element b calls importUserData(a), which attempts to copy each of the objects in a's user data repository to b's user data repository. The behavior of each such copy attempt will depend on the copy action that was specified when the corresponding user data element was created.

The interface UserDataContainer contains an interface called CopyAction, which consists of a single method signature, onCopy(value, source, target). importUserData(a) retrieves the copy action (which is an implementation of CopyAction) for each element in a's user data repository. This copy action then calls onCopy(datum, a, b), and based on the result, decides what to do with the specified datum.

JUNG provides three different implementations of CopyAction: UserData.CLONE, UserData.REMOVE, and UserData.SHARED.

UserData.CLONE's version of onCopy() returns a copy of the user datum, as defined by the Java clone() method; importUserData then places this copy in the target graph element's user data repository. This clone is completely independent of the original. (If the user datum does not support the clone() method, onCopy will throw the Java CloneNotSupportedException.)

UserData.SHARED's version of onCopy() returns a reference to the original user datum; importUserData then places this reference in the target graph element's user data repository. Thus, any changes to this user datum that are made by one of the graph elements that share this user datum will be reflected in all such graph elements.

UserData.REMOVE's version of onCopy() returns null; that is, user data that is created with this copy action will not be copied by the copy() method.

Decorators, Indexers, and Labellers

JUNG includes a few convenience classes that provide examples of structured uses of the user data repositories; these may be found in the package graph.decorators. We will briefly discuss two of these classes here; for more details and to see what other examples are available, see the Javadoc documentation.

Indexer

An Indexer contains methods that create a mapping between the vertices of a graph and the integers {0, 1, ... n-1} (where n is the number of vertices in the graph). It provides mechanisms to get the index of a given vertex (getIndex(v)) and to get the vertex with a specified index (getVertex(i)). Among other things, Indexer thus makes it convenient to arrange a set of vertices in an array, using each vertex's index as an index into the array.

Note: if a graph that has been indexed is modified, the user must call updateIndex in order to make sure that all vertices are indexed properly.

StringLabeller

A StringLabeller is similar to an Indexer; it provides facilities for fetching vertices given strings (labels) and vice versa. However, the labels are user-defined and thus need not follow any particular pattern. Vertices that have not been labelled simply will not be accessible by the indexer.


Input and Output

JUNG currently provides partial support for two different formats: the Pajek format (reading and writing) and the GraphML file format (reading only). The relevant classes and methods may be found in the io package.

Note: Check the Javadoc documentation for a list of the commands and tags that are currently supported by the JUNG readers and writers.

The JUNG Project does not have a native file format for graph representation.


Algorithms

JUNG provides a number of different graph and network algorithms. Generally speaking, each type of algorithm has its own package.

This is one of the most complex and rapidly evolving aspects of JUNG; make sure that you check the Javadoc documentation and release notes to make sure that you have the latest information available on the operations available, and how to use them.

Graph/Matrix Operations

These algorithms reside in the main algorithms package (in the class GraphMatrixOperations and the interface MatrixElementOperations).

Matrices are one common representation for network data. GraphMatrixOperations is comprised of two classes of operations: those that are typically cast as matrix operations, but can be efficiently implemented by taking advantage of the structure of the graph; and operations that use the CERN Colt package for numerical matrix manipulation.

Some of the operations in GraphMatrixOperations require an implementation of MatrixElementOperations, to specify the behavior of the operation for a particular element type. Essentially, the methods of MatrixElementOperations specify how the equivalent of the vector inner (dot) product is to proceed. For example, implementations of this interface will specify whether the elements of the graph should be treated as numeric or Boolean values.

Currently, GraphMatrixOperations consists of the following operations:

The matrix types that are returned by graphToSparseMatrix and mapTo1DMatrix can be passed to the CERN Colt libraries, which allows users to make use of the tools provided by those libraries on their JUNG graphs.

Clustering

These algorithms reside in the algorithms.cluster package.

A cluster is a collection of objects that are all similar to each other in some way. In a network, similarity is often based on topological properties such as connectivity, but can also be based on the properties of vertices or edges in the network.

Currently, this package provides the following classes:

Topology, Paths, and Flows

These algorithms perform operations on (and calculate properties of) graphs that relate to the graph's topology (that is, the structures and substructures formed by the ways that the vertices are linked together by edges).

This category of algorithms includes the following classes:

Importance

These algorithms reside in the algorithms.importance package.

Network importance algorithms measure the importance of each vertex (or edge) according to a set of criteria that is usually based on the positioning of the vertex/edge relative to the rest of the graph.

Some of the following algorithms assume that they are given a Markov network: a directed weighted graph in which the vertices represent states, the edges represent possible state transitions, and the edge weights represent transition probabilities. The stationary probability for a vertex v in such a network is the limiting probability that, given an arbitrary starting state and a large number of transitions, the current state will be that of v.

This category of algorithms includes the following classes (in addition to other classes that provide a structural relationship between these classes):

Optimization

[under construction]

Statistics

These algorithms reside in the statistics package, and include the following classes:

Permutations

[under construction]


Filtering

The JUNG filtering mechanism removes selected vertices and edges from input graphs, and returns new graphs. These new graphs are copies of the original, containing all the same vertices and edges except for those that have been removed. A Filter takes in a Graph, and returns an UnassembledGraph.

An UnassembledGraph is a temporary storage mechanism for nodes and edges: it holds all the vertices (and at least all the edges) that will be placed into the final, filtered graph. In some circumstances, just knowing which vertices pass the filter is sufficient; this information can be accessed directly from the UnassembledGraph with the calls getUntouchedEdges() and getUntouchedVertices(), which return the set of edges that passed the filter, and the set of vertices that passed the filter, respectively. However, most of the time, one wants to access the new graph that passes the filter; this is done with the UnassembledGraph method called assemble(), which builds the new graph. assemble() copies every vertex that passed the filter into the new graph, and then copies each edge that passed the original filter into the new graph if both of its incident vertices also passed the filter (thus ensuring that the resulting graph is well-formed). Note that this means that some edges returned by getUntouchedEdges() will not be copied into the new graph.

assemble() can be slow, so it is sometimes desirable to string together several filters in a row, and not call assemble until the last Filter has been run. This is done by creating a filter that implements the EfficientFilter interface. An EfficientFilter is a type of Filter that can filter an UnassembledGraph, and return another UnassembledGraph. A filter which examines structural properties of graphs is probably not appropriate to implement as an EfficientFilter, because UnassembledGraphs may contain incorrect topology information (in particular, as noted above, the edge set may include some ill-formed edges). It is the responsibility of the user to determine whether a given filtering mechanism can be implemented as an EfficientFilter.

While a user can write a custom filter merely by implementing the interface, it is often easiest to extend one of the two provided base Filter classes, VertexAcceptFilter and EdgeAcceptFilter. Both of these require the user to write a method--boolean acceptVertex(vertex) or boolean acceptEdge(edge), respectively. By default, these are not declared to be EfficientFilters; however, users may certainly create extensions of these filters that are EfficientFilters.

The SerialFilter mechanism applies a series of filters sequentially to a specified graph, in the order in which they were added to the SerialFilter. As the filters are applied, it checks to see whether each one is an EfficientFilter, and calls assemble as necessary.

The LevelFilter interface was designed to be used in conjunction with the GraphDraw mechanism. LevelFilters are filters that take an integer parameter, which is used to determine the operation of the filter (for instance, filtering all edges with weight less than the value of this parameter). With a LevelFilter, a slider on a visualization can be tied directly into the Filter, and thus can allow the user to control this parameter directly, and generate a dynamically changing graph.

Detailed documentation and sample filtering code can be found within the Javadoc for those classes. In particular, look at the package-level Javadoc for the class.


Visualization

The JUNG visualization package provides mechanisms for laying out and rendering graphs. The current renderer implementations use the Java Swing API to display graphs, but these may be implemented using other toolkits.

In general, a visualization is accomplished with

Thus, by selecting one of each of these three, it is possible to coordinate drawing. The default implementation traverses the Layout, asking it for locations of vertices, and then paints them individually with the Renderer inside the Swing component. In addition, the GraphDraw infrastructure simplifies many of these transformations by packaging the VisualizationViewer, the Renderer, and the Layout together. Users may then customize this viewer as appropriate. (Sample code is available in the GraphDraw documentation.)

The current implementation supports only 2D layout algorithms of (binary) Graphs; it does not support visualization of other graph types (such as hypergraphs).

This package, and the visualization.graphdraw package, also includes utilities and support classes that facilitate customization of a graph visualization. For instance, see the notes at FadingVertexLayout for a mechanism that can be used to create fading effects when vertices are filtered out and subsequently restored.


Utilities

JUNG includes a number of utility convenience classes, including:


Sample Code

The JUNG distribution includes several examples of code whose purpose is to show users how to perform a few simple tasks using JUNG. These examples include: