Banjo

Banjo: Bayesian Network Inference with Java Objects

Banjo is a software application and framework for structure learning of static and dynamic Bayesian networks, developed under the direction of Alexander J. Hartemink in the Department of Computer Science at Duke University. Banjo was designed from the ground up to provide efficient structure inference when analyzing large, research-oriented data sets, while at the same time being accessible enough for students and researchers to explore and experiment with the algorithms. Because it is implemented in Java, the framework is easy to maintain and extend.

Banjo focuses on score-based structure inference; a plethora of code already exists for variable inference within a Bayesian network of known structure. Available heuristic search strategies include simulated annealing and greedy hill-climbing, each of which can be paired with evaluation of either a single random local move or all local moves at each step. A search algorithm in Banjo is assembled from a set of core components, each handling one subtask:

  • Proposing a new network (or networks), handled by a “proposer” component,
  • Checking the proposed network(s) for cycles when necessary, handled by a “cycle checker” component,
  • Computing the score(s) of the proposed network(s), handled by an “evaluator” component, and
  • Deciding whether to accept a proposed network, handled by a “decider” component.
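To make the division of labor concrete, here is a minimal sketch of how four such components could be wired into a search loop. It is not taken from Banjo's source: every interface, class, and method name below (SearchSketch, Proposer, CycleChecker, Evaluator, Decider, toggleRandomEdge, and so on) is hypothetical, the network is represented as a plain adjacency matrix with at least two nodes, and the scoring function is left to the caller.

```java
import java.util.Random;

// Hypothetical sketch; none of these names come from Banjo's source.
final class SearchSketch {
    interface Proposer { boolean[][] propose(boolean[][] current, Random rng); }
    interface CycleChecker { boolean isAcyclic(boolean[][] graph); }
    interface Evaluator { double score(boolean[][] graph); }
    interface Decider { boolean accept(double currentScore, double proposedScore, Random rng); }

    // Proposer: a single random local move that toggles one directed edge.
    // Assumes the graph has at least two nodes.
    static boolean[][] toggleRandomEdge(boolean[][] g, Random rng) {
        int n = g.length;
        boolean[][] copy = new boolean[n][];
        for (int i = 0; i < n; i++) copy[i] = g[i].clone();
        int from = rng.nextInt(n);
        int to = rng.nextInt(n - 1);
        if (to >= from) to++; // never propose a self-loop
        copy[from][to] = !copy[from][to];
        return copy;
    }

    // Cycle checker: repeatedly remove nodes with no incoming edges;
    // the graph is acyclic exactly when every node can be removed.
    static boolean isAcyclic(boolean[][] g) {
        int n = g.length;
        int[] inDegree = new int[n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (g[i][j]) inDegree[j]++;
        boolean[] removed = new boolean[n];
        int removedCount = 0;
        boolean progress = true;
        while (progress) {
            progress = false;
            for (int v = 0; v < n; v++) {
                if (!removed[v] && inDegree[v] == 0) {
                    removed[v] = true;
                    removedCount++;
                    progress = true;
                    for (int w = 0; w < n; w++) if (g[v][w]) inDegree[w]--;
                }
            }
        }
        return removedCount == n;
    }

    // Decider for greedy hill-climbing: accept only strict improvements.
    static Decider greedy() {
        return (current, proposed, rng) -> proposed > current;
    }

    // Decider using a Metropolis rule at a fixed temperature; real simulated
    // annealing would also lower the temperature as the search proceeds.
    static Decider metropolis(double temperature) {
        return (current, proposed, rng) ->
            proposed >= current || rng.nextDouble() < Math.exp((proposed - current) / temperature);
    }

    // The search loop: propose, check for cycles, evaluate, decide.
    static boolean[][] search(boolean[][] start, int steps, Proposer proposer,
                              CycleChecker checker, Evaluator evaluator,
                              Decider decider, Random rng) {
        boolean[][] current = start;
        double currentScore = evaluator.score(current);
        boolean[][] best = current;
        double bestScore = currentScore;
        for (int step = 0; step < steps; step++) {
            boolean[][] candidate = proposer.propose(current, rng);
            if (!checker.isAcyclic(candidate)) continue; // discard cyclic proposals
            double candidateScore = evaluator.score(candidate);
            if (decider.accept(currentScore, candidateScore, rng)) {
                current = candidate;
                currentScore = candidateScore;
                if (candidateScore > bestScore) {
                    best = candidate;
                    bestScore = candidateScore;
                }
            }
        }
        return best;
    }
}
```

In this sketch, swapping greedy() for metropolis(t) changes the search strategy without touching the proposer, cycle checker, or evaluator, which is the kind of recombination the component design is meant to allow.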

These core components are organized so that they can be used to study or extend the search algorithms themselves, and a set of easily extensible statistics is provided for monitoring the search process.

The core algorithms assume discrete variables and have been optimized for them, but if some of your variables are continuous, the current version of Banjo provides simple discretization functionality using either quantile or interval discretization methods. Any number of highest-scoring networks can be retained in the search, and these networks can be posterior averaged to produce a weighted “consensus” network. The single highest-scoring network can be processed by Banjo to compute influence scores on the edges, or to generate a file formatted for rendering with dot, a graph layout and visualization tool developed by AT&T.
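The difference between the two discretization methods is easiest to see on a small example. The sketch below is illustrative only and is not Banjo's implementation: the class and method names are hypothetical, the input is assumed to be non-empty, and ties in the quantile method are broken by position rather than kept in the same bin.

```java
import java.util.Arrays;

// Hypothetical sketch; not taken from Banjo's source.
final class DiscretizationSketch {
    // Interval discretization: split [min, max] into `bins` equal-width ranges.
    static int[] interval(double[] values, int bins) {
        double min = Arrays.stream(values).min().getAsDouble();
        double max = Arrays.stream(values).max().getAsDouble();
        double width = (max - min) / bins;
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            int bin = width == 0 ? 0 : (int) ((values[i] - min) / width);
            out[i] = Math.min(bin, bins - 1); // the maximum falls in the last bin
        }
        return out;
    }

    // Quantile discretization: assign bins by rank, so that each bin
    // receives roughly the same number of data points.
    static int[] quantile(double[] values, int bins) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));
        int[] out = new int[n];
        for (int rank = 0; rank < n; rank++) {
            out[order[rank]] = (int) ((long) rank * bins / n);
        }
        return out;
    }
}
```

For the values 1, 2, 3, 100 and two bins, interval(...) yields 0, 0, 0, 1 because the outlier stretches the range, while quantile(...) yields 0, 0, 1, 1 because each bin takes half of the points.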

To speed up inference even more, Banjo 2.1 added the ability to use multiple threads, and Banjo 2.2 added the further ability to perform parallel search on a large cluster of machines.
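One common way to use multiple threads in a heuristic search is to run independent restarts in parallel and keep the best result. The sketch below shows that general pattern with the standard java.util.concurrent API; it is an assumption for illustration, not a description of how Banjo itself distributes work, and the names ParallelRestartsSketch, Result, and bestOf are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.LongFunction;

// Hypothetical sketch of parallel restarts; not Banjo's actual mechanism.
final class ParallelRestartsSketch {
    static final class Result {
        final double score;
        final long seed;
        Result(double score, long seed) { this.score = score; this.seed = seed; }
    }

    // Runs `restarts` independent searches on a fixed pool of `threads` workers.
    // `searchFromSeed` stands in for one complete search and returns its best score.
    static Result bestOf(int threads, int restarts, LongFunction<Double> searchFromSeed) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Result>> futures = new ArrayList<>();
            for (int r = 0; r < restarts; r++) {
                final long seed = r; // each restart gets its own seed
                Callable<Result> task = () -> new Result(searchFromSeed.apply(seed), seed);
                futures.add(pool.submit(task));
            }
            Result best = null;
            for (Future<Result> future : futures) {
                Result result = future.get(); // blocks until that restart finishes
                if (best == null || result.score > best.score) best = result;
            }
            return best;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("search interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("search task failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each restart depends only on its own seed, the tasks share no mutable state and need no locking; the only coordination is the final comparison of scores.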

The current version of Banjo is 2.2.0.

Licensing Overview

You may license Banjo either under a non-commercial use license or under a specially negotiated, non-exclusive commercial use license, whichever is more appropriate for your needs. For strictly non-commercial use of the software, the non-commercial use license will generally be the better fit. The term ‘commercial use’ is defined broadly: if the software is used for commercial gain or to further any commercial purpose, a commercial use license is required. If you have any questions about whether your use would be considered commercial, or if you would like to negotiate a non-exclusive commercial use license, please contact us.