The RAVEN (Reconstruction, Analysis and Visualization of Metabolic Networks) Toolbox 2 is a software suite for MATLAB that allows for semi-automated reconstruction of genome-scale models (GEMs). It makes use of published models and/or the KEGG and MetaCyc databases, coupled with extensive gap-filling and quality control features. The software suite also contains methods for visualizing simulation results and omics data, as well as a range of methods for performing simulations and analyzing the results. The software is a useful tool for system-wide data analysis in a metabolic context and for streamlined reconstruction of metabolic networks based on protein homology.
If you are using RAVEN in any scientific work, please cite: R. Agren, et al., “The RAVEN Toolbox and Its Use for Generating a Genome-scale Metabolic Model for Penicillium chrysogenum,” PLoS Comput. Biol., vol. 9, no. 3, p. e1002980, Mar. 2013.
A manuscript describing RAVEN Toolbox 2 is currently being prepared. Citation details will therefore be updated in the near future.
Please report any technical issues and bugs here. For other issues, please contact Eduard Kerkhoven.
RAVEN can be installed by cloning the GitHub repository as described below, or by downloading and extracting one of the zipped releases. Please note that the releases do not always represent the most up-to-date version.
- A functional MATLAB installation (version 2013b or later).
- libSBML MATLAB API (version 5.15 or higher), which is utilised for importing and exporting GEMs in SBML format. Note: not needed if COBRA Toolbox is installed.
- At least one solver for linear programming:
  - Preferred: Gurobi Optimizer (version 7.5 or higher), academic license is available here.
  - Alternative/legacy: MOSEK (version 7 only), academic license is available here.
  - If the user has COBRA Toolbox installed, it is possible to use the default COBRA solver (the one which is set by changeCobraSolver).
Obtain RAVEN Toolbox in one of the following ways:
- In Terminal/Command Prompt, navigate to the desired installation directory and run the following Git command:
git clone git@github.com:SysBioChalmers/RAVEN.git
- Alternatively, download the latest release of RAVEN Toolbox as a ZIP file, and extract it to your favourite directory.
Once extracted, ensure that all other software dependencies (e.g. libSBML, Gurobi) are installed (see above for the list, below for instructions). Then, open MATLAB and run the following command:
cd('[location]/RAVEN/installation')
checkInstallation
where [location] is the directory where you installed RAVEN.
This function checks the functionality of the libSBML MATLAB API and solver software. It automatically recognises which solvers are installed and sets the first functional solver as the default RAVEN solver. The default RAVEN solver can be changed at any time by typing in MATLAB:
setRavenSolver('solverName')
Available solver names are gurobi, mosek and cobra.
On Unix-based systems, checkInstallation also checks the consistency of external binary programs. If these binaries are broken, they need to be re-compiled from their corresponding source code. See the documentation for the corresponding software for more details.
- Download libSBML from the link above and install to your favourite directory.
- In MATLAB, run the following command:
addpath('[location]/libSBML-5.x.0-matlab')
savepath
where [location] is where you installed libSBML and 5.x.0 is your libSBML version (5.15.0 or higher).
- Download from the link above and install Gurobi to your favourite location.
- Make sure you obtained a license following instructions for Windows, Mac or Unix.
- To install Gurobi in MATLAB, follow the instructions for Windows, Mac or Unix.
- Make sure that MATLAB remembers the Gurobi installation for next time, by running the following command:
savepath
- Download from the link above and install Mosek to your favourite location.
- Make sure you obtained a license following instructions.
- To install Mosek in MATLAB, follow instructions. Note: the documentation mentions version 8, but RAVEN only works with version 7 of Mosek.
- Make sure that MATLAB remembers the Mosek installation for next time, by running the following command:
savepath
- To gain access to functions from COBRA Toolbox, follow installation instructions provided here.
- To use COBRA-specified solvers (e.g. open-source GLPK solver), configure COBRA and RAVEN with the following commands:
changeCobraSolver('glpk')
setRavenSolver('cobra')
Some tutorials highlighting basic RAVEN functionality can be found in the 'tutorial' folder in the installation directory.
Hidden Markov Models for KEGG based reconstruction
Provided are pre-trained Hidden Markov Models (HMMs) for KEGG Orthology (KO) protein sets:
For de novo reconstruction of a GEM, the RAVEN function getKEGGModelForOrganism can use HMMs trained on KO protein sets. Provided are HMMs trained on KEGG Release 82.0. CD-HIT was used to obtain non-redundant representative KO protein sets by clustering proteins according to defined threshold values for sequence identity and for overlap with the longest protein in the corresponding cluster. Multiple sequence alignment with MAFFT and training with HMMER 3.1b2 were then performed. The provided archives contain only pre-trained HMMs.
HMM sets can be downloaded automatically during GEM reconstruction from KEGG (set the dataDir parameter in getKEGGModelForOrganism). Alternatively, download links are provided below. The following HMM sets are available:
dataDir | KEGG version | Phylogeny | Identity (%) | Overlap (%) |
---|---|---|---|---|
euk100_kegg82 | 82.0 | eukaryota | 100 | 90 |
euk90_kegg82 | 82.0 | eukaryota | 90 | 90 |
euk50_kegg82 | 82.0 | eukaryota | 50 | 90 |
prok100_kegg82 | 82.0 | prokaryota | 100 | 90 |
prok90_kegg82 | 82.0 | prokaryota | 90 | 90 |
prok50_kegg82 | 82.0 | prokaryota | 50 | 90 |
HMMs were trained from KO protein sets, based on KEGG Release 58.1. Multiple sequence alignment was performed with ClustalW2, whereas HMMs were trained with HMMER 2.3. All associated proteins were used in multiple sequence alignment and HMM training. In addition to pre-trained HMMs, the archives also contain multiple sequence alignment data. The following HMM sets are available:
Dataset | KEGG version | Phylogeny |
---|---|---|
eukaryota | 58.1 | eukaryota |
prokaryota | 58.1 | prokaryota |
Anybody is welcome to contribute to the development of RAVEN Toolbox, but please abide by the following guidelines.
When making any changes to an existing function (*.m-file), change the name and date near the bottom of the commented section that is included at the beginning of each function. In this section, please specify what each parameter does and what the default settings are. Have a look at existing functions to see what style is used.
- When fixing a bug in an existing function, make a separate branch from master and name the branch after the function you are fixing.
- Make commits to this branch while working on your bugfix. Note that bugfixes have to be backwards compatible.
- When you are certain that your bugfix works, make a pull request to the master branch. Also, see Pull request below.
- For other development (not bugfixes, but for instance introducing new functions or new/updated features for existing functions): make a separate branch from devel and name the branch, for instance, after the function/feature you are fixing/developing.
- Make commits to this branch while developing. Aim for backwards compatibility, and try to avoid very new MATLAB functions when possible, to accommodate users with older MATLAB versions.
- When you are happy with your new function/feature, make a pull request to the devel branch. Also, see Pull request below.
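The branching rules above can be sketched as a small shell helper. This is only an illustration under stated assumptions: raven_base_branch and raven_start_branch are hypothetical names that are not part of RAVEN, and the second function merely prints the Git commands rather than running them, so it is safe to try outside a repository.

```shell
# Hypothetical helper (not part of RAVEN): pick the branch to start from.
# Bugfixes branch off master; all other development branches off devel.
raven_base_branch() {
  case "$1" in
    fix) echo "master" ;;
    *)   echo "devel" ;;
  esac
}

# Print (do not run) the Git commands for starting work on a change.
# $1: kind of change ("fix" for a bugfix, anything else for other development)
# $2: name of the new branch, e.g. the function being fixed or developed
raven_start_branch() {
  base="$(raven_base_branch "$1")"
  echo "git checkout $base"
  echo "git pull"
  echo "git checkout -b $2"
}

# Example: starting a bugfix branch named after the function being fixed.
raven_start_branch fix optimizeProb
```

Once the work is done, the pull request targets the same branch the work started from (master for bugfixes, devel otherwise).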
Use semantic commit messages to make it easier to show what you are aiming to do:
- chore: updating binaries, KEGG or MetaCyc database files, etc.
- doc: updating documentation (in the doc folder) or explanatory comments in functions.
- feat: new feature added, e.g. new function introduced / new parameters / new algorithm / etc.
- fix: bugfix.
- refactor: see code refactoring.
- style: minor format changes of functions (spaces, semi-colons, etc., no code change).
Examples:
feat: exportModel additional export to YAML
chore: update KEGG model to version 83.0
fix: optimizeProb parsing results from Gurobi
More detailed explanation or comments can be left in the commit description.
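As a rough illustration of the convention, the following shell function checks whether the first line of a commit message starts with one of the six types listed above. The function name is_semantic_commit is hypothetical and not part of RAVEN, and the exact rule (type, colon, space, non-empty summary) is an assumption inferred from the examples.

```shell
# Hypothetical check (not part of RAVEN): succeed when the first line of a
# commit message starts with one of the six listed types, followed by a
# colon, a space and a non-empty summary.
is_semantic_commit() {
  first_line="$(printf '%s\n' "$1" | head -n 1)"
  case "$first_line" in
    "chore: "?*|"doc: "?*|"feat: "?*|"fix: "?*|"refactor: "?*|"style: "?*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Example: one message that follows the convention and one that does not.
for msg in "fix: optimizeProb parsing results from Gurobi" "updated some files"; do
  if is_semantic_commit "$msg"; then
    echo "ok:     $msg"
  else
    echo "reject: $msg"
  fi
done
```

A check like this could be run by hand before pushing; only the first line is inspected, so the longer commit description remains free-form.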
- No changes should be directly committed to the master or devel branches. Pull requests should be used.
- The person making the pull request and the one accepting the merge cannot be the same person.
- Typically, wait ~ 1 week before merging, to allow for other developers to inspect the pull request.
- A merge with the master branch typically invokes a new release (see versioning).
RAVEN Toolbox follows semantic versioning, and a version.txt file is updated with each release of the master branch.
For more systems biology related software and recently published genome-scale models from the Systems and Synthetic Biology group at Chalmers University of Technology, please visit the GitHub page. For more information and publications by the Systems and Synthetic Biology group, please visit SysBio.