forked from biopython/biopython
-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
245 lines (173 loc) · 9.25 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
Biopython README file
=====================
The Biopython Project is an international association of developers of freely
available Python tools for computational molecular biology.
Our user-centric documentation is hosted on http://biopython.org including
the main Biopython Tutorial and Cookbook:
* HTML - http://biopython.org/DIST/docs/tutorial/Tutorial.html
* PDF - http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
This README file is intended primarily for people interested in working
with the Biopython source code, either one of the releases from the
http://biopython.org website, or from our repository on GitHub
https://github.com/biopython/biopython
This Biopython package is open source software made available under generous
terms. Please see the LICENSE file for further details.
If you use Biopython in work contributing to a scientific publication, we ask
that you cite our application note (below) or one of the module specific
publications (listed on our website):
Cock, P.J.A. et al. Biopython: freely available Python tools for computational
molecular biology and bioinformatics. Bioinformatics 2009 Jun 1; 25(11) 1422-3
http://dx.doi.org/10.1093/bioinformatics/btp163 pmid:19304878
For the impatient
=================
To build and install Biopython, download and unzip the source code, go to this
directory at the command line, and type::
python setup.py build
python setup.py test
sudo python setup.py install
Windows users are instead recommended to use the installation packages provided
on our website, http://biopython.org
Python Requirements
===================
We currently recommend using Python 2.7 from http://www.python.org which
is the final version of Python 2. Early adopters are encouraged to try
Biopython under Python 3.3 as well.
Biopython is currently supported and tested on the following Python verions:
- Python 2.6, 2.7, 3.3 -- see http://www.python.org
This is the currently the primary development platform for Biopython.
- PyPy 1.9, 2.0, 2.1, 2.2 -- see http://www.pypy.org
Aside from modules with C code or dependent of NumPy, everything should
work. PyPy's NumPy reimplementation NumPyPy is still in progress.
- Jython 2.7 -- see http://www.jython.org
Aside from modules with C code, or dependent on SQLite3 or NumPy,
everything should work.
Please note that Biopython 1.62 was our final release to support Python 2.5
and Jython 2.5.
Dependencies
============
Depending on which parts of Biopython you plan to use, there are a
number of other optional Python dependencies - which can in general
be installed after Biopython.
- NumPy, see http://www.numpy.org (optional, but strongly recommended)
This package is only used in the computationally-oriented modules.
It is required for Bio.Cluster, Bio.PDB and a few other modules. If you
think you might need these modules, then please install NumPy first BEFORE
installing Biopython. The older Numeric library is no longer supported in
Biopython.
- ReportLab, see http://www.reportlab.com/software/opensource/ (optional)
This package is only used in Bio.Graphics, so if you do not need this
functionality, you will not need to install this package. You can install
it later if needed.
- matplotlib, see http://matplotlib.org/ (optional)
Bio.Phylo uses this package to plot phylogenetic trees. As with ReportLab,
you can install this at any time to enable the plotting functionality.
- networkx, see http://networkx.lanl.gov/ (optional) and
pygraphviz or pydot, see http://networkx.lanl.gov/pygraphviz/ and
http://code.google.com/p/pydot/ (optional)
These packages are used for certain niche functions in Bio.Phylo.
Again, they are only needed to enable these functions and can be installed
later if needed.
- rdflib, see https://github.com/RDFLib/rdflib (optional)
This package is used in the CDAO parser under Bio.Phylo, and can be installed
as needed.
- psycopg2, see http://initd.org/psycopg/ (optional) or
PyGreSQL (pgdb), see http://www.pygresql.org/ (optional)
These packages are used by BioSQL to access a PostgreSQL database.
- MySQLdb, see http://sourceforge.net/projects/mysql-python (optional)
This package is used by BioSQL to access a MySQL database.
Note that some of these libraries are not available for Jython or PyPy.
In addition there are a number of useful third party tools you may wish to
install such as standalone NCBI BLAST, EMBOSS or ClustalW.
Installation
============
First, **make sure that Python is installed correctly**. Second, we
recommend you install NumPy (see above). Then install Biopython.
Windows users should use the appropriate provided installation package
from our website (each is specific to a different Python version).
Installation from source should be as simple as going to the Biopython
source code directory, and typing::
python setup.py build
python setup.py test
sudo python setup.py install
Substitute `python` with your specific version, for example `python3`,
`jython` or `pypy`.
If you need to do additional configuration, e.g. changing the base
directory, please type `python setup.py`, or see the documentation here:
* HTML - http://biopython.org/DIST/docs/install/Installation.html
* PDF - http://biopython.org/DIST/docs/install/Installation.pdf
Testing
=======
Biopython includes a suite of regression tests to check if everything is
running correctly. To run the tests, go to the biopython source code
directory and type::
python setup.py test
Do not panic if you see messages warning of skipped tests::
test_DocSQL ... skipping. Install MySQLdb if you want to use Bio.DocSQL.
This most likely means that a package is not installed. You can
ignore this if it occurs in the tests for a module that you were not
planning on using. If you did want to use that module, please install
the required dependency and re-run the tests.
There is more testing information in the Biopython Tutorial & Cookbook.
Experimental code
=================
Biopython 1.61 introduced a new warning, `Bio.BiopythonExperimentalWarning`,
which is used to mark any experimental code included in the otherwise
stable Biopython releases. Such 'beta' level code is ready for wider
testing, but still likely to change, and should only be tried by early
adopters in order to give feedback via the biopython-dev mailing list.
We'd expect such experimental code to reach stable status within one or two
releases, at which point our normal policies about trying to preserve
backwards compatibility would apply.
Bugs
====
While we try to ship a robust package, bugs inevitably pop up. If you are
having problems that might be caused by a bug in Biopython, it is possible
that it has already been identified. Update to the latest release if you are
not using it already, and retry. If the problem persists, please search our
bug database and our mailing lists to see if it has already been reported
(and hopefully fixed), and if not please do report the bug. We can't fix
problems we don't know about ;)
* Old issue tracker: https://redmine.open-bio.org/projects/biopython
* Current issue tracker: https://github.com/biopython/biopython/issues
If you suspect the problem lies within a parser, it is likely that the data
format has changed and broken the parsing code. (The text BLAST and GenBank
formats seem to be particularly fragile.) Thus, the parsing code in
Biopython is sometimes updated faster than we can build Biopython releases.
You can get the most recent parser by pulling the relevant files (e.g. the
ones in `Bio.SeqIO` or `Bio.Blast`) from our git repository. However, be
careful when doing this, because the code in github is not as well-tested
as released code, and may contain new dependencies.
Finally, you can send a bug report to the bug database or the mailing list at
[email protected]. In the bug report, please let us know:
1. Which operating system and hardware (32 bit or 64 bit) you are using
2. Python version
3. Biopython version (or git commit/date)
4. Traceback that occurs (the full error message)
And also ideally:
5. Example code that breaks
6. A data file that causes the problem
Contributing, Bug Reports
=========================
Biopython is run by volunteers from all over the world, with many types of
backgrounds. We are always looking for people interested in helping with code
development, web-site management, documentation writing, technical
administration, and whatever else comes up.
If you wish to contribute, please visit the web site http://biopython.org
and join our mailing list: http://biopython.org/wiki/Mailing_lists
Distribution Structure
======================
- README -- This file.
- NEWS -- Release notes and news.
- LICENSE -- What you can do with the code.
- CONTRIB -- An (incomplete) list of people who helped Biopython in
one way or another.
- DEPRECATED -- Contains information about modules in Biopython that are
removed or no longer recommended for use, and how to update
code that uses those modules.
- MANIFEST.in -- Tells distutils what files to distribute.
- setup.py -- Installation file.
- Bio/ -- The main code base code.
- BioSQL/ -- Code for using Biopython with BioSQL databases.
- Doc/ -- Documentation.
- Scripts/ -- Miscellaneous, possibly useful, standalone scripts.
- Tests/ -- Regression testing code including sample data files.