Biopython

Biopython is a set of freely available tools for biological computation written in Python by an international team of developers.

It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics.

Policy

The source code is made available under the Biopython License, which is extremely liberal and compatible with almost every license in the world.

Citations

The Biopython project maintains a separate list of publications that cite or use Biopython. If you use Biopython in a scientific publication, please cite the application note (Cock et al., 2009) and/or one of the module-specific papers.

Overview

Biopython is a collection of (non-commercial) Python tools for computational biology and bioinformatics.

  • It contains classes to represent biological sequences and sequence annotations.
  • It can read and write a variety of file formats.
  • It provides programmatic access to online databases of biological information (NCBI etc.).
  • Separate modules extend Biopython’s capabilities to sequence alignment, protein structure, population genetics, phylogenetics, sequence motifs, and machine learning.
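
As a rough illustration of the first two points, the following minimal sketch builds a sequence object, translates it, and round-trips a record through the FASTA format in memory. It assumes a Biopython module has already been loaded; the sequence, the record id "example_1", and the description are made up for illustration and do not come from any real database entry.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical DNA sequence, used only for illustration.
dna = Seq("ATGGCCATTGTAATGGGCCGCTGA")

# Translate the coding sequence, stopping at the first stop codon.
protein = dna.translate(to_stop=True)

# Wrap the sequence in a record so that it carries an id and a description.
record = SeqRecord(dna, id="example_1", description="hypothetical record")

# Write the record in FASTA format to an in-memory text handle,
# then read it back to show the round trip.
handle = StringIO()
SeqIO.write(record, handle, "fasta")
handle.seek(0)
parsed = SeqIO.read(handle, "fasta")

print(protein)
print(parsed.id, len(parsed.seq))
```

In real use, the in-memory handle would typically be replaced by a file name, and SeqIO.parse would be used for files that contain more than one record.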

A longer description of Biopython and how to use it can be found in the Biopython documentation. In particular, the Biopython Tutorial and Cookbook contains a Quick Start section.

Biopython at HPC2N

At HPC2N, Biopython is available as a module on Kebnekaise. To see the available versions, log in to Kebnekaise and run ml spider Biopython.

Usage at HPC2N

To use Biopython, load its module to add it to your environment. Run this command to see how to load Biopython and its prerequisites:

ml spider Biopython

To see how to load a specific version, including its prerequisites, run:

ml spider Biopython/<version>
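
Once the module and its prerequisites are loaded, one way to confirm that Biopython is visible to the Python interpreter is a short check like the sketch below. The helper name biopython_version is made up for this example and is not part of Biopython or of the module system; the script only reports whether the Bio package can be found and, if so, which version it is.

```python
import importlib.util


def biopython_version():
    """Return the Biopython version string, or None if Bio cannot be found."""
    # find_spec returns None when the top-level package is not on the path.
    if importlib.util.find_spec("Bio") is None:
        return None
    import Bio

    # Fall back to "unknown" in case the attribute is missing.
    return getattr(Bio, "__version__", "unknown")


version = biopython_version()
if version is None:
    print("Biopython was not found; load the module first (see ml spider Biopython).")
else:
    print(f"Biopython {version} is available.")
```

If the script reports that Biopython was not found, the most likely cause is that the module, or one of the prerequisites listed by ml spider, has not been loaded in the current session.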

Additional info