Phenotype Microarray data analysis in BioPython

Everyone who does computational biology and has written at least one Python script probably knows about the BioPython library. I personally remember going “Oh!” some years ago when I gave up writing my own (horrible, terrible, clunky) GenBank file parser and discovered it. Since then it has been a central part of almost all the small scripts I have needed to write. Recent versions have become even more useful, with the inclusion of a very cool KEGG API wrapper, which has the side effect of bringing together two well-designed pieces of bioinformatics software!

It is therefore with great pleasure that I’m announcing the addition of the Bio.phenotype module to BioPython, starting from version 1.67. The module lets you parse and write the outputs of Phenotype Microarray experiments, as well as run some simple analyses on the raw data. Even though I have published another piece of software in the past to run the same analysis (plus some more), I thought that a simpler library would prove useful for many, and that having it as part of BioPython would make it more easily accessible. Moreover, from a software development perspective it is worth noting that BioPython follows very strict practices, which ensure that code is properly written, tested and maintained. This is all possible thanks to the great work of the BioPython community in general and of Peter Cock in particular.
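To give a rough idea of what working with the module can look like, here is a minimal sketch (not the tutorial's code). The well ID, plate ID and signal values are made up for illustration, and the helper names `basic_parameters` and `fitted_parameters_from_csv` are my own, not part of Bio.phenotype. The sigmoid fit in the second helper needs SciPy and a real Phenotype Microarray CSV file, so it is only defined here, not called.

```python
from Bio import phenotype
from Bio.phenotype.phen_micro import PlateRecord, WellRecord


def basic_parameters(well):
    """Return the curve parameters that need no sigmoid fitting."""
    # With function=None no fitting is attempted; only min, max and
    # average_height are computed from the raw signals.
    well.fit(function=None)
    return {
        "min": well.min,
        "max": well.max,
        "average_height": well.average_height,
    }


def fitted_parameters_from_csv(path, well_id="A01"):
    """Read a single-plate PM CSV file and fit one well (requires SciPy).

    `path` is a hypothetical file name supplied by the caller. Use
    phenotype.parse() instead if the file holds more than one plate.
    """
    plate = phenotype.read(path, "pm-csv")
    well = plate[well_id]
    well.fit()  # tries the gompertz, logistic and richards models in turn
    return {
        "plateau": well.plateau,
        "slope": well.slope,
        "lag": well.lag,
        "area": well.area,
    }


# Synthetic stand-in for real data: time (hours) -> signal (made-up values).
signals = {0.0: 10.0, 1.0: 20.0, 2.0: 60.0, 3.0: 90.0, 4.0: 100.0}
well = WellRecord("A01", signals=signals)
plate = PlateRecord("PM01", [well])

params = basic_parameters(plate["A01"])
print(params)
```

With a real file, something like `fitted_parameters_from_csv("my_plate.csv")` (file name assumed) would return the plateau, slope, lag and area of the chosen well, provided one of the sigmoid models can be fitted to it.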

Figure: Example PM well. An example well from a Phenotype Microarray experiment, showing the parameters estimated by the Bio.phenotype module (code is in the tutorial at the end of the post).

So, I really hope that this small library will prove useful to anyone with Phenotype Microarray CSV files collecting dust in their filesystem. To make it even easier, I’ve posted a small tutorial here, which also includes some downstream analyses and plots that are not covered in the BioPython manual (the tutorial is also embedded at the end of this post).

Happy analysis!