Bioinformatics & Mac OS X

This used to be a page; I have now changed it to a post. I may no longer update it. It hasn’t been updated for two years anyway.

(This page is NOT finished. It is a DRAFT and still being updated, so follow the instructions at your own risk.)

How do you do bioinformatics on Mac OS X? This is an easy introduction for people who need to do bioinformatics but whose labs have no strong IT support. I will introduce some tools, but I may not tell you how to use them, and I will not cover every aspect of bioinformatics. I will focus on sequence manipulation, phylogeny inference and recombination testing.

Before we start, we need to install the following items. Please install them in order:

  • Xcode from Apple. It is available on your Mac OS X 10.4 installation CD, or you can download it from ADC. Do the default installation and you will have the things required for building programs, such as gcc, make and so on.
  • Fink. Fink is a project that brings open source software to Darwin and Mac OS X. EMBOSS, clustalw and other scientific programs are available via Fink. Refer to the Fink documentation to learn how to install these programs from source code.

Essential Tools

  • BioX
  • Clustal X
  • Muscle

Phylogenetic Tree Building

  • Lasergene’s MegAlign: commercial. Native Carbon or Cocoa program for Mac OS X. Not a universal binary yet, but it runs OK on Intel Macs. Easy-to-use GUI. In my experience, the neighbour-joining tree it builds has a topology similar to the Maximum Likelihood tree that PAUP* produces. Your institute might have already purchased a licence.
  • PAUP*: commercial. Widely considered the best program for phylogenetic analysis. Because this program hasn’t been updated for a while, the only GUI version is for Mac OS 9 or Classic. Unix, Windows and Intel Mac versions are available, but they are command line based. In my experience the OS 9 version is the best: the command line version sometimes quits with an error while executing a PAUP block, whereas the OS 9 version handles the PAUP block perfectly.
  • MrBayes: free. Uses Bayesian statistics to build phylogenetic trees. Command line. It is open source and you can compile it for Intel Macs yourself. I have another blog entry about how to build MrBayes for Intel Macs, because you have to modify the Makefile a little. This program is very slow and uses a lot of memory, especially with a large dataset.
  • Modeltest: free. Use this program together with PAUP* to estimate the evolution model when building Maximum Likelihood trees. Here is an OK step-by-step how-to. If you have Xcode installed, you can simply compile this program for your Intel or PPC based Mac. I suggest you modify the Makefile so that CFLAGS uses -O2 instead of -O.
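
The CFLAGS change suggested for Modeltest can be made by hand in any text editor. For anyone who would rather script it, here is a minimal sketch in Python. It assumes the Makefile sets the flags on a line that starts with `CFLAGS`; the function name `upgrade_optimization` and the sample Makefile text are made up for illustration and are not taken from the Modeltest source.

```python
import re


def upgrade_optimization(makefile_text: str) -> str:
    """Return the Makefile text with a bare -O replaced by -O2 on CFLAGS lines."""
    out_lines = []
    for line in makefile_text.splitlines():
        # Only touch lines that assign CFLAGS (=, :=, += or ?=).
        if re.match(r"\s*CFLAGS\s*[:+?]?=", line):
            # Match a standalone "-O" only, so "-O2", "-O3" and "-Os"
            # are left exactly as they are.
            line = re.sub(r"(?<![\w-])-O(?!\w)", "-O2", line)
        out_lines.append(line)
    return "\n".join(out_lines)


# Hypothetical Makefile content, for illustration only.
sample = "CC = gcc\nCFLAGS = -O -Wall\nLIBS = -lm"
print(upgrade_optimization(sample))
```

To use it on a real file, read the Makefile into a string, pass it through the function and write the result back, keeping a copy of the original in case the build fails.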

Phylogenetic Tree Viewing and Printing (esp. for PAUP* and MrBayes’ .nexus or .tre trees)

  • TreeView and TreeView X: free. TreeView is a program for Mac OS 9 or Classic. TreeView X runs on Mac OS X. Both have Linux/Unix and Windows versions. TreeView seems to be more compatible with PAUP* than TreeView X, especially with rooted trees.
  • Besides the specialized tree viewing and printing programs above, the following programs can also view compatible trees and print them:
  • Geneious: commercial and free. Based on Java and designed to be a suite for phylogenetic analysis, data management and so on. It can print a tree on a page of paper and export a tree as a .png image.
  • SplitsTree: free. Based on Java and designed for phylogenetic analysis, building phylogenetic networks, detecting recombination and so on. It can open nexus format trees and export them as vector graphics, for instance in svg format.
  • PowerPoint 2004 for Mac: commercial. From some of these programs you can copy and paste your tree directly into PowerPoint. Sometimes the tree arrives as a vector graphic, and once you ungroup it you can edit the tree in PowerPoint. Because PowerPoint is now one of the standard programs for creating posters, and some journals accept graphics in PowerPoint format, this is also a good way of presenting your trees.
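
When a viewer refuses to open a nexus tree file, it can help to check which trees the file actually contains. The sketch below is a rough illustration in Python, not a full NEXUS parser. It assumes each tree sits on a single line of the form `tree NAME = (...);`, optionally with a `[&U]` or `[&R]` rooting comment. It does not handle translate tables or trees that span several lines. The function name `extract_trees` and the sample data are invented for this example.

```python
import re

# One tree statement per line, e.g. "tree PAUP_1 = [&U] (A,(B,C));"
TREE_LINE = re.compile(
    r"^\s*u?tree\s+\*?\s*(\S+)\s*=\s*(?:\[&[RU]\]\s*)?(.+?;)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_trees(nexus_text: str) -> dict:
    """Map each tree name to its parenthesised tree string."""
    return {name: tree for name, tree in TREE_LINE.findall(nexus_text)}


# Made-up file content, for illustration only.
sample = """#NEXUS
begin trees;
  tree PAUP_1 = [&U] (A,(B,C));
end;
"""

for name, tree in extract_trees(sample).items():
    print(name, tree)
```

If this finds no trees in a file that should contain some, the file probably uses a layout the sketch does not cover, and opening it in a text editor is the next step.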

Other Software

  • SITES: free and open source. SITES is a program for the analysis of comparative DNA sequence data. Basic analyses include data summaries by polymorphism class, polymorphism estimates within and between groups, etc. Available for Windows and Classic PPC Mac. You can compile it from source for Intel Macs.
