Roughly, bioinformatics describes any use of computers to handle biological information.

In practice, the definition used by most people is narrower; bioinformatics to them is a synonym for “computational molecular biology”—the use of computers to characterize the molecular components of living things.

What is Bioinformatics?—The Tight Definition

“Classical” bioinformatics

Most biologists talk about “doing bioinformatics” when they use computers to store, retrieve, analyze or predict the composition or the structure of biomolecules. As computers become more powerful, you could probably add simulate to this list of bioinformatics verbs. “Biomolecules” include your genetic material (nucleic acids) and the products of your genes: proteins. These are the concerns of “classical” bioinformatics, which deals primarily with sequence analysis.

Fredj Tekaia at the Institut Pasteur offers this definition of bioinformatics:

“The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information.”

It is a mathematically interesting property of most large biological molecules that they are polymers: ordered chains of simpler molecular modules called monomers. Think of the monomers as beads or building blocks which, despite having different colours and shapes, all have the same thickness and the same way of connecting to one another.

Monomers that can combine in a chain are of the same general class, but each kind of monomer in that class has its own well-defined set of characteristics.

Many monomer molecules can be joined together to form a single, far larger, macromolecule. Macromolecules can have exquisitely specific informational content and/or chemical properties.

According to this scheme, the monomers in a given macromolecule of DNA or protein can be treated computationally as letters of an alphabet, put together in pre-programmed arrangements to carry messages or do work in a cell.
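As a rough illustration of the "letters of an alphabet" idea, here is a minimal Python sketch. The names (DNA_ALPHABET, PROTEIN_ALPHABET, is_over_alphabet, composition, reverse_complement) are invented for this example and do not come from any particular bioinformatics package. The alphabets are the four standard DNA bases and the twenty standard amino acids in one-letter code; real sequence data often includes extra symbols, such as N for an unknown base, which this sketch does not handle.

```python
from collections import Counter

# Assumed alphabets for illustration: the four standard DNA bases and
# the twenty standard amino acids in one-letter code.
DNA_ALPHABET = set("ACGT")
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")

# Watson-Crick base pairing, used to read the opposite DNA strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_over_alphabet(sequence, alphabet):
    """Return True if every letter of the sequence belongs to the alphabet."""
    return all(letter in alphabet for letter in sequence.upper())


def composition(sequence):
    """Count how many times each monomer (letter) occurs in the sequence."""
    return Counter(sequence.upper())


def reverse_complement(dna):
    """Return the sequence of the opposite strand, read in its own direction."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))
```

For example, is_over_alphabet("GATTACA", DNA_ALPHABET) is True, composition("GATTACA")["A"] is 3, and reverse_complement("GATTACA") gives "TGTAATC". Once a macromolecule is reduced to a string like this, ordinary text-processing techniques (searching, counting, comparing) can be applied to it, which is the starting point for sequence analysis.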

“New” bioinformatics

The greatest achievement of bioinformatics methods, the Human Genome Project, is currently being completed. Because of this, the nature and priorities of bioinformatics research and applications are changing. People often talk portentously of our living in the “post-genomic” era. My personal view is that this will affect bioinformatics in several ways:

  • Now that we possess multiple whole genomes, we can look for differences and similarities between all the genes of multiple species. From such studies we can draw particular conclusions about species and general ones about evolution. This kind of science is often referred to as comparative genomics.
  • There are now technologies designed to measure the relative number of copies of a genetic message (levels of gene expression) at different stages in development or disease, or in different tissues. Such technologies, for example DNA microarrays, will grow in importance.
  • Other, more direct, large-scale ways of identifying gene functions and associations (for example, yeast two-hybrid methods) will grow in significance, and with them the accompanying bioinformatics of functional genomics.
  • There will be a general shift in emphasis (of sequence analysis especially) from genes themselves to gene products. This will lead to:
    • attempts to catalogue the activities of, and characterize the interactions between, all gene products (in humans): proteomics.
    • attempts to crystallize and/or predict the structures of all proteins (in humans): structural genomics.
    • fewer DNA double-helices in bad sci-fi movies.
  • What some people refer to as research or medical informatics (the management of all biomedical experimental data associated with particular molecules or patients, from mass spectroscopy to in vitro assays to clinical side-effects) will move from being the concern of those working in drug company and hospital I.T. (information technology) into the mainstream of cell and molecular biology, and will migrate from the commercial and clinical sectors to the academic one.

This FAQ concentrates on classical bioinformatics, but will, I hope, grow to cover more of the “post-genomic” aspects of the field. It is worth noting that all of the above non-classical areas of research depend upon established sequence analysis techniques.

Bioinformatics Books

I’ve divided suggested reading into books of general interest, those best suited to people coming from a computational/mathematical background and books for biologists interested in bioinformatics. Links to other lists of bioinformatics books follow this section of suggested reading.

General Introductions

Computational/Mathematical aspects

Applying bioinformatics in biological research