Introduction to Bioinformatics
Bioinformatics seeks to analyze large sets of biological data in order to answer biological questions, formulate hypotheses, and build models of the underlying biological processes.
Applications of bioinformatics
- Medicine
- Research
- Pharmaceutical
- Biotechnology
The tremendous interest in bioinformatics, a new discipline at the
intersection of molecular biology and computer science, is fueled by the
excitement surrounding the sequencing of the human genome and the promise
of a new era in which genomic research dramatically improves the human
condition. Advances in detection and treatment of disease and the
production of genetically engineered foods are among the most often
mentioned benefits. Bioinformatics is a fertile new area for programmers.
As the eminent computer scientist Donald Knuth is often quoted as saying:
"Biology easily has 500 years of exciting problems to work on" (Doernberg 1993).
The National Center for Biotechnology Information (NCBI 2001) defines
bioinformatics as follows:
"Bioinformatics is the field of science in which biology, computer
science, and information technology merge into a single discipline... There are
three important sub-disciplines within bioinformatics: the development of new
algorithms and statistics with which to assess relationships among members of
large data sets; the analysis and interpretation of various types of data
including nucleotide and amino acid sequences, protein domains, and protein
structures; and the development and implementation of tools that enable efficient
access and management of different types of information."
Damian Counsell's Bioinformatics FAQ (2001) puts it more simply: "I would
say most biologists talk about 'doing
bioinformatics' when they use computers to store, retrieve, analyze or
predict the composition or the structure of biomolecules. As computers
become more powerful you could probably add simulate to this list of
bioinformatics verbs. 'Biomolecules' include your genetic
material (nucleic acids) and the products of your genes: proteins."
While the terms bioinformatics and computational biology are often used
interchangeably, medical informatics is another field entirely. "Medical
informatics generally deals with 'gross' data, that is information from
super-cellular systems, right up to the population level, while
bioinformatics tends to be concerned with information about cellular and
biomolecular structures and systems" (Counsell 2001).
For more information, see the Definitions, Glossaries and Dictionaries
section and the Recommended Reading section of this guide.
Definitions
A quick review of basic genetic terms and concepts will help in understanding the sequence databases. The NCBI Genetics Review site is highly recommended reading: it gives a particularly good overview of the concepts and lists good references for additional information ({http://www.ncbi.nlm.nih.gov/Class/MLACourse/Original8Hour/Genetics/}). The following terms are central to understanding bioinformatics:
- Nucleotide:
- One of the structural components, or building blocks, of DNA and RNA. A nucleotide consists of a base (adenine, guanine, cytosine, and either thymine in DNA or uracil in RNA), a molecule of sugar (deoxyribose in DNA, ribose in RNA), and a molecule of phosphoric acid.
- Gene:
- A length of DNA which codes for a particular protein, or in certain cases a functional or structural RNA molecule (from PhRMA Genomics Lexicon {http://genomics.phrma.org/lexicon/}).
Less than 5% of the human genome codes for proteins. The rest consists of non-coding sequences, which may have other functions.
- Genome:
- The complete gene complement of an organism, contained in a set of chromosomes (in eukaryotes), in a single chromosome (in bacteria), or in a DNA or RNA molecule (in viruses) (from Academic Press Dictionary of Science and Technology {http://www.harcourt.com/dictionary/}).
- Genomics:
- Operationally defined as investigations into the structure and function of very large numbers of genes undertaken in a simultaneous fashion (from What is Genomics? {http://www.genomecenter.ucdavis.edu/what.html}). Genetics looks at single genes, one at a time, as a snapshot. Genomics is trying to look at all the genes as a dynamic system, over time, and determine how they interact and influence biological pathways and physiology, in a much more global sense (from Basic Genetics & Genomics http://www.genomicglossaries.com/content/Basic_Genetic_Glossaries.asp).
- Proteome:
- The complement of proteins expressed by an organism, tissue or cell type (from Proteomes and Proteomics). The concept of the proteome is fundamentally different to that of the genome: while the genome is virtually static and can be well defined for an organism, the proteome continually changes in response to external and internal events.
- Proteomics:
- The study of the full set of proteins encoded by a genome. It also covers the characterisation of patterns of gene expression at the protein level and the link between proteins and genomes. Proteomics encompasses many different approaches to protein study, from bioinformatic analysis of the protein content of genomes to large-scale direct analysis of complicated protein mixtures, and the definition of proteins' properties, interactions and modifications (from Proteomes and Proteomics {http://www.mrc-dunn.cam.ac.uk/pages/proteomes.html}).
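To make the relationship between nucleotides, genes, and proteins more concrete, here is a minimal Python sketch. It is an illustration only, not part of the resources cited above: the toy sequence, the function names, and the deliberately partial codon table are all assumptions chosen for this example. It counts the bases in a short DNA string, writes the corresponding RNA by replacing thymine with uracil, and reads the RNA three bases at a time to obtain a protein.

```python
from collections import Counter

# Partial codon table (RNA codon -> one-letter amino acid code).
# Assumption: only the codons used by the toy sequence below are listed;
# the full standard genetic code has 64 codons. "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",  # methionine, the usual start codon
    "GCU": "A",  # alanine
    "UGG": "W",  # tryptophan
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
}


def base_counts(dna: str) -> Counter:
    """Count how often each base letter occurs in a DNA string."""
    return Counter(dna.upper())


def transcribe(dna: str) -> str:
    """Write the coding strand as RNA: thymine (T) becomes uracil (U)."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """Read RNA in three-base codons until a stop codon is reached."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical five-codon "gene", invented for this example.
toy_gene = "ATGGCTTGGAAATAA"

rna = transcribe(toy_gene)
print(base_counts(toy_gene))
print(rna)
print(translate(rna))  # MAWK
```

Real analyses would use the complete genetic code and an established library rather than a hand-written table; the sketch only shows how the definitions above fit together.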
Glossaries and Dictionaries
- Science Magazine: Functional Genomics Resources: "Finding the right word: A guide to some useful online glossaries" Post-genomics, biotech and bioinformatics - {http://www.sciencemag.org/feature/plus/sfg/education/glossaries.dtl#postgenomics}
- An excellent selective list, ranked by the site's editors, of the ten "best"
online glossaries. See also glossaries on related topics at this site.
- Access Excellence Graphics Gallery - http://www.accessexcellence.org/AB/GG/
- "Graphics Gallery is a series of labeled diagrams with explanations representing the important processes of biotechnology. Each diagram is followed by a summary of information, providing a context for the process illustrated."
- Genomics Glossary - http://www.genomicglossaries.com/
- Actually a collection of several glossaries and taxonomies, including a
Bioinformatics Glossary at http://www.genomicglossaries.com/content/Bioinformatics_gloss.asp.
The Scout Report and Science Magazine praise this resource highly. The content
is very good, but this author found the site cluttered and difficult to
navigate.
- Human Genome Project Information Glossary - {http://www.ornl.gov/sci/techresources/Human_Genome/glossary/}
- A useful glossary of genetics terms from the DOE Human Genome Program
that you can both browse and search.
- National Human Genome Research Institute (NHGRI) Glossary of Genetic Terms - {http://www.genome.gov/glossary.cfm}
- This is sometimes called the "talking glossary" since audio clips
allow you to hear definitions and longer explanations given by an expert.
Try it with the word "nucleotide." Illustrations are also sometimes
available.
- PhRMA Genomics Lexicon - {http://genomics.phrma.org/lexicon/}
- This extensive glossary is sponsored by the Pharmaceutical Research and
Manufacturers of America. Also provides links to other dictionaries and glossaries.
News
- Southwest Biotechnology and Informatics Center (SWBIC): News - {http://www.nbif.org/links/1.20.php}
- Annotated directory of news sites, many focusing on bioinformatics
(scroll down past the long table of contents to see the content). A good
"launch pad" to news sites.
- Genomics Today - {http://genomics.phrma.org/today/}
- A daily headline news service that provides links to genomics news in other
sites. It culls the relevant headlines from a wide variety of sources including
wire services, newspapers, Yahoo, selected web sites, and university news sites.
Sponsored by the Pharmaceutical Research and Manufacturers of America.
- GNN: Genome News Network - {http://www.genomenewsnetwork.org/index.php}
- Good source for news on scientific, as opposed to business, aspects of
bioinformatics. Bioinformatics news is clearly marked. The short news summaries are
to be commended for giving the full citation to the original scientific article at
the end of each news piece. In addition to news there are also featured articles
and a few educational links.
- The Scientist - http://www.the-scientist.com/
- Frequent coverage of bioinformatics news. Registration is free; once
registered, you are automatically e-mailed the table of contents of each
biweekly issue.