Until a few years ago, when it became possible to read the genetic code in DNA easily, the process by which the genetic information in DNA is used to produce proteins in cells seemed fairly simple. In a nutshell, an enzyme called RNA polymerase would "read" the portion of DNA corresponding to a particular gene and produce a transcribed copy in the form of messenger RNA, or mRNA. The mRNA, in turn, would be "read" by a complex of RNA and protein known as the ribosome to construct a protein.
However, that's a substantial oversimplification. For one thing, there isn't a 1-to-1 correspondence between genes and proteins in higher animals such as mammals. It isn't known how many different proteins there are in a typical mammal, but humans might have several hundred thousand. Yet a surprising finding from sequencing of the human genome is that there are only about 25,000 genes. Furthermore, genes themselves are not simple in structure. It has been known for some time that most genes of eukaryotic cells consist of multiple segments called exons and introns. The raw transcript of a gene contains the information from both exons and introns, and the resulting RNA is called pre-mRNA. This pre-mRNA then undergoes an editing process called splicing, performed by a "spliceosome", in which the segments corresponding to introns are removed. The strand of "mature" mRNA that results is the mRNA that is used as a template to construct proteins. It is still mostly unknown why this unused information from introns is present at all.
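As a toy illustration of the splicing step described above, the following Python sketch treats a pre-mRNA as a plain string and cuts out the intron segments. The sequence, the intron coordinates, and the function name `splice` are all invented for illustration. A real spliceosome recognizes sequence signals at the intron boundaries; it is not handed coordinates.

```python
def splice(pre_mrna, introns):
    """Return a mature mRNA string with the intron ranges removed.

    introns is a list of (start, end) pairs, 0-based and end-exclusive,
    assumed not to overlap.
    """
    mature = []
    position = 0
    for start, end in sorted(introns):
        mature.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                           # skip over the intron itself
    mature.append(pre_mrna[position:])           # keep the final exon
    return "".join(mature)


# Hypothetical pre-mRNA: exons in upper case, introns in lower case for readability.
pre_mrna = "AUGGCC" + "guaaguuuag" + "UUUGGA" + "guccccag" + "UAA"
introns = [(6, 16), (22, 30)]  # positions of the two lower-case segments
mature_mrna = splice(pre_mrna, introns)
print(mature_mrna)
```

Only the upper-case exon segments survive in `mature_mrna`, joined in their original order.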
And if that weren't enough complexity, it turns out that the transcript of a single gene can be spliced in several different ways, each using only some of the gene's exons in a different combination. This process is called alternative splicing. The set of all mRNAs that can be produced from an organism's genome is called the transcriptome.
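To get a feel for how alternative splicing multiplies the number of possible transcripts, here is a small Python sketch of one simplified model, exon skipping. It assumes, purely for illustration, that the first and last exons are always kept and that every internal exon can be included or skipped independently. The exon sequences and the function name `exon_skipping_isoforms` are hypothetical, and a real cell produces only a regulated subset of such combinations.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """List the mRNAs obtainable by skipping any subset of internal exons.

    Assumes at least two exons; the first and last are always kept,
    and the retained exons stay in their original order.
    """
    first, internal, last = exons[0], exons[1:-1], exons[-1]
    isoforms = []
    for count in range(len(internal) + 1):
        # combinations() preserves the original order of the chosen exons
        for chosen in combinations(internal, count):
            isoforms.append("".join((first, *chosen, last)))
    return isoforms


# Hypothetical gene with four exons (two of them internal).
exons = ["AUGGCC", "UUUGGA", "CCGAAA", "UAA"]
isoforms = exon_skipping_isoforms(exons)
for isoform in isoforms:
    print(isoform)
```

Under this model a gene with n internal exons yields 2^n candidate transcripts, which shows how a genome with relatively few genes could in principle give rise to a much larger transcriptome.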
Newly published research reveals that there isn't even a simple relationship between the transcriptome and the resulting set of proteins (the "proteome"). It appears that there can be many mRNA strands that are not used as templates to make proteins.
There are several papers in the September 2 issue of Science (subscription required) that report on this research. A description of one of the papers is here: Mouse Genome Much More Complex Than Expected.
An international research team consisting of more than 100 scientists has been attempting since then to isolate and analyse the entire mRNA transcripts in the mouse. Their most astonishing finding is that more than 60 per cent of all mRNAs are not protein blueprints at all. 'We don't know what the function of these RNAs is,' the Bonn neurobiologist Professor Andreas Zimmer admits. However, they seem to be extremely important: even in such different organisms as hens and mice these ostensibly so unimportant RNAs are very similar. If they really had no function they would have mutated during the course of evolution so quickly that there would nowadays be hardly any similarity between them.
Research indicates that mice have about 180,000 types of mRNA in their transcriptome. If the majority of this mRNA isn't a protein template, what is it for? According to the Wikipedia transcriptome article (based on the research),
A study of 158,807 mouse transcripts revealed that 4520 of these transcripts form antisense partners that are base pair complementary to the exons of genes. These results raise the possibility that significant numbers of "antisense RNA-coding genes" might participate in the regulation of the levels of expression of protein-coding mRNAs.
Antisense RNA is RNA that has been transcribed from the strand of DNA that is complementary to the DNA strand which contains the original gene. This means that a piece of antisense RNA can attach itself to the complementary mRNA and block it from being used to make a protein. Consequently, such antisense RNA probably plays a role in how strongly the original gene is expressed.
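The base-pairing idea behind antisense RNA can be sketched in a few lines of Python. The example below computes the reverse complement of a short, made-up mRNA fragment and checks whether a candidate strand pairs with it perfectly. The names `antisense` and `can_pair` and the sequence itself are illustrative assumptions, and real antisense binding does not require a perfect match along the whole length.

```python
# Watson-Crick base pairs in RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def can_pair(sense, candidate):
    """True if candidate is the exact antisense partner of sense."""
    return candidate == antisense(sense)


# Hypothetical fragment of a protein-coding mRNA.
sense_mrna = "AUGGCCUUU"
antisense_rna = antisense(sense_mrna)
print(antisense_rna)
print(can_pair(sense_mrna, antisense_rna))
```

The sequence is reversed because the two strands of a paired duplex run in opposite directions.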