How is gene expression regulated?
The genome is physically present, in the form of one or more chromosomes, in every cell, or in almost every cell in multicellular organisms. Yet not every gene is equally active at a given time in a given cell. Proteins, the effectors of metabolism, are ‘expressed’ from the genome in a timely, regulated manner so that the cell can ‘behave’ according to its environment.
For example, in humans, it has been estimated that about 6,000 genes are equally expressed in every cell. These are called ‘housekeeping genes’. This means that what makes a bone cell different from a brain cell or a muscle cell lies in the differential expression of the other genes. It has been estimated that a given cell simultaneously expresses 30 to 60% of its genome, meaning that 4 to 14 thousand genes are differentially regulated to determine cellular fate.
Let us focus on transcription.
Using the genomic information to ‘express’ a protein is a multistep process. First, the coding sequence of a gene is transcribed into a messenger molecule that serves as the template for protein biosynthesis. The chemical nature of the messenger molecule (ribonucleic acid, RNA) is close to that of the genomic one (deoxyribonucleic acid, DNA). The information is still encoded in the sequence of nucleotides, but the ‘alphabet’ of RNA differs by one ‘letter’, since T nucleotides are replaced by U nucleotides. Also, RNAs are single-stranded molecules, while genomic DNA is double-stranded. The messenger RNA (mRNA) is now a new template, recognized by ribosomes, which translate the nucleotide sequence into a protein sequence using the genetic code.
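The two steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not a bioinformatics tool: the function names transcribe and translate and the toy sequence are made up for the example, the DNA is assumed to be given as the coding strand read 5' to 3', and translation is assumed to start at the first nucleotide and to stop at the first stop codon.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Translate codon by codon until a stop codon ('*') or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Hypothetical toy gene: start codon, one alanine codon, stop codon.
mrna = transcribe("ATGGCCTGA")
protein = translate(mrna)
print(mrna, protein)  # AUGGCCUGA MA
```

Proteins are written here with one-letter amino acid codes, so ‘MA’ stands for methionine followed by alanine.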
In general, there is a correlation between the level of transcription of a gene (the number of mRNA copies present in a cell) and the amount of the protein corresponding to that gene. For example, actin and myosin are two important proteins in muscle cells, and the mRNAs coding for actin and myosin are more abundant in these cells.
Now, how does a cell physically regulate the amount of mRNA it produces for each active gene? Well, it uses proteins. The actual transcription step, meaning making an RNA copy of a DNA molecule, is performed by proteins called RNA polymerases. These proteins can read the DNA sequence and synthesize a complementary copy of it using base-pair complementarity (A/U, T/A and C/G). RNA polymerases are helped by several other proteins: some ‘open’ the DNA molecule so it can be read and copied, some check that the copy is correct and sometimes repair it if it is not, and some actually bring the RNA polymerases to the right place to start transcription. These latter proteins are crucial for prioritizing gene expression in a given cell. They are called ‘transcription factors’.
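The complementary copying performed by RNA polymerase can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function name transcribe_template and the example strand are hypothetical, and the template strand is assumed to be written 3' to 5' so that the resulting mRNA reads 5' to 3' without any reversal.

```python
# Pairing rules used when an RNA strand is built on a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe_template(template_3_to_5):
    """Build the mRNA (5'->3') complementary to a template strand (3'->5')."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

# Hypothetical template strand, written 3'->5'.
template_mrna = transcribe_template("TACCGGACT")
print(template_mrna)  # AUGGCCUGA
```

Each base of the template dictates the base added to the growing RNA, which is why the copy carries the same information as the opposite (coding) DNA strand, with U in place of T.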
Transcription factors are proteins with the dual ability to bind DNA at specific nucleotide sequences and to recruit other proteins into complexes that ultimately bring RNA polymerases to transcribe the downstream DNA sequence. About 10% of genes encode DNA-binding proteins. The nature and the level of expression of these proteins are responsible for the integrated regulation of gene expression in response to the environment.
This intrinsically means that DNA molecules must display specific sequences, each corresponding to one or another transcription factor, for gene expression to operate in a regulated way. These sequences are called transcription factor binding sites. Several such sites can be grouped in a particular DNA region upstream of a coding sequence, making the regulation of its transcription quite complex and multi-parametric. The DNA region grouping these transcription factor binding sites is called the promoter region.
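As a rough illustration of what ‘specific sequences’ means in practice, the Python sketch below scans a DNA fragment for exact occurrences of two well-known promoter motifs, the TATA box and the CAAT box. Several things here are assumptions made for the example: the function name find_binding_sites, the promoter fragment (invented), and the use of exact string matching, since real binding sites are degenerate and are usually described with position weight matrices rather than fixed strings.

```python
import re

# Illustrative consensus motifs only; real sites tolerate variation.
MOTIFS = {
    "TATA box": "TATAAA",
    "CAAT box": "CCAAT",
}

def find_binding_sites(promoter, motifs=MOTIFS):
    """Return {motif name: [0-based start positions]} for exact matches."""
    promoter = promoter.upper()
    hits = {}
    for name, motif in motifs.items():
        # The lookahead lets overlapping occurrences be reported too.
        hits[name] = [m.start() for m in re.finditer(f"(?={motif})", promoter)]
    return hits

# Made-up promoter fragment, written 5'->3'.
promoter = "GGCCAATCTGCGTATAAAGGCATG"
sites = find_binding_sites(promoter)
print(sites)  # {'TATA box': [12], 'CAAT box': [2]}
```

A promoter carrying several such sites can respond to several transcription factors at once, which is what makes its regulation multi-parametric.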
Understanding how promoter regions work in a given cell or organism is one crucial key to understanding fundamental biology. A lot of research effort is therefore dedicated to this task. Once a promoter region has been characterized, and its function described and understood, it may become a useful tool to drive gene expression in artificial DNA molecules such as gene expression vectors. Indeed, most of the promoters used in genetic engineering are derived from naturally occurring sequences and integrated into expression vectors, so as to recreate an artificial gene structure compatible with the biological system to be manipulated. For example, scientists use bacterial promoter regions to artificially express genes in bacteria, and yeast promoters to artificially express genes in yeast.