Before explaining DNA transcription, let’s remember that life has evolved so that the pattern of gene expression in a given cell results from the integration of many incoming signals, triggering the appropriate cellular response to changes in its environment. No wonder DNA transcription regulatory networks ended up being so complex and sophisticated.
Of course, not every life form has evolved similarly, and consequently transcription mechanisms differ slightly between taxonomic groups. Many bibliographic resources, most of them available online, detail the transcription mechanisms, and our goal here is not to duplicate that knowledge. Instead, we summarize below some generic and specific elements to take into consideration when designing gene expression vectors.
Generally speaking, transcription is a multistep process comprising initiation, elongation and termination steps:
DNA transcription definition:
DNA transcription initiation consists of recruiting a multi-protein complex known as RNA polymerase (RNAP) to the DNA molecule, upstream (in the 5’->3’ orientation) of the transcription start site (TSS or +1). RNAPs are recruited by general transcription factors, DNA-binding proteins that recognize specific DNA sequences; these sequences can be considered the actual transcription promoter. Additional proteins can regulate RNAP recruitment positively (activators) or negatively (repressors).
These regulators are often DNA-binding proteins themselves, recognizing specific sequences near the promoter site. These sequences are known as transcription factor binding sites. DNA regions containing activator sites are described as enhancers. Finally, other proteins can bind to activators or repressors to potentiate their effect; they are described as co-activators and co-repressors. Co-activators and co-repressors can also be biochemical compounds.
In prokaryotes, general transcription factors are called sigma factors. Several sigma factors co-exist in a given cell and will specifically initiate transcription of various genes according to the environmental signals.
Specific domains of sigma factors bind DNA sequences at specific positions within the promoter, notably the promoter −10 element (called the "Pribnow box") and the promoter −35 element (−10 and −35 referring to the distance from the +1 start site) [1].
When designing an expression vector to be amplified in bacteria, keep in mind that −10 and −35 elements may occur by chance in exogenous sequences (e.g. the transgene of interest), triggering unexpected transcription in the bacterial cell, with possibly adverse outcomes for vector amplification. For example, unwanted expression of a human protein in bacteria may cause toxicity and loss of the amplification culture.
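As an illustration of this design check, here is a minimal sketch of how an insert could be screened for promoter-like −35/−10 pairs. It rests on several assumptions that are not part of the text above: the consensus hexamers used (TTGACA and TATAAT) are those commonly cited for the E. coli sigma-70 factor, the spacer window and mismatch tolerance are arbitrary illustrative values, and the function names are hypothetical. A hit only flags a region worth a closer look; it does not demonstrate that transcription actually starts there.

```python
# Consensus hexamers commonly cited for E. coli sigma-70 promoters.
# Assumption: other sigma factors and other bacteria recognise other motifs.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_promoter_like_sites(seq, max_mismatches=2, spacers=range(15, 20)):
    """Scan one strand for -35/-10 hexamer pairs close to the consensus.

    Returns a list of (start of -35 hexamer, spacer length, total mismatches),
    with 0-based positions. Thresholds are illustrative, not validated.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatches:
            continue
        for spacer in spacers:
            j = i + 6 + spacer  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            total = m35 + mismatches(seq[j:j + 6], MINUS_10)
            if total <= max_mismatches:
                hits.append((i, spacer, total))
    return hits


if __name__ == "__main__":
    # Hypothetical insert containing a perfect consensus pair, 17 bp apart.
    insert = "GGGG" + MINUS_35 + "A" * 17 + MINUS_10 + "GGGG"
    for strand_name, strand in (("forward", insert),
                                ("reverse", reverse_complement(insert))):
        print(strand_name, find_promoter_like_sites(strand))
```

Both strands are scanned in the usage example because a promoter-like element on either strand could drive transcription through the insert.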
In eukaryotes, there are three distinct RNAPs, namely RNA Pol I, RNA Pol II and RNA Pol III. Each of them transcribes a specific pool of genes. RNA Pol I is dedicated to the transcription of ribosomal RNAs (rRNAs). RNA Pol II transcribes messenger RNAs (mRNAs), small nuclear RNAs (snRNAs), small interfering RNAs (siRNAs), micro RNAs (miRNAs) and long non-coding RNAs (lncRNAs). RNA Pol III transcribes short RNAs such as transfer RNAs (tRNAs), the 5S rRNA and the U6 spliceosomal snRNA.
The initiation of transcription is historically best described for the RNA Pol II system, but recent studies have described similarities and differences in initiation by the RNA Pol I [2] and RNA Pol III [3] systems.
Broadly, all three systems resemble what happens in bacteria in that pre-initiation complexes form when general transcription factors bring RNAPs to the vicinity of the TSS by recognizing and binding core promoter elements. The nature of the transcription factors, the recognized sequences and the modes of interaction between the molecular partners do vary, reflecting the evolutionary specialization of each complex.
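For vector design, the practical consequence is that the RNA class to be produced determines which polymerase, and therefore which kind of promoter, is relevant. The mapping described in this paragraph can be sketched as a small lookup table; the dictionary and function names are hypothetical, and the table only restates the classes listed in the text rather than an exhaustive catalogue.

```python
# RNA classes transcribed by each eukaryotic RNA polymerase, as listed above.
RNAP_PRODUCTS = {
    "RNA Pol I": {"rRNA"},
    "RNA Pol II": {"mRNA", "snRNA", "siRNA", "miRNA", "lncRNA"},
    "RNA Pol III": {"tRNA", "5S rRNA", "U6 snRNA"},
}


def polymerases_for(rna_class):
    """Return the polymerases listed as transcribing the given RNA class."""
    return [pol for pol, products in RNAP_PRODUCTS.items()
            if rna_class in products]


if __name__ == "__main__":
    for rna in ("mRNA", "tRNA", "U6 snRNA"):
        print(rna, "->", polymerases_for(rna))
```

The 5S rRNA and the U6 snRNA are kept as separate entries because, per the text, they are transcribed by RNA Pol III rather than by the polymerase handling the rest of their class.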
Multiple RNAPs can transcribe the same gene simultaneously. The number of simultaneous transcription events is linked to the rate at which RNAPs are recruited to the initiation complex, which depends on the strength of the promoter/enhancer regions.
DNA transcription termination occurs when the transcribed RNA is released from the RNA-DNA-RNAP complex. As with the initiation step, distinct mechanisms are described for prokaryotic and eukaryotic organisms.
In Rho-independent terminators, the formation of a self-annealing hairpin structure in the elongating transcript destabilizes the RNAP. Additionally, the NusA protein, which recognizes the hairpin, can potentiate termination.
As a consequence, the transcriptional complex continues to progress along the DNA molecule and to transcribe RNA for several hundred to a few thousand base pairs. The resulting RNA molecule, lacking the protective 5’ cap, is degraded by exonucleases. Eventually the RNAP complex detaches from the DNA molecule, by a mechanism that is still poorly understood.
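The hairpin described here arises from an inverted repeat in the DNA, which makes it possible to look for candidate sites directly in a sequence. The sketch below is a deliberately naive search for perfect inverted repeats, working on the coding strand (so T stands for U in the transcript). Two points are assumptions beyond the text above: the check for a T-rich stretch after the hairpin reflects a feature commonly described for Rho-independent terminators, and the stem, loop and tract sizes are arbitrary illustrative values. The function names are hypothetical, and real terminator prediction relies on folding energies rather than exact matches.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_hairpins(seq, stem_len=6, min_loop=3, max_loop=8):
    """Find perfect inverted repeats that could fold into a stem-loop.

    Returns (start, stem_len, loop_len) tuples, 0-based, wherever a stem of
    `stem_len` bases is followed, after a loop of min_loop..max_loop bases,
    by its exact reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem_len - min_loop + 1):
        target = reverse_complement(seq[i:i + stem_len])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop  # start of the second half of the stem
            if seq[j:j + stem_len] == target:
                hits.append((i, stem_len, loop))
    return hits


def followed_by_t_tract(seq, hit, tract_len=8, min_t=5):
    """Check whether a hairpin hit is immediately followed by a T-rich stretch.

    Assumption: illustrative thresholds, not a validated terminator model.
    """
    start, stem_len, loop = hit
    end = start + 2 * stem_len + loop
    return seq.upper()[end:end + tract_len].count("T") >= min_t


if __name__ == "__main__":
    # Hypothetical sequence: GC-rich stem, 4-base loop, then a run of T.
    example = "AAAA" + "GCCGCC" + "TTCG" + "GGCGGC" + "TTTTTTTT" + "AA"
    for hit in find_hairpins(example):
        print(hit, followed_by_t_tract(example, hit))
```

When screening a vector, such candidates matter in the same way as promoter-like elements: a terminator-like structure inside a sequence meant to be transcribed in bacteria could cut transcription short.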