Long ORF cloning

Working with long ORF (Open Reading Frame) sequences.

For this project, our customer aimed to over-express a long ORF encoding a large protein, referred to here as Protein A, consisting of 2774 amino acids (AA). In addition, Protein A was to be fused to a fluorescent protein (eGFP; 238 AA), resulting in an ORF of 3012 AA. The DNA sequence to be inserted into a transient expression vector was therefore over 9 kb (Vector A).
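The size figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes a direct fusion with no linker and a single stop codon, neither of which is specified in the text.

```python
# Lengths in amino acids (AA), as stated in the text.
PROTEIN_A_AA = 2774
EGFP_AA = 238


def fusion_length_aa(*parts_aa):
    """Total length of a fusion protein, assuming no linker residues."""
    return sum(parts_aa)


def coding_length_bp(length_aa, include_stop=True):
    """Coding DNA length: 3 bp per codon, plus one stop codon if requested."""
    return 3 * length_aa + (3 if include_stop else 0)


fusion_aa = fusion_length_aa(PROTEIN_A_AA, EGFP_AA)
fusion_bp = coding_length_bp(fusion_aa)

print(f"Fusion ORF: {fusion_aa} AA, {fusion_bp} bp ({fusion_bp / 1000:.1f} kb)")
```

Under these assumptions the coding sequence alone already exceeds 9 kb, before any promoter or other vector elements are counted.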

Not only can such a large fragment prove difficult to manipulate at the cloning stage, but it also had to be synthesized in the first place. Most gene synthesis providers who accept such orders charge a high price and quote a long delivery time, mostly because of the inherent limitations of gene synthesis chemistry.

Our solution was to assemble an expression vector from 3 mid-sized synthetic products instead of 1 very large product encoding the full ORF. This way, we were free to define the parts of Protein A as we wished, taking synthesis feasibility into account, and then to piece them back together into the final construct (Vector B). We took advantage of the seamless nature of our DNA assembly technology to reconstitute the desired ORF on demand and fuse it to eGFP within the same process.

This approach also allowed us to simultaneously create a second vector in which eGFP was fused C-terminally instead of N-terminally (Vector C). This was done by re-using the exact same synthetic products, further reducing costs and delays.
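The idea of re-using the same parts for both fusion orientations can be modelled in silico. The sketch below is a simplified, hypothetical illustration using toy sequences: it assumes Protein A is split into three codon-aligned parts and that the tag is a separate part, and it represents seamless assembly as plain string concatenation. It does not describe the actual part boundaries or the assembly chemistry used in this project.

```python
def split_codon_aligned(cds, n_parts):
    """Split a coding sequence into n_parts fragments at codon boundaries."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    base, extra = divmod(n_codons, n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # Spread any leftover codons over the first fragments.
        size_bp = (base + (1 if i < extra else 0)) * 3
        parts.append(cds[start:start + size_bp])
        start += size_bp
    return parts


def assemble(parts):
    """Model a seamless (scarless) assembly as direct concatenation."""
    return "".join(parts)


# Toy stand-ins, NOT real sequences: both lack start and stop codons.
protein_a_cds = "GCT" * 30
tag_cds = "GGA" * 6

parts = split_codon_aligned(protein_a_cds, 3)

# The same three parts serve both orientations of the fusion.
n_terminal_fusion = assemble(["ATG", tag_cds, *parts, "TAA"])
c_terminal_fusion = assemble(["ATG", *parts, tag_cds, "TAA"])

# The parts must reconstitute the original ORF, in frame.
assert assemble(parts) == protein_a_cds
assert len(n_terminal_fusion) % 3 == 0 and len(c_terminal_fusion) % 3 == 0
```

Because every fragment ends on a codon boundary, the tag can be placed at either end without shifting the reading frame, which is what makes the second construct essentially free in terms of synthesis.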

When used in the cellular models studied (neuronal cells), both Vectors B and C drove Protein A expression. However, only the N-terminally fused construct localized in the cells similarly to endogenous Protein A, as shown in the figure below. C-terminally fused Protein A did not localize appropriately, probably because of disturbed interactions with other cellular partners. All in all, testing both possibilities simultaneously confirmed the validity of the N-terminal fusion at the functional level and strengthened the scientific observation.

In this figure, endogenous Protein A, detected by immunostaining, physiologically colocalizes with Protein B (left panel). GFP-fused Protein A (N-terminal fusion), over-expressed using Vector B, also colocalizes with Protein B, demonstrating the functional relevance of the construct.