In a recent study posted to the bioRxiv* preprint server, researchers described intragenomic rearrangements in β-coronaviruses (CoVs), including severe acute respiratory syndrome CoV 2 (SARS-CoV-2).

Genomic variation observed across distantly or closely related CoVs includes recombination, insertions, deletions, and point mutations. Through these variations, CoVs demonstrate significant genomic resilience and plasticity. The high incidence of recombination has been attributed to the template-switching process driven by the ribonucleic acid (RNA)-dependent RNA polymerase (RdRp), which CoVs use for transcription and for regulating the expression of structural and accessory genes.

Study: Intragenomic rearrangements of SARS-CoV-2 and other β-coronaviruses. Image Credit: NIAID

About the study

In the present study, the researchers reported several genomic insertions of 5′-untranslated region (UTR) sequences into the coding regions of SARS-CoV-2 and other β-CoVs. To find them, the team used 5′-leader nucleotide sequences (i.e., the leader transcription regulatory sequence, TRS-L) and the corresponding amino acid sequences translated in the three reading frames as search terms.

Thereby, they outlined the presence of intragenomic rearrangements encompassing segments of the 5′-leader sequences in temporally and geographically diverse isolates of SARS-CoV-2.
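The search strategy described above, which builds search terms from a nucleotide sequence and its amino acid translations in three reading frames, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function names `translate_three_frames` and `find_peptide`, the short query string, and the toy protein are all assumptions made for this example, and only the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_three_frames(nucleotides):
    """Translate a nucleotide string in forward reading frames 0, 1, and 2."""
    seq = nucleotides.upper().replace("U", "T")  # accept RNA or DNA letters
    frames = []
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        # Codons with ambiguous bases are reported as "X".
        frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames


def find_peptide(query, protein):
    """Return every 0-based start index of `query` within `protein`."""
    hits = []
    start = protein.find(query)
    while start != -1:
        hits.append(start)
        start = protein.find(query, start + 1)
    return hits


# Illustrative query only; not a verified extract of any genome.
query_nt = "GATCTGTTCTCTAAA"
search_terms = translate_three_frames(query_nt)

# Hypothetical protein sequence used only to show the lookup.
toy_protein = "MADLFSKQQDLFSK"
hit_positions = find_peptide(search_terms[0], toy_protein)
```

In practice, searches of this kind are run against sequence databases with alignment tools that tolerate mismatches; exact substring matching is used here only to keep the sketch short.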


The results indicated that every translocated 5′-UTR nucleotide sequence contained TRS-L together with varying portions of stem-loop 2 (SL2) and SL3. The presence of TRS-L might affect expression of the nucleocapsid (N) gene, which is located close to the ORF8 gene. All insertions modified the carboxyl-terminus of ORF8, and in some isolates they were associated with additional modifications such as insertions, deletions, and point mutations. A comparable insertion at the carboxyl-terminus of ORF8 was observed in five Sarbecovirus β-CoVs obtained from Rhinolophus bats inhabiting Indochina, Southwest China, and England.

The crystal structure of SARS-CoV-2 ORF8 revealed an approximately 60-residue core comparable to that of SARS-CoV-2 ORF7a. However, two dimerization interfaces, one noncovalent and one covalent, are unique to SARS-CoV-2 ORF8. Residues R115, D119, F120, and I121 contribute to the covalent dimer interface and lie at the C-terminus of ORF8, the region modified by the 5′-UTR-derived insertions. Further, R115 and D119 form salt bridges surrounding a central hydrophobic core, where V117 interacts with its symmetry-related counterpart. The alterations induced by the insertions might result in immune evasion by SARS-CoV-2 by influencing the interaction of ORF8 with intracellular transport signaling, and might ultimately contribute to the SARS-CoV-2-related cytokine storm.

Around 5% of CD4+ T cells were ORF8-specific in the majority of SARS-CoV-2 cases. Additionally, ORF8 accounted for 10% of the CD8+ T cell reactivity in individuals who had recovered from SARS-CoV-2 infection. Anti-ORF8 antibodies were seen in asymptomatic and symptomatic SARS-CoV-2 patients during the early phase of infection. Diagnostic techniques for SARS-CoV-2 infection that target only accessory proteins or genes such as ORF8 might therefore be affected by these insertions.

In two SARS-CoV-2 isolates, a fragment of the SARS-CoV-2 5′-UTR leader sequence shorter than that reported for the ORF8 insertions was duplicated and translocated to the end of ORF7b. One of these isolates had a truncated ORF8 gene, and the other had a truncated ORF7b. Although the significance of SARS-CoV-2 ORF7b is unknown, cell culture evidence has led to the hypothesis that it mediates tumor necrosis factor (TNF)-induced apoptosis, and it has also been proposed to cause olfactory receptor malfunction by inducing autoimmunity.

A comparable fragment of the 5′-UTR, matching the leader sequence, was observed inside the SARS-CoV-2 structural N gene at the end of the serine and arginine (SR) region. The 5′-UTR-derived segment had modifications in five of the seven positions, including R203K/G204R, which are well-known co-occurring mutations in the N protein. However, the majority of the N protein sequences were highly conserved, with just one or two amino acid changes between isolates.
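For readers unfamiliar with the substitution notation used here, R203K means that arginine (R) at position 203 is replaced by lysine (K). The short sketch below shows how such labels can be derived from two pre-aligned protein fragments. The helper `list_substitutions`, the six-residue fragments, and the assumed starting position of 201 are hypothetical choices for illustration, not data from the study.

```python
def list_substitutions(reference, variant, first_position=1):
    """Label differences between two pre-aligned, equal-length protein strings.

    Each label has the form <reference residue><position><variant residue>.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        f"{ref_aa}{position}{var_aa}"
        for position, (ref_aa, var_aa) in enumerate(
            zip(reference, variant), start=first_position
        )
        if ref_aa != var_aa
    ]


# Toy fragments; the numbering (starting at 201) is assumed for illustration.
reference_window = "SSRGTS"
variant_window = "SSKRTS"
labels = list_substitutions(reference_window, variant_window, first_position=201)
print(labels)
```

The sketch assumes no insertions or deletions between the two fragments; real comparisons would first require a sequence alignment.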

In another group of SARS-CoV-2 isolates, an identical 5′-UTR-derived sequence was present in the N gene. However, it lacked the leucine (L) residue, and the phenylalanine (F) was changed to serine (S), bringing the sequence closer to that of the Wuhan reference strain.

A translated sequence, DLFSK, derived from a shorter segment of the 5′-UTR, was found at amino acids 36-40 of the nucleotidyltransferase (NiRAN) domain in the RdRp (Nsp12) of some SARS-CoV-2 isolates. The RdRp is usually well conserved and less prone to recombination among CoVs. Nevertheless, it was involved in the intragenomic rearrangement of 5′-UTR-derived sequences.

The 5′-UTR-derived sequences were also found in other evaluated β-CoVs: the Merbecovirus Middle East respiratory syndrome CoV (MERS-CoV), the Embecoviruses hCoV-OC43 and hCoV-HKU1, and bat Nobecovirus isolates. In the Embecoviruses, these rearrangements were in the intergenic regions between spike (S) and non-structural protein 5 (NS5) in hCoV-OC43 and between S and NS4 in hCoV-HKU1. Further, the intragenomic rearrangements occurred at the carboxyl-terminus of ORF4b and between ORFs 3 and 4a in the Merbecovirus MERS-CoV, and at the Y1 cytoplasmic tail domain of Nsp3 in Nobecoviruses derived from Eidolon and African Rousettus bats.

Nevertheless, no 5′-UTR insertions were observed in SARS-CoV-1, human α-CoVs, feline infectious peritonitis virus, the Tegacovirus feline CoV, other β-CoVs such as bovine CoV, δ-CoVs such as porcine delta CoV, or γ-CoVs such as turkey CoV. Further, 3′-UTR-derived insertions were absent in all the evaluated viruses.


The study findings revealed intragenomic rearrangements in which 5′-UTR-derived sequences are inserted into the coding regions of the SARS-CoV-2 genome. According to the authors, this is the first systematic description of 5′-UTR insertions in β-CoVs, including SARS-CoV-2.

The template-switching process appears to involve, or to be affected by, the accessory and structural SARS-CoV-2 genes, which provide new regions of homology for contact with TRS-L. The 5′-UTR-derived sequences involved in the intragenomic rearrangements in SARS-CoV-2 demonstrated in this study typically contain TRS-L. In addition, they cover about 50% of the 5′ conserved complementary sequences (CCSs), potentially promoting circularization of the genome from regions closer to the 3′-UTR.

Overall, the present study on intragenomic rearrangements demonstrates the remarkable genomic flexibility of CoVs, which underpins the changes in virulence, immune escape, and transmissibility reported during the COVID-19 pandemic.

*Important notice

bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.

