Retron

Contributors: Adi Millman, Héloïse Georjon, Aude Bernheim

Description

Retrons are distinct genetic elements found in bacterial genomes that code for a reverse transcriptase (RT) and a non-coding RNA (ncRNA). These elements generate a unique satellite DNA/RNA hybrid in the cell, termed multicopy single-stranded DNA (msDNA). Retrons were recently found to function as anti-phage defense systems that protect bacteria against phage infection (N/A). Their defensive unit is composed of three components: the reverse transcriptase, the non-coding RNA, and an effector protein.

Discovery

Retrons were originally discovered in 1984, when Yee et al. (N/A) identified a short, high-copy, single-stranded, linear extrachromosomal DNA fragment in the gram-negative bacterium Myxococcus xanthus. These multi-copy single-stranded DNA fragments were termed msDNA. Further studies showed that this single-stranded DNA (ssDNA) is covalently linked to an RNA molecule (N/A). Although at the time reverse transcriptases were known only from eukaryotes and viruses, Inouye and colleagues hypothesized that msDNA must be the product of a reverse transcription reaction (N/A). Five years later, an RT was shown to be associated with the biosynthesis of msDNA (N/A, N/A); this was the first discovery of an RT in bacteria.

Although retrons were biochemically well studied and characterized, their biological function was discovered only 36 years after the discovery of msDNA (N/A). In a systematic screen for novel anti-phage defense systems in bacterial genomes (N/A), Millman et al. discovered a new defense system that contained a retron element (Retron-Eco8). Further analysis showed that retrons are enriched in bacterial defense islands and that many of them, together with their accessory proteins, confer defense against phage infection (N/A). An independent screen for defense systems, reported later that same year, reached similar conclusions, showing that retrons function in anti-phage defense (N/A).

Because they can produce DNA at high copy number within the cell, retrons have served as fertile ground for biotechnological applications since their discovery (N/A, N/A, N/A).

Molecular mechanisms

General

When the retron ncRNA (msr-msd) is transcribed, it folds into a characteristic structure that is recognized by the RT (N/A). The RT then reverse transcribes a portion of the ncRNA (msd), priming from the 2′-OH of a conserved guanosine residue located immediately after a double-stranded RNA structure within the ncRNA (N/A). During reverse transcription, cellular RNase H degrades the segment of the ncRNA that serves as the template, but not the other parts of the ncRNA (msr), yielding the mature RNA-DNA hybrid (msDNA) (N/A). In some cases, cellular nucleases have been shown to further process the msDNA (N/A, N/A, N/A).
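The reverse transcription step described above can be illustrated with a toy model. This is only a sketch under strong assumptions: the function names, the example sequence, and the msd coordinates are invented for illustration, and the model ignores RNA folding, the branched linkage at the priming guanosine, partial template retention, and any later nuclease processing.

```python
# Toy model only: the ncRNA sequence and msd coordinates used with this code
# are invented for illustration and do not describe any real retron.

# Pairing of each RNA template base with the DNA base added opposite it.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_template: str) -> str:
    """Return the cDNA, written 5'->3', complementary to an RNA written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_template))


def toy_msdna(ncrna: str, msd_start: int, msd_end: int) -> dict:
    """Split a toy ncRNA into its msr part and the DNA copied from its msd part.

    Simplification (assumption): the whole msd template is treated as degraded
    by RNase H, and the whole region upstream of it is treated as retained msr.
    """
    msr_rna = ncrna[:msd_start]
    msd_template = ncrna[msd_start:msd_end]
    return {"rna": msr_rna, "dna": reverse_transcribe(msd_template)}


if __name__ == "__main__":
    # Hypothetical 6-nt "ncRNA": first 2 nt stand in for msr, last 4 for msd.
    print(toy_msdna("GGAUGC", 2, 6))
```

The two-part return value mirrors the RNA-DNA hybrid nature of msDNA: one strand segment stays RNA (msr) while the msd segment is replaced by its DNA copy.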

Retron-Eco6 (Ec48)

The Retron-Eco6 system encodes, in addition to the retron, an effector protein containing two transmembrane domains (2TM). Retron-Eco6 was shown to protect bacteria against phage through abortive infection (Abi) by guarding the integrity of the RecBCD complex in the cell. Many phages inhibit RecBCD in order to infect the cell successfully. Upon inhibition of RecBCD, the effector protein renders the membrane permeable and the cells lyse within 45 minutes post infection (N/A).

Retron-Sen2 (St85), Retron-Eco9

The Retron-Sen2 system was shown to function as a tripartite toxin-antitoxin (TA) system. The accessory protein RcaT acts as a bona fide toxin and inhibits growth when expressed ectopically. The RT-msDNA complex acts as an antitoxin, alleviating RcaT toxicity. Several triggers were identified for the Sen2 TA system, including Dam, which was shown to methylate the mature msDNA and thus likely disrupts the RcaT–RT–msDNA complex, and RecE, which degrades mature msDNA and reduces the levels of the RT-msDNA antitoxin (N/A).

Example of genomic structure

A total of 16 subsystems have been described for the Retron system.

Here are some examples found in the RefSeq database:

The Retron_II system in Agrobacterium tumefaciens (GCF_017726655.1, NZ_CP072309) is composed of 2 proteins: NDT (WP_209089758.1) and RT_Tot (WP_209089760.1).

The Retron_III system in Dokdonia sp. 4H-3-7-5 (GCF_000212355.1, NC_015496) is composed of 3 proteins: PRTase (WP_148236012.1), WH (WP_013752369.1) and RT_Tot (WP_013752370.1).

The Retron_IV system in Pseudomonas lurida (GCF_001708485.1, NZ_CP015639) is composed of 2 proteins: RT_Tot (WP_081327059.1) and 2TM (WP_145980332.1).

The Retron_I_A system in Hafnia alvei (GCF_902387815.1, NZ_LR699008) is composed of 3 proteins: RT_Tot (WP_197737714.1), ATPase_TypeIA (WP_111329110.1) and HNH_TIGR02646 (WP_111329111.1).

The Retron_I_B system in Dickeya zeae (GCF_012278555.1, NZ_CP033622) is composed of 2 proteins: ATPase_TOPRIM_COG3593 (WP_168363308.1) and RT_Tot (WP_168363309.1).

The Retron_I_C system in Proteus vulgaris (GCF_009931275.1, NZ_CP034668) is composed of 1 protein: RT_1_C1 (WP_017628371.1).

The Retron_VI system in Enterobacter roggenkampii (GCF_013728935.1, NZ_CP056148) is composed of 2 proteins: HTH (WP_008499884.1) and RT_Tot (WP_016243639.1).

The Retron_VII_1 system in Hypericibacter terrae (GCF_008728855.1, NZ_CP042906) is composed of 1 protein: RT_7_A1 (WP_151178207.1).

The Retron_VII_2 system in Sideroxydans lithotrophicus (GCF_000025705.1, NC_013959) is composed of 2 proteins: RT_Tot (WP_013028226.1) and DUF3800 (WP_013028227.1).

The Retron_XI system in Sphingopyxis granuli (GCF_022637755.1, NZ_CP093335) is composed of 1 protein: RT_11 (WP_241940850.1).

The Retron_XII system in Tenuifilum thalassicum (GCF_013265555.1, NZ_CP041345) is composed of 1 protein: RT_12 (WP_173072943.1).

The Retron_XIII system in Clostridium saccharobutylicum (GCF_002003365.1, NZ_CP016091) is composed of 3 proteins: ARM (WP_022745963.1), WHSWIM (WP_022745966.1) and RT_Tot (WP_022745969.1).
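For readers who want to work with these examples programmatically, the component lists can be held in a small lookup table. The sketch below is illustrative only: the dictionary simply transcribes the gene names listed above, the helper names are hypothetical, and treating any gene name that starts with `RT_` as the reverse transcriptase component is an assumption based on the naming seen in these examples, not a documented rule.

```python
# Component lists transcribed from the RefSeq examples listed above.
RETRON_EXAMPLES = {
    "Retron_II": ["NDT", "RT_Tot"],
    "Retron_III": ["PRTase", "WH", "RT_Tot"],
    "Retron_IV": ["RT_Tot", "2TM"],
    "Retron_I_A": ["RT_Tot", "ATPase_TypeIA", "HNH_TIGR02646"],
    "Retron_I_B": ["ATPase_TOPRIM_COG3593", "RT_Tot"],
    "Retron_I_C": ["RT_1_C1"],
    "Retron_VI": ["HTH", "RT_Tot"],
    "Retron_VII_1": ["RT_7_A1"],
    "Retron_VII_2": ["RT_Tot", "DUF3800"],
    "Retron_XI": ["RT_11"],
    "Retron_XII": ["RT_12"],
    "Retron_XIII": ["ARM", "WHSWIM", "RT_Tot"],
}


def is_rt(component: str) -> bool:
    """Assumption: gene names beginning with 'RT_' denote the RT component."""
    return component.startswith("RT_")


def accessory_components(system: str) -> list:
    """Return the non-RT components listed for one example system."""
    return [c for c in RETRON_EXAMPLES[system] if not is_rt(c)]


def single_protein_systems() -> list:
    """Return the example systems that are listed with exactly one protein."""
    return sorted(name for name, comps in RETRON_EXAMPLES.items() if len(comps) == 1)


if __name__ == "__main__":
    print(accessory_components("Retron_I_A"))
    print(single_protein_systems())
```

Every example above includes at least one `RT_`-prefixed gene, which is consistent with the reverse transcriptase being a core component of the system.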

Experimental validation

      
graph LR;
    Fillol-Salom_2022[Fillol-Salom et al., 2022] --> Origin_0
    Origin_0[ SLATT + RT_G2_intron
Klebsiella pneumoniae's PICI KpCIUCICRE 8 
WP_023301280.1, WP_023301281.1] --> Expressed_0[Escherichia coli]
    Expressed_0[Escherichia coli] ----> T5 & HK97 & HK544 & HK578 & T7
    Fillol-Salom_2022[Fillol-Salom et al., 2022] --> Origin_0
    Origin_0[ SLATT + RT_G2_intron
Klebsiella pneumoniae's PICI KpCIUCICRE 8 
WP_023301280.1, WP_023301281.1] --> Expressed_1[Salmonella enterica]
    Expressed_1[Salmonella enterica] ----> P22 & BTP1 & ES18
    Fillol-Salom_2022[Fillol-Salom et al., 2022] --> Origin_0
    Origin_0[ SLATT + RT_G2_intron
Klebsiella pneumoniae's PICI KpCIUCICRE 8 
WP_023301280.1, WP_023301281.1] --> Expressed_2[Klebsiella pneumoniae]
    Expressed_2[Klebsiella pneumoniae] ----> Pokey & Raw & Eggy & KaID
    Fillol-Salom_2022[Fillol-Salom et al., 2022] --> Origin_1
    Origin_1[ RT Ec67 + TOPRIM
Klebsiella pneumoniae's PICI KpCIB28906 
WP_053810728.1] --> Expressed_3[Escherichia coli]
    Expressed_3[Escherichia coli] ----> T4 & T5 & HK578 & T7
    Fillol-Salom_2022[Fillol-Salom et al., 2022] --> Origin_1
    Origin_1[ RT Ec67 + TOPRIM
Klebsiella pneumoniae's PICI KpCIB28906 
WP_053810728.1] --> Expressed_4[Salmonella enterica]
    Expressed_4[Salmonella enterica] ----> det7
    Fillol-Salom_2022[Fillol-Salom et al., 2022] --> Origin_1
    Origin_1[ RT Ec67 + TOPRIM
Klebsiella pneumoniae's PICI KpCIB28906 
WP_053810728.1] --> Expressed_4[Salmonella enterica]
    Expressed_4[Salmonella enterica] ----> Pokey & KalD
    Gao_2020[Gao et al., 2020] --> Origin_2
    Origin_2[ Retron-TIR
Shigella dysenteriae 
WP_005025120.1] --> Expressed_5[Escherichia coli]
    Expressed_5[Escherichia coli] ----> T2 & T4 & T3 & T7 & PhiV-1
    Gao_2020[Gao et al., 2020] --> Origin_3
    Origin_3[ Retron Ec67 + TOPRIM
Escherichia coli 
WP_000169432.1] --> Expressed_6[Escherichia coli]
    Expressed_6[Escherichia coli] ----> T2 & T4 & T5
    Gao_2020[Gao et al., 2020] --> Origin_4
    Origin_4[ Retron Ec86 + Nuc_deoxy
Escherichia coli 
WP_001034589.1, WP_001320043.1] --> Expressed_7[Escherichia coli]
    Expressed_7[Escherichia coli] ----> T4
    Gao_2020[Gao et al., 2020] --> Origin_5
    Origin_5[ Retron Ec78 + ATPase + HNH
Escherichia coli 
WP_001549208.1, WP_001549209.1,
WP_001549210.1] --> Expressed_8[Escherichia coli]
    Expressed_8[Escherichia coli] ----> T5
    Millman_2020[Millman et al., 2020] --> Origin_6
    Origin_6[ Ec73
Escherichia coli 
WP_005025120.1*] --> Expressed_9[Escherichia coli]
    Expressed_9[Escherichia coli] ----> SECphi4 & SECphi6 & SECphi27 & P1 & T7
    Millman_2020[Millman et al., 2020] --> Origin_7
    Origin_7[ Ec86
Escherichia coli 
2514747571, 2514747569] --> Expressed_10[Escherichia coli]
    Expressed_10[Escherichia coli] ----> T5
    Millman_2020[Millman et al., 2020] --> Origin_8
    Origin_8[ Ec48
Escherichia coli 
2642317602, 2642317601] --> Expressed_11[Escherichia coli]
    Expressed_11[Escherichia coli] ----> Lambda-Vir & T5 & T2 & T4 & T7
    Millman_2020[Millman et al., 2020] --> Origin_9
    Origin_9[ Ec67
Escherichia coli 
2721121890] --> Expressed_12[Escherichia coli]
    Expressed_12[Escherichia coli] ----> T5
    Millman_2020[Millman et al., 2020] --> Origin_10
    Origin_10[ Se72
Salmonella enterica 
2633939248, 2633939247] --> Expressed_13[Escherichia coli]
    Expressed_13[Escherichia coli] ----> Lambda-Vir
    Millman_2020[Millman et al., 2020] --> Origin_11
    Origin_11[ Ec78
Escherichia coli 
2647069770, 2647069771,
2647069772] --> Expressed_14[Escherichia coli]
    Expressed_14[Escherichia coli] ----> T5
    Millman_2020[Millman et al., 2020] --> Origin_12
    Origin_12[ Ec83
Escherichia coli 
2712077840, 2712077841,
2712077841] --> Expressed_15[Escherichia coli]
    Expressed_15[Escherichia coli] ----> T2 & T4 & T6
    Millman_2020[Millman et al., 2020] --> Origin_13
    Origin_13[ Vc95
Vibrio cholerae 
2598877024, 2598877023,
2598877022] --> Expressed_16[Escherichia coli]
    Expressed_16[Escherichia coli] ----> T2 & T4 & T6
    Millman_2020[Millman et al., 2020] --> Origin_14
    Origin_14[ Retron-Eco8
Escherichia coli 
2693183786, 2693183785] --> Expressed_17[Escherichia coli]
    Expressed_17[Escherichia coli] ----> SECphi4 & SECphi6 & SECphi18 & T4 & T6 & T7
    Bobonis_2022[Bobonis et al., 2022] --> Origin_15
    Origin_15[ Retron-Sen2
Salmonella enterica serovar Typhimurium  
NP_462744.1, NP_462745.3] --> Expressed_18[Escherichia coli]
    Expressed_18[Escherichia coli] ----> T5
    Bobonis_2022[Bobonis et al., 2022] --> Origin_16
    Origin_16[ Retron-Eco9
Escherichia coli 
WP_000422112.1, WP_062914741.1] --> Expressed_19[Escherichia coli]
    Expressed_19[Escherichia coli] ----> P1vir & T2 & T3 & T5 & T7 & Ffm & Br60
    Bobonis_2022[Bobonis et al., 2022] --> Origin_17
    Origin_17[ Retron-Eco1
Escherichia coli 
WP_001320043.1, WP_001034589.1] --> Expressed_20[Escherichia coli]
    Expressed_20[Escherichia coli] ----> T5
    subgraph Title1[Reference]
        Fillol-Salom_2022
        Gao_2020
        Millman_2020
        Bobonis_2022
end
    subgraph Title2[System origin]
        Origin_0
        Origin_1
        Origin_2
        Origin_3
        Origin_4
        Origin_5
        Origin_6
        Origin_7
        Origin_8
        Origin_9
        Origin_10
        Origin_11
        Origin_12
        Origin_13
        Origin_14
        Origin_15
        Origin_16
        Origin_17
end
    subgraph Title3[Expression species]
        Expressed_0
        Expressed_1
        Expressed_2
        Expressed_3
        Expressed_4
        Expressed_5
        Expressed_6
        Expressed_7
        Expressed_8
        Expressed_9
        Expressed_10
        Expressed_11
        Expressed_12
        Expressed_13
        Expressed_14
        Expressed_15
        Expressed_16
        Expressed_17
        Expressed_18
        Expressed_19
        Expressed_20
end
    subgraph Title4[Protects against]
        T5
        HK97
        HK544
        HK578
        T7
        P22
        BTP1
        ES18
        Pokey
        Raw
        Eggy
        KaID
        T4
        det7
        KalD
        T2
        T3
        PhiV-1
        SECphi4
        SECphi6
        SECphi27
        P1
        Lambda-Vir
        T6
        SECphi18
        P1vir
        Ffm
        Br60
end
    style Title1 fill:none,stroke:none,stroke-width:none
    style Title2 fill:none,stroke:none,stroke-width:none
    style Title3 fill:none,stroke:none,stroke-width:none
    style Title4 fill:none,stroke:none,stroke-width:none
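
The protection edges in the graph above (lines of the form `Expressed_N[host] ----> phage & phage`) can be read out with a few lines of Python. This is a minimal sketch, not part of the original page: the function names are hypothetical, the regular expression assumes each edge sits on a single line as in the graph source, and the excerpt below copies only three edges from the graph for illustration.

```python
import re
from collections import defaultdict

# Three edges copied verbatim from the graph source above.
EXCERPT = """
    Expressed_7[Escherichia coli] ----> T4
    Expressed_9[Escherichia coli] ----> SECphi4 & SECphi6 & SECphi27 & P1 & T7
    Expressed_12[Escherichia coli] ----> T5
"""

# Matches "Expressed_N[host] ----> phage & phage ..." on one line.
EDGE = re.compile(r"^\s*(Expressed_\d+)\[(.+?)\]\s*-+>\s*(.+)$")


def parse_protection_edges(graph_text: str) -> dict:
    """Map each expression node to (host species, list of phages)."""
    edges = {}
    for line in graph_text.splitlines():
        match = EDGE.match(line)
        if match:
            node, host, targets = match.groups()
            edges[node] = (host, [t.strip() for t in targets.split("&")])
    return edges


def nodes_by_phage(edges: dict) -> dict:
    """Invert the mapping: phage name -> expression nodes tested against it."""
    index = defaultdict(list)
    for node, (_host, phages) in edges.items():
        for phage in phages:
            index[phage].append(node)
    return dict(index)


if __name__ == "__main__":
    parsed = parse_protection_edges(EXCERPT)
    print(parsed["Expressed_9"])
    print(nodes_by_phage(parsed))
```

Linking each `Expressed_N` node back to its system of origin would additionally require parsing the multi-line `Origin_N[...]` labels, which this sketch deliberately leaves out.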