pAgo
Description
Argonaute proteins comprise a diverse protein family and can be found in both prokaryotes and eukaryotes (N/A). Despite low sequence conservation, eukaryotic Argonautes (eAgos) and long prokaryotic Argonautes (pAgos) generally have a conserved domain architecture and share a common mechanism of action: they use a 5'-phosphorylated single-stranded nucleic acid guide (generally 15-22 nt in length) to target complementary nucleic acid sequences (N/A). eAgos strictly mediate RNA-guided RNA silencing, while pAgos show higher mechanistic diversification and can make use of guide RNAs and/or single-stranded guide DNAs to target RNA and/or DNA (N/A). Depending on the presence of catalytic residues and the degree of complementarity between the guide and target sequences, eAgos and pAgos either cleave the target or recruit and/or activate accessory proteins. This can result in degradation of the target nucleic acid, but might also trigger alternative downstream effects, ranging from poly(A) tail shortening, RNA decapping (N/A), or chromatin formation in eukaryotes (N/A) to abortive infection in prokaryotes (N/A).
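The guide-target pairing described above can be illustrated with a minimal sketch. It assumes a DNA guide and a DNA target, and it only reports sites with perfect complementarity over the full guide length; real Argonautes can tolerate mismatches, so this is a simplification. The function names `reverse_complement` and `find_target_sites` and the example sequences are hypothetical and chosen for illustration only.

```python
# Toy model of guide-directed target recognition (perfect matches only).
# Assumption: DNA guide, DNA target, no mismatch tolerance.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_target_sites(guide, target):
    """Return 0-based start positions in `target` that are fully
    complementary to `guide` (i.e. equal to its reverse complement)."""
    site = reverse_complement(guide)
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping sites are also found.
        start = target.find(site, start + 1)
    return positions


# Hypothetical 16-nt guide, within the 15-22 nt range mentioned above.
guide = "ACGTTGCAAGGCTTAC"
target = "TTT" + reverse_complement(guide) + "AAA"
sites = find_target_sites(guide, target)
```

Here `sites` holds the start position of the single complementary site embedded in the made-up target strand.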
Molecular mechanism
Based on their phylogeny, Agos have been subdivided into various (sub)clades. eAgos are generally subdivided into the AGO and PIWI clades, but these will not be discussed further here. pAgos can be further subdivided into long-A pAgos, long-B pAgos, short pAgos, SiAgo-like pAgos, and PIWI-RE proteins (N/A, N/A, N/A, N/A, N/A). Below, we briefly outline the general mechanism of pAgos that have a demonstrated role in host defense.
Long-A pAgos
Akin to eAgos, most long-A pAgos characterized to date have an N-L1-PAZ-L2-MID-PIWI domain architecture (N/A). In contrast to eAgos, however, certain long-A pAgos use a single-stranded guide DNA to bind and cleave complementary target DNA sequences (N/A, N/A, N/A, N/A, N/A, N/A). Long-A pAgos are preferentially programmed with guide DNAs targeting invading DNA through a poorly understood mechanism, which might involve DNA repair proteins (N/A) or the pAgo itself (N/A, N/A). Most long-A pAgos have an intact catalytic site in the PIWI domain, which allows them to cleave their targets (N/A). As such, they act as an innate immune system that clears plasmid and phage DNA from the cell (N/A, N/A, N/A, N/A, N/A).
Within the long-A pAgo clade, various subclades of pAgos exist that rely on distinct functional mechanisms. For example, various long-A pAgos can (additionally) use guide RNAs and/or cleave RNA targets. Furthermore, CRISPR-associated pAgos use 5'-OH guide RNAs to target DNA (N/A), and PliAgo-like pAgos use small DNA guides to target RNA (N/A). Certain long-A pAgos genetically co-localize with other putative enzymes including (but not limited to) putative nucleases, helicases, DNA-binding proteins, or PLD-like proteins (N/A, N/A). The relevance of these associations is currently unknown.
Long-B pAgos
Akin to long-A pAgos, long-B pAgos have an N-L1-PAZ-L2-MID-PIWI domain composition, but most have a shorter PAZ* domain, and in contrast to long-A pAgos, all long-B pAgos are catalytically inactive (N/A). Long-B pAgos characterized to date use guide RNAs to bind invading DNA (N/A, N/A, N/A). In the absence of co-encoded proteins, long-B pAgos repress invader activity (N/A). In addition, most long-B pAgos are co-encoded with effector proteins including (but not limited to) SIR2, nucleases, membrane proteins, and restriction endonucleases (N/A, N/A, N/A, N/A). These effector proteins are activated upon pAgo-mediated invader detection, and generally catalyze reactions that result in cell death (N/A). As such, long-B pAgos together with their associated proteins mediate abortive infection.
Short pAgos
Short pAgos are truncated: they only contain the MID and PIWI domains essential for guide-mediated target binding (N/A). They are catalytically inactive and are co-encoded with an APAZ domain that is fused to one of various effector domains. In short pAgo systems characterized to date, the short pAgo and the APAZ domain-containing protein form a heterodimeric complex (N/A, N/A). Within this complex, the short pAgo uses a guide RNA to bind complementary target DNAs. This triggers catalytic activation of the effector domain fused to the APAZ domain, generally resulting in cell death (N/A, N/A). As such, short pAgo systems mediate abortive infection.
Based on their phylogeny, short pAgos are subdivided into the S1A, S1B, S2A, and S2B clades (N/A, N/A). In clade S1A and S1B (SPARSA) systems, APAZ is fused to a SIR2 domain. In clade S2A (SPARTA) systems, APAZ is fused to a TIR domain. Both SPARSA and SPARTA systems trigger cell death by depletion of NAD(P)+ (N/A, N/A). In S2B clade systems, APAZ is fused to one or more effector domains, including Mrr-like, DUF4365, RecG/DHS-like, and other domains. In all clade S1A SPARSA systems, and in certain systems within other clades, the effector-APAZ is fused to the short pAgo.
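The clade-to-effector relationships described above can be summarized as a small lookup table. This is only a sketch that restates the text; the dictionary layout and the helper `describe_clade` are hypothetical, and the S2B entry lists only the effector domains named above (others exist).

```python
# Short pAgo clades and the effector domain(s) fused to APAZ,
# as summarized in the text. Layout and names are illustrative.
SHORT_PAGO_CLADES = {
    "S1A": {"system": "SPARSA", "effector_domains": ("SIR2",)},
    "S1B": {"system": "SPARSA", "effector_domains": ("SIR2",)},
    "S2A": {"system": "SPARTA", "effector_domains": ("TIR",)},
    # S2B has no single system name in the text; the list is not exhaustive.
    "S2B": {
        "system": None,
        "effector_domains": ("Mrr-like", "DUF4365", "RecG/DHS-like"),
    },
}


def describe_clade(clade):
    """Return a one-line summary for a short pAgo clade."""
    entry = SHORT_PAGO_CLADES[clade]
    domains = ", ".join(entry["effector_domains"])
    name = entry["system"] or "no common system name"
    return f"{clade} ({name}): APAZ fused to {domains}"


summary = describe_clade("S2A")
```

For example, `describe_clade("S2A")` yields a summary naming SPARTA and the TIR domain.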
Pseudo-short pAgos
Akin to short pAgos, pseudo-short pAgos are composed of the MID and PIWI domains only (N/A). However, they do not phylogenetically cluster with canonical short pAgos and do not colocalize with effector-APAZ proteins. Instead, certain pseudo-short pAgos are found across the long-A and long-B pAgo clades (e.g. Archaeoglobus fulgidus pAgo, a truncated long-B pAgo (N/A, N/A)), while others form a distinct branch in the phylogenetic pAgo tree (see SiAgo-like pAgos below).
SiAgo-like pAgos
SiAgo-like pAgos are pseudo-short pAgos that form a separate branch in the phylogenetic tree of pAgos. They are named after the type system from Sulfolobus islandicus (N/A). SiAgo is composed of MID and PIWI domains and is co-encoded with the Ago-associated proteins Aga1 and Aga2. SiAgo and Aga1 form a cytoplasmic heterodimeric complex. While it is currently unknown what guide/target types activate the SiAgo/Aga1 complex, the complex is directed toward membrane-localized Aga2 upon viral infection. This triggers Aga2-mediated membrane depolarization and causes cell death (N/A).
Example of genomic structure
A total of 6 subsystems have been described for the pAgo system. Here are some examples found in the RefSeq database:
The pAgo_LongA system in Halosimplex pelagicum (GCF_013415905.1, NZ_CP058909) is composed of 1 protein: pAgo_LongA (WP_179918860.1)
The pAgo_LongB system in Serratia fonticola (GCF_019252525.1, NZ_CP072742) is composed of 2 proteins: pAgo_LongB (WP_218520044.1), EcAgaN (WP_235784821.1)
The pAgo_S1A system in Parabacteroides merdae (GCF_020735605.1, NZ_CP085927) is composed of 2 proteins: pAgo_S1A (WP_227945673.1), pAgo_S1A (WP_227945674.1)
The pAgo_S1B system in Comamonas flocculans (GCF_007954405.1, NZ_CP042344) is composed of 2 proteins: SIR2APAZ (WP_146914209.1), pAgo_S1B (WP_146913473.1)
The pAgo_S2B system in Granulicella tundricola (GCF_000178975.2, NC_015064) is composed of 2 proteins: XAPAZ (WP_013581437.1), pAgo_S2B (WP_013581438.1)
The pAgo_SPARTA system in Roseivivax sp. THAF30 (GCF_009363575.1, NZ_CP045389) is composed of 2 proteins: TIRAPAZ (WP_152461295.1), pAgo_SPARTA (WP_152461296.1)
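The example entries above follow a regular pattern (subsystem, organism, assembly and replicon accessions, protein list), so they can be turned into structured records for downstream use. The following is a minimal sketch using only the Python standard library; the regular expressions and the function `parse_system_line` are assumptions written for this page's wording, not part of any published tool, and may need adjusting for other entries.

```python
import re

# Pattern for one example line; tolerant of an optional colon after "proteins".
LINE_RE = re.compile(
    r"The (?P<subsystem>\S+) system in (?P<organism>.+?) "
    r"\((?P<assembly>GCF_[\d.]+), (?P<replicon>[A-Z_]+\d+)\) "
    r"is composed of (?P<n>\d+) proteins?:?\s*(?P<proteins>.*)"
)
# Each protein is written as "name (WP_accession)".
PROTEIN_RE = re.compile(r"(\w+) \((WP_[\d.]+)\)")


def parse_system_line(line):
    """Parse one example line into a dict; raise ValueError on mismatch."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    record = match.groupdict()
    record["n"] = int(record["n"])
    # List of (gene name, protein accession) tuples, in order of appearance.
    record["proteins"] = PROTEIN_RE.findall(record["proteins"])
    if len(record["proteins"]) != record["n"]:
        raise ValueError("protein count does not match the stated number")
    return record


example = (
    "The pAgo_SPARTA system in Roseivivax sp. THAF30 "
    "(GCF_009363575.1, NZ_CP045389) is composed of 2 proteins: "
    "TIRAPAZ (WP_152461295.1), pAgo_SPARTA (WP_152461296.1)"
)
record = parse_system_line(example)
```

The consistency check on the protein count is a deliberate choice: it flags lines where the listed proteins and the stated number disagree instead of silently returning a partial record.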
Distribution of the system among prokaryotes
Structure
Group | Structure | Foldseek | System | Gene name | Subtype | Proteins in structure | System genes | Prediction type | N genes in sys | pLDDT | iptm+ptm | pDockQ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
No data available |