CRISPR-Cas

For an introduction to the CRISPR-Cas system, the Wikipedia page is a good place to start.

Example of genomic structure

CRISPR-Cas systems have been classified into 6 different families (doi:10.1038/s41579-019-0299-x). Each family is divided into subtypes. For example, Type I CRISPR-Cas comprises 7 subtypes, I-A to I-G.

Here is an example of each of the 6 families found in the RefSeq database:

The CAS_Class1-Subtype-I-E system in Citrobacter sp. RHBSTW-00017 (GCF_013797615.1, NZ_CP056899) is composed of 8 proteins: cas3_I_5 (WP_103284157.1), cas8e_I-E_1 (HV037_RS05730), cse2gr11_I-E_2 (HV037_RS05735), cas7_I-E_2 (HV037_RS05740), cas5_I-E_3 (HV037_RS05745), cas6e_I_II_III_IV_V_VI_1 (HV037_RS05750), cas1_I-E_1 (HV037_RS05755), cas2_I-E_2 (HV037_RS05760).

The CAS_Class2-Subtype-II-A system in Streptococcus agalactiae (GCF_001190885.1, NZ_CP011329) is composed of 4 proteins: cas9_II-A_II-B_II-C_3 (SAH002_RS04760), cas1_I_II_III_IV_V_VI_5 (SAH002_RS04765), cas2_I_II_III_IV_V_VI_6 (SAH002_RS04770), csn2_II-A_4 (SAH002_RS04775).

The CAS_Class1-Subtype-III-A system in Mycobacterium tuberculosis (GCF_014900005.1, NZ_CP041828) is composed of 9 proteins: cas2_I_II_III_IV_V_VI_5 (FPJ80_RS14760), cas1_I_II_III_IV_V_VI_8 (FPJ80_RS14765), csm6_III_2 (FPJ80_RS14770), csm5gr7_III-A_3 (FPJ80_RS14775), csm4gr5_III-A_3 (FPJ80_RS14780), csm3gr7_III-A_1 (FPJ80_RS14785), csm2gr11_III-A_1 (FPJ80_RS14790), cas10_III_7 (FPJ80_RS14795), cas6_I_II_III_IV_V_VI_15 (FPJ80_RS14800).

The CAS_Class1-Subtype-IV-A system in Shigella flexneri (GCF_022353685.1, NZ_CP054978) is composed of 5 proteins: csf1gr8_IV-A_3 (WP_038989757.1), cas6e_I_II_III_IV_V_VI_3 (WP_038989755.1), csf4_IV-A_1 (WP_016947078.1), csf3gr5_IV-A_1 (WP_004181864.1), csf2gr7_IV-A_1 (WP_029505552.1).

The CAS_Class2-Subtype-V-A system in Francisella tularensis (GCF_001865695.1, NZ_CP016635) is composed of 4 proteins: cas2_I_II_III_IV_V_VI_3 (N894_RS07580), cas1_I_II_III_IV_V_VI_1 (N894_RS07585), cas4_V_1 (N894_RS07590), cas12a_V-A_4 (N894_RS07595).

The CAS_Class2-Subtype-VI-A system in Leptotrichia shahii (GCF_008327825.1, NZ_AP019827) is composed of 3 proteins: cas13a_VI-A_1 (F1564_RS00570), cas1_I_II_III_IV_V_VI_5 (F1564_RS00575), cas2_I_II_III_IV_V_VI_11 (F1564_RS00580).
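Each example above pairs a gene model name with a protein or locus identifier in parentheses. The following minimal Python sketch shows one way to split such a description into (gene, identifier) pairs. The regular expression and the function name `parse_system` are illustrative assumptions, not part of any published tool; the input string is copied from the Type VI-A example.

```python
import re

# Gene/identifier pairs copied from the Type VI-A example above.
description = (
    "cas13a_VI-A_1 (F1564_RS00570) "
    "cas1_I_II_III_IV_V_VI_5 (F1564_RS00575) "
    "cas2_I_II_III_IV_V_VI_11 (F1564_RS00580)"
)

# Hypothetical pattern: a run of non-space characters (the gene model name)
# followed by an identifier enclosed in parentheses.
ENTRY = re.compile(r"(\S+)\s+\(([^)]+)\)")


def parse_system(text):
    """Return a list of (gene_name, identifier) tuples found in text."""
    return ENTRY.findall(text)


genes = parse_system(description)
for gene, identifier in genes:
    print(f"{gene}\t{identifier}")
```

The gene count returned by this sketch can be compared with the number of proteins stated in each description as a quick consistency check.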

Distribution of the system among prokaryotes

Among the 22,803 complete genomes of RefSeq, the CRISPR-Cas system is detected in 8,581 genomes (37.63 %).

The system was detected in 2,905 different species.
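The proportion reported above follows directly from the two genome counts. As a minimal check, using only the figures stated in the text (the variable names are illustrative):

```python
# Counts taken from the text above.
total_genomes = 22803        # complete RefSeq genomes
genomes_with_system = 8581   # genomes in which the system is detected

# Share of genomes encoding the system, as a percentage.
percentage = 100 * genomes_with_system / total_genomes
print(f"{percentage:.2f} %")
```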

Proportion of genomes encoding the CRISPR-Cas system for the 14 phyla with more than 50 genomes in the RefSeq database.

References

10.1038/s41579-019-0299-x