ESM-2 |
esm2_t36_3B_UR50D(), esm2_t48_15B_UR50D() |
UR50 (sample UR90) |
SOTA general-purpose protein language model. Can be used to predict structure, function and other protein properties directly from individual sequences. Released with Lin et al. 2022 (Aug 2022 update). |
ESMFold |
esmfold_v1() |
PDB + UR50 |
End-to-end single-sequence 3D structure predictor (Nov 2022 update). |
ESM-MSA-1b |
esm_msa1b_t12_100M_UR50S() |
UR50 + MSA |
MSA Transformer language model. Can be used to extract embeddings from an MSA. Enables SOTA inference of structure. Released with Rao et al. 2021 (ICML'21 version, June 2021). |
ESM-1v |
esm1v_t33_650M_UR90S_1() ... esm1v_t33_650M_UR90S_5() |
UR90 |
Language model specialized for prediction of variant effects. Enables SOTA zero-shot prediction of the functional effects of sequence variations. Same architecture as ESM-1b, but trained on UniRef90. Released with Meier et al. 2021. |
ESM-IF1 |
esm_if1_gvp4_t16_142M_UR50() |
CATH + UR50 |
Inverse folding model. Can be used to design sequences for given structures, or to predict functional effects of sequence variation for given structures. Enables SOTA fixed-backbone sequence design. Released with Hsu et al. 2022. |
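
The loader names in the table are functions under `esm.pretrained`. As a minimal sketch of how the table might be used from code, the snippet below maps each shorthand to its listed loader names and calls the chosen one. The `LOADERS` dict and the `loader_name` and `load` helpers are hypothetical names introduced for illustration, not part of the library. The sketch assumes the `fair-esm` package is installed, and for ESM-1v it lists only the two endpoints the table spells out.

```python
# Loader names copied from the table above. For ESM-1v only the two
# endpoints named in the table are listed; the "..." range is not expanded.
LOADERS = {
    "ESM-2": ("esm2_t36_3B_UR50D", "esm2_t48_15B_UR50D"),
    "ESMFold": ("esmfold_v1",),
    "ESM-MSA-1b": ("esm_msa1b_t12_100M_UR50S",),
    "ESM-1v": ("esm1v_t33_650M_UR90S_1", "esm1v_t33_650M_UR90S_5"),
    "ESM-IF1": ("esm_if1_gvp4_t16_142M_UR50",),
}


def loader_name(shorthand: str, index: int = 0) -> str:
    """Hypothetical helper: map a table shorthand to one of its loader names."""
    return LOADERS[shorthand][index]


def load(shorthand: str, index: int = 0):
    """Hypothetical helper: call the matching esm.pretrained loader.

    esm is imported lazily so the name lookup works without the package.
    Calling this downloads the model weights on first use.
    """
    import esm  # assumption: the fair-esm package is installed

    return getattr(esm.pretrained, loader_name(shorthand, index))()
```

For example, `model, alphabet = load("ESM-2")` would fetch the 3B-parameter ESM-2 model, which is a large download. Return values are passed through unchanged, so check what each loader returns before unpacking: most return a `(model, alphabet)` pair, while `esmfold_v1()` returns the model alone and may need additional dependencies.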