bioRxiv & bioRxiv | Latest Advances in CryoSeek: Teams Led by Nieng Yan and Mingxu Hu Decipher Novel Glycofibril Structures and Propose a New Algorithm for Absolute Hand Determination
2025-10-04

Carbohydrates play crucial roles in life processes such as energy supply, cellular signal transduction, and immune recognition, and these diverse functions depend on their distinctive three-dimensional structures. However, compared with nucleic acids and proteins, research on carbohydrate structure and function lags behind. The main reasons are the diversity of monosaccharide types, the presence of multiple chiral centers, the complex and variable glycosidic linkages formed when monosaccharides assemble into glycans, and high structural flexibility. Together, these factors make high-resolution 3D structure determination extremely difficult and severely limit our understanding of carbohydrate structure and function.


In recent years, the team led by Dr. Nieng Yan has proposed the "CryoSeek" research strategy, which uses cryo-EM as a discovery tool for unknown bio-entities without prior knowledge. Because the structure comes first and the biology is inferred from it, CryoSeek may represent a form of "forward structural biology": a structure-led paradigm for biological discovery. In previous studies, the Nieng Yan team combined cryo-EM, AI-assisted automated modeling, and bioinformatics analysis to report a fibrillar protein structure, TLP-1, from environmental samples of the lotus pond at Tsinghua University, and speculated on its origin and potential functions. Subsequent studies identified a novel glycofibril structure, TLP-4, whose core is a linear polypeptide composed of tetrapeptide repeats surrounded by a thick layer of sugar chains. Each tetrapeptide repeat contains a conserved 3,4-dihydroxyproline (DiHyp) whose 3-OH and 4-OH groups are both highly O-glycosylated. Adjacent to DiHyp there is also a conserved serine or threonine modified by O-glycosylation. These results not only reveal the important role of glycans in the structural assembly of biological macromolecules but also suggest new approaches to the discovery and structure determination of natural carbohydrates [1, 2, 3].


However, there is a fundamental challenge when using cryo-EM to resolve structures: cryo-EM imaging loses the "absolute hand" information of the structure, so the final reconstructed 3D density map has two possible versions that are mirror images of each other. For proteins, scientists can correct the hand using the prior knowledge that α-helices are right-handed. But glycofibrils are composed mainly of carbohydrates, contain no α-helices, and both D-type and L-type carbohydrates exist in nature. Therefore, even at atomic resolution, the absolute hand cannot be determined directly. This uncertainty is like "losing the north-south direction on a map," making it difficult to construct correct atomic models.
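The loss of hand information can be illustrated with a small numerical sketch. It is a toy model rather than anything taken from the preprints: a random point cloud stands in for a chiral object, and `random_rotation` and `project` are hypothetical helper names. The idea shown is that for every orientation of the object there is an orientation of its mirror image that yields exactly the same 2D projection, so projections alone cannot distinguish the two.

```python
import numpy as np

def random_rotation(rng):
    # QR decomposition of a Gaussian matrix yields an orthogonal matrix;
    # flip one column if needed so that det = +1 (a proper rotation).
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

def project(points, rotation):
    # Rotate the (N, 3) point cloud, then drop z: projection along the beam.
    return (points @ rotation.T)[:, :2]

rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 3))        # stand-in for a chiral object
mirror = np.diag([1.0, 1.0, -1.0])       # reflection z -> -z
mirrored_cloud = cloud @ mirror          # the mirror-image object

rot = random_rotation(rng)
rot_for_mirror = mirror @ rot @ mirror   # still a proper rotation (det = +1)

# The mirror image, viewed in the matching orientation, gives the same image.
same = np.allclose(project(cloud, rot), project(mirrored_cloud, rot_for_mirror))
print(same)
```

Because the set of all projections of an object and of its mirror image coincide, a reconstruction from untilted single-particle images is equally consistent with either hand; extra information, such as a known tilt geometry, is needed to break the tie.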


Previously, the classic solution to this problem was "tilt-pair" imaging, which requires prior knowledge of a reference structure for the sample and paired tilted images of the same region. For highly heterogeneous samples obtained directly from environments such as natural water bodies, which may contain hundreds of different glycofibril structures, this traditional method is almost inapplicable.


On October 1, 2025, teams led by Dr. Nieng Yan and Dr. Mingxu Hu, together with their collaborators, posted two new manuscripts on the preprint server bioRxiv: "CryoSeek identification of glycofibrils with diverse compositions and structural assemblies" and "Absolute hand determination of glycofibrils from natural sources in cryo-EM."




Continuing the "Moonlight Over the Tsinghua Lotus Pond" research program, the team resolved five other novel glycofibril structures in subsequent data processing, revealing the diversity of glycofibril structures in natural environments and the key role of carbohydrates in the structural assembly process. At the same time, to solve the problem of determining the absolute hand of glycofibrils, the research teams jointly developed a new method named Ahaha. This method only requires conventional cryo-EM imaging under single-angle tilting to efficiently and accurately determine the absolute hand of naturally derived glycofibrils. Currently, an online service for this method has been launched.  



Figure 1, Ahaha online service (https://cryoseek.org/ahaha)



Figure 2, Identification of multiple fibrous structures in the Tsinghua Lotus Pond using the CryoSeek research strategy.


The five newly resolved glycofibrils are named TLP-IPT, TLP-12, TLP-3, TLP-2, and TLP-0. The prefix "TLP" stands for "Tsinghua Lotus Pond," and the suffix reflects the characteristics of the protein core in each glycofibril. TLP-IPT has a recognizable protein core composed of consecutive IPT (Ig-like, plexins, transcription factors) domains, each surrounded by 13 sugar chains. TLP-12 is formed by three strands of highly repetitive dodecapeptides woven into a triple parallel β-sheet belt, with two columns of helical glycan ridges covering the outside. TLP-3 and TLP-2 both have linear polypeptide cores wrapped in a thick layer of sugar chains: TLP-3 has a thin protein core formed by three twisted strands of tripeptide repeats, while TLP-2 consists of a single strand of dipeptide repeats. TLP-0 is composed entirely of carbohydrates, and the "0" indicates that it contains no protein component. These studies not only reveal the structural and compositional diversity of glycofibrils in natural environments but also indicate that, within the CryoSeek framework, combining multidisciplinary methods may open a path toward high-throughput, systematic study of carbohydrate sequences and the 3D structures of natural carbohydrates.



Figure 3, Schematic illustration of the principle by which Ahaha determines chirality.


To solve the absolute hand problem of glycofibrils, the research team proposed the Ahaha algorithm. Its principle is as follows: two helical structures that are mirror images of each other have different projections at the same tilt angle but identical projections at opposite tilt angles. It follows that, when fitted to the same images, reconstructions of opposite absolute hand are assigned opposite tilts. Since helical glycofibrils must lie flat on the sample plane, their tilt must match that of the sample itself. By checking whether the particle tilts obtained in high-resolution helical reconstruction are consistent with the tilt of the sample, one can determine whether the absolute hand of the reconstructed density map is correct. Using Ahaha, the team determined the absolute hand of four glycofibrils found in a natural freshwater sample (stalagmite dripping water in a karst cave) and built accurate atomic models accordingly. This marks a substantial step forward in structural research on carbohydrates, the "dark matter" of life. The method is available as an online service (https://cryoseek.org/ahaha), offers researchers in related fields a powerful tool, and is expected to accelerate discoveries in glycobiology.
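The principle can be sketched numerically. The code below is a toy illustration under stated assumptions, not the Ahaha implementation: the helix is an idealized point model compared in a fixed in-plane frame, all function names are hypothetical, and the sign-vote in `hand_matches_stage` is a simplified stand-in for whatever statistic the actual method uses. `expected_fibril_tilt_deg` uses plain geometry: a fibril lying in a grid plane tilted by `stage_tilt_deg`, at an in-plane angle measured from the tilt axis, rises out of the image plane by arcsin(sin(stage tilt) × sin(in-plane angle)).

```python
import numpy as np

def helix_points(handedness, radius=1.0, pitch=3.0, turns=4, n=400):
    """Points on a helix along x; handedness +1 and -1 are mirror images."""
    t = np.linspace(0.0, 2.0 * np.pi * turns, n)
    return np.column_stack([pitch * t / (2.0 * np.pi),
                            radius * np.cos(t),
                            handedness * radius * np.sin(t)])

def project_with_tilt(points, tilt_deg):
    """Tilt the helix axis out of the image plane (rotation about y), project along z."""
    a = np.deg2rad(tilt_deg)
    rot_y = np.array([[np.cos(a), 0.0, np.sin(a)],
                      [0.0, 1.0, 0.0],
                      [-np.sin(a), 0.0, np.cos(a)]])
    return (points @ rot_y.T)[:, :2]

def expected_fibril_tilt_deg(stage_tilt_deg, in_plane_angle_deg):
    """Out-of-plane tilt of a fibril lying flat on a grid tilted by stage_tilt_deg.
    in_plane_angle_deg is measured from the tilt axis within the grid plane."""
    s = np.sin(np.deg2rad(stage_tilt_deg)) * np.sin(np.deg2rad(in_plane_angle_deg))
    return np.rad2deg(np.arcsin(s))

def hand_matches_stage(refined_tilts_deg, expected_tilts_deg):
    """Hypothetical decision rule: the hand is taken as correct when refined
    per-particle tilts agree in sign, on aggregate, with the expected tilts."""
    return float(np.dot(refined_tilts_deg, expected_tilts_deg)) > 0.0

helix, mirror_helix = helix_points(+1), helix_points(-1)
same_at_opposite_tilt = np.allclose(project_with_tilt(helix, 30.0),
                                    project_with_tilt(mirror_helix, -30.0))
differ_at_same_tilt = not np.allclose(project_with_tilt(helix, 30.0),
                                      project_with_tilt(mirror_helix, 30.0))

# Simulated refinement output: a wrong-hand map yields sign-flipped tilts.
rng = np.random.default_rng(0)
in_plane = rng.uniform(0.0, 360.0, size=500)
expected = expected_fibril_tilt_deg(30.0, in_plane)
noise = rng.normal(scale=5.0, size=expected.size)
correct_hand = hand_matches_stage(expected + noise, expected)
wrong_hand = hand_matches_stage(-expected + noise, expected)
```

In this sketch the mirror helix reproduces the original projection only at the opposite tilt, so a reconstruction with the wrong hand ends up with tilts opposite to those dictated by the stage, which is the inconsistency the vote detects. Real data would additionally involve in-plane alignment, filament polarity, and noise in the images themselves, none of which is modeled here.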



Figure 4, Determination of the chirality of four naturally derived polysaccharide fibers using Ahaha and construction of their atomic models.


Dr. Nieng Yan (Shenzhen Medical Academy of Research and Translation & Shenzhen Bay Laboratory), Dr. Zhangqiang Li, and Dr. Tongtong Wang (School of Life Sciences, Tsinghua University) served as corresponding authors of the first preprint. Dr. Zhangqiang Li, Dr. Tongtong Wang, and Yitong Sun were co–first authors. The work also benefited from important contributions by Dr. Kui Xu, Dr. Wenze Huang, Dr. Qiangfeng Zhang, Dr. Chuangye Yan, and Dr. Mingxu Hu.


The second preprint was jointly led by Dr. Mingxu Hu (Shenzhen Medical Academy of Research and Translation), Dr. Jiawei Wang (Tsinghua University), and Dr. Nieng Yan, with Dr. Qi Zhang as the first author. Lanjv Qin, Dr. Tongtong Wang, Dr. Zhangqiang Li, Yilin Zhang, and Dr. Sheng Chen also made significant contributions.


These studies were supported by the Shenzhen Medical Academy of Research and Translation, the National Natural Science Foundation of China, and the Beijing Frontier Research Center for Biological Structure.


References:  

[1] Wang, T., Li, Z., Xu, K., Huang, W., Huang, G., Zhang, Q. C., & Yan, N. (2024). CryoSeek: A strategy for bioentity discovery using cryoelectron microscopy. Proceedings of the National Academy of Sciences, 121(42), e2417046121.  

[2] Wang, T., Huang, W., Xu, K., Sun, Y., Zhang, Q. C., Yan, C., Li, Z., & Yan, N. (2025). CryoSeek II: Cryo-EM analysis of glycofibrils from freshwater reveals well-structured glycans coating linear tetrapeptide repeats. Proceedings of the National Academy of Sciences, 122(1), e2423943122.

[3] Wang, T., Sun, Y., Li, Z., & Yan, N. (2024). The 8-nm spaghetti: well-structured glycans coating linear tetrapeptide repeats discovered from freshwater with CryoSeek. bioRxiv, 2024-12.