The process of turning biological input data into output information and knowledge falls within the scope of bioinformatics.

Interests & Ongoing Projects

Topics | Projects
Comparative genomics | Riceome; RGI: a comprehensive pan-genome database for comparative and functional genomics of Asian rice
Genome sequencing & assembly | Rice: Rice Information Gateway; Soybean (JD17, Jack): Genome JBrowse; Safflower: The Genome Database of Carthamus tinctorius; Fungi, etc.
Software development | Genome Puzzle Master; Dynamic Elongation of a Genome Assembly Path

You are welcome to join us or collaborate with us!

Achievements of Collaboration

1. Input: 3 SMRT cells of CLR data and 1 SMRT cell of HiFi data
Output: A complete reference genome for the soybean cv. Jack
Significance:
We assembled a gap-free T2T soybean genome and provided details of heterochromatic regions of all 20 chromosomes. The availability of this complete genome of a transformation-favored cultivar allowed us to uncover 337 genes that were previously unresolved, and will be a valuable resource to investigate current and emerging agricultural issues.
2. Input: 16 high-quality genomes, 18 annotations and transcriptome data for 16 Asian rice accessions
Output: Rice Gene Index
Significance:
To integrate the genomic information of the rice pan-genome, we built the first gene-based pan-genome database, the Rice Gene Index (RGI) platform, from 16 platinum-standard reference genomes and supplementary transcriptome data. RGI focuses on gene-level relationships across representative Asian rice accessions, establishes a standardized gene index for Asian rice, and provides rich search and visualization capabilities for the whole rice research community.
3. Input: 75 high-quality genomes, annotations and transcriptome data + re-sequencing data for O. sativa and wild relatives
Output: Pan-genome inversion index of Asian rice
Significance:
Understanding and exploiting genetic diversity is key to productive and stable rice production. Here we utilize 73 high-quality genomes that encompass the subpopulation structure of Asian rice (O. sativa), plus the genomes of two wild relatives (O. rufipogon and O. punctata), to build a pan-genome inversion index of 1,769 non-redundant inversions that span an average of ~29% of the O. sativa cv. Nipponbare reference genome sequence. Using this index, we estimate an inversion rate of ~700 inversions per million years in Asian rice, which is 16 to 50 times higher than previously estimated for plants. Detailed analyses of these inversions show evidence of their effects on gene expression, recombination rate, and linkage disequilibrium. Our study uncovers the prevalence and scale of large inversions (≥100 bp) across the pan-genome of Asian rice and hints at their largely unexplored role in functional biology and crop performance.
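4. Input: Data generated for ZS97 (23x HiFi, 131x CLR) and MH63 (103x HiFi, 132x CLR) with PacBio SMRT technology…
Output: Two gap-free reference genomes of rice
Significance:
We report the first two gap-free reference genomes of the O. sativa xian/indica rice varieties ZS97 and MH63, which are used as a model system for studying heterosis and yield, and show that the architectures of functional centromeres vary extensively at the sequence level. These genomes will set a new standard for reference genome resources in plant biology. The new references provide a clear picture of the primary sequence architecture of the xian/indica rice genomes that feed the world, and could help in the breeding of climate-resilient varieties.
Rice is one of the world's major staple foods and a model system for plant genomics and breeding; nearly 20 years ago it became the first crop to have its genome sequenced. However, every formally published reference genome of a higher organism to date contains gaps or missing sequence. At the end of 2020 we were the first to release gap-free reference genome sequences of two xian/indica rice varieties as a bioRxiv preprint, filling a gap in genomics worldwide. The work formally reported in Molecular Plant is the first gap-free reference genome in plants. It offers an opportunity to fully resolve the structure and function of rice centromeres, advances our understanding of plant genome structure and function, and will have a long-term and lasting impact on the use of genomic breeding to develop climate-resilient varieties for 21st-century agriculture.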
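5. Input: Data generated by single-molecule real-time sequencing and Hi-C mapping technologies
Output: The chromosome-scale reference genome of safflower
Significance:
Safflower (Carthamus tinctorius), a member of the Asteraceae family, has attracted wide attention from food and traditional Chinese medicine researchers because of its high linoleic acid and flavonoid content. Linoleic acid, an unsaturated fatty acid, helps lower blood lipids and prevent atherosclerosis; it makes up as much as 80% of the total oil content of safflower seeds, earning safflower the title "king of linoleic acid". Safflower flowers are also rich in flavonoids, both in variety and in content, especially hydroxysafflor yellow A, which relieves inflammation, has antitumor activity, and promotes blood circulation and resolves blood stasis. The high-quality safflower reference genome provides a "map" for dissecting the molecular regulatory mechanisms of linoleic acid and flavonoids, and will help accelerate the improvement of safflower's medicinal and food value and other agronomic traits.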
6. Input: PacBio and Illumina sequencing data and Bionano optical maps for 12 cultivated Asian rice accessions
Output: A platinum standard pan-genome resource
Significance:
As the human population grows from 7.8 billion to 10 billion over the next 30 years, breeders must do everything possible to create crops that are highly productive and nutritious, while simultaneously having less of an environmental footprint. Rice will play a critical role in meeting this demand and thus, knowledge of the full repertoire of genetic diversity that exists in germplasm banks across the globe is required. To meet this demand, we describe the generation, validation and preliminary analyses of transposable element and long-range structural variation content of 12 near-gap-free reference genome sequences (RefSeqs) from representatives of 12 of 15 subpopulations of cultivated Asian rice. When combined with 4 existing RefSeqs, that represent the 3 remaining rice subpopulations and the largest admixed population, this collection of 16 Platinum Standard RefSeqs (PSRefSeq) can be used as a template to map resequencing data to detect virtually all standing natural variation that exists in the pan-genome of cultivated Asian rice.
7. Input: Map-based reference genomes of indica rice and multi-omics data
Output: An integrative and comprehensive platform: Rice Information GateWay
Significance:
Rice Information GateWay (RIGW) provides genomic, transcriptomic, protein–protein interaction (PPI), metabolic network, and metabolite data, together with computational tools. RIGW serves the rice community by making a wealth of genomics and other omics data available through an intuitive web-based interface.
8. Input: PacBio long-read and Illumina paired-end sequencing data
Output: The reference genomes of two elite indica rice varieties
Significance:
Indica rice accounts for >70% of total rice production worldwide, is genetically highly diverse, and can be divided into two major varietal groups independently bred and widely cultivated in China and Southeast Asia. Here, we generated high-quality genome sequences for two elite rice varieties, Zhenshan 97 and Minghui 63, representing the two groups of indica rice and the parents of a leading rice hybrid. Comparative analyses uncovered extensive structural differences between the two genomes and complementarity in their hybrid transcriptome. These findings have general implications for understanding intraspecific variations of organisms with complex genomes. The availability of the two genomes will serve as a foundation for future genome-based explorations in rice toward both basic and applied goals.
9. Input: Interest, experience, time …
Output: Genome Puzzle Master
Significance:
Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool—Genome Puzzle Master (GPM)—that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules.
The more you know, the more you know you don’t know.
Back: GUAN/Enlai, TONG/Jiazhou, XIE/Wenzhao, CHENG/Rundong, YU/Zhichao, ZHAO/Wen, LI/Siyuan, ZHANG/Jianwei
Front: ZHANG/Yulu, HUANG/Yicheng, ZHANG/Wenhui, LUO/Rongfei, WANG/Huan, LI/Shanying, LIU/Jian, XIA/Dandan, CHEN/Ying
(January 7, 2024)