NobleBlocks

Agricultural Genomics Institute at Shenzhen

Facility · Shenzhen, China

Research output, citation impact, and the most-cited recent papers from Agricultural Genomics Institute at Shenzhen (China). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works: 4.9K
Citations: 355.0K
h-index: 241
i10-index: 5.4K
Also known as: Agricultural Genomics Institute at Shenzhen; 中国农业科学院深圳农业基因组研究所 (Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences)

Top-cited papers from Agricultural Genomics Institute at Shenzhen

Genomic variation in 3,010 diverse accessions of Asian cultivated rice
Wensheng Wang, Ramil Mauleon, Zhiqiang Hu, Dmytro Chebotarov +4 more
2018 · Nature · 1.9K citations · doi:10.1038/s41586-018-0063-9

Here we analyse genetic variation, population structure and diversity among 3,010 diverse Asian cultivated rice (Oryza sativa L.) genomes from the 3,000 Rice Genomes Project. Our results are consistent with the five major groups previously recognized, but also suggest several unreported subpopulations that correlate with geographic location. We identified 29 million single nucleotide polymorphisms, 2.4 million small indels and over 90,000 structural variations that contribute to within- and between-population variation. Using pan-genome analyses, we identified more than 10,000 novel full-length protein-coding genes and a high number of presence-absence variations. The complex patterns of introgression observed in domestication genes are consistent with multiple independent rice domestication events. The public availability of data from the 3,000 Rice Genomes Project provides a resource for rice genomics research and breeding.

A chemical genetic roadmap to improved tomato flavor
Denise M. Tieman, Guangtao Zhu, Márcio F. R. Resende, Tao Lin +4 more
2017 · Science · 827 citations · doi:10.1126/science.aal1556

Modern commercial tomato varieties are substantially less flavorful than heirloom varieties. To understand and ultimately correct this deficiency, we quantified flavor-associated chemicals in 398 modern, heirloom, and wild accessions. A subset of these accessions was evaluated in consumer panels, identifying the chemicals that made the most important contributions to flavor and consumer liking. We found that modern commercial varieties contain significantly lower amounts of many of these important flavor chemicals than older varieties. Whole-genome sequencing and a genome-wide association study permitted identification of genetic loci that affect most of the target flavor chemicals, including sugars, acids, and volatiles. Together, these results provide an understanding of the flavor deficiencies in modern commercial varieties and the information necessary for the recovery of good flavor through molecular breeding.

Cytosine base editor generates substantial off-target single-nucleotide variants in mouse embryos
Erwei Zuo, Yidi Sun, Wei Wu, Tanglong Yuan +4 more
2019 · Science · 793 citations · doi:10.1126/science.aav9973

Spotting off-targets from gene editing. Unintended genomic modifications limit the potential therapeutic use of gene-editing tools. Available methods to find off-targets generally do not work in vivo or detect single-nucleotide changes. Three papers in this issue report new methods for monitoring gene-editing tools in vivo (see the Perspective by Kempton and Qi). Wienert et al. followed the recruitment of a DNA repair protein to DNA breaks induced by CRISPR-Cas9, enabling unbiased detection of off-target editing in cellular and animal models. Zuo et al. identified off-targets without the interference of natural genetic heterogeneity by injecting base editors into one blastomere of a two-cell mouse embryo and leaving the other genetically identical blastomere unedited. Jin et al. performed whole-genome sequencing on individual, genome-edited rice plants to identify unintended mutations. Cytosine, but not adenine, base editors induced numerous single-nucleotide variants in both mouse and rice. Science, this issue p. 286, p. 289, p. 292; see also p. 234

Computer vision technology in agricultural automation —A review
Hongkun Tian, Tianhai Wang, Yadong Liu, Xi Qiao +1 more
2019 · Information Processing in Agriculture · 659 citations · doi:10.1016/j.inpa.2019.09.006

Computer vision is a field that involves making a machine “see”. This technology uses a camera and computer instead of the human eye to identify, track and measure targets for further image processing. With the development of computer vision, such technology has been widely used in the field of agricultural automation and plays a key role in its development. This review systematically summarizes and analyzes the technologies and challenges over the past three years and explores future opportunities and prospects to form the latest reference for researchers. Through the analyses, it is found that the existing technology can help the development of agricultural automation for small field farming to achieve the advantages of low cost, high efficiency and high precision. However, there are still major challenges. First, the technology will continue to expand into new application areas in the future, and there will be more technological issues that need to be overcome. It is essential to build large-scale data sets. Second, with the rapid development of agricultural automation, the demand for professionals will continue to grow. Finally, the robust performance of related technologies in various complex environments will also face challenges. Through analysis and discussion, we believe that in the future, computer vision technology will be combined with intelligent technology such as deep learning technology, be applied to every aspect of agricultural production management based on large-scale datasets, be more widely used to solve the current agricultural problems, and better improve the economic, general and robust performance of agricultural automation systems, thus promoting the development of agricultural automation equipment and systems in a more intelligent direction.

Penaeid shrimp genome provides insights into benthic adaptation and frequent molting
Xiaojun Zhang, Jianbo Yuan, Yamin Sun, Shihao Li +4 more
2019 · Nature Communications · 566 citations · doi:10.1038/s41467-018-08197-4

Crustacea, the subphylum of Arthropoda which dominates the aquatic environment, is of major importance in ecology and fisheries. Here we report the genome sequence of the Pacific white shrimp Litopenaeus vannamei, covering ~1.66 Gb (scaffold N50 605.56 Kb) with 25,596 protein-coding genes and a high proportion of simple sequence repeats (>23.93%). The expansion of genes related to vision and locomotion is probably central to its benthic adaptation. Frequent molting of the shrimp may be explained by an intensified ecdysone signal pathway through gene expansion and positive selection. As an important aquaculture organism, L. vannamei has been subjected to high selection pressure during the past 30 years of breeding, and this has had a considerable impact on its genome. Decoding the L. vannamei genome not only provides an insight into the genetic underpinnings of specific biological processes, but also provides valuable information for enhancing crustacean aquaculture.
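The scaffold N50 quoted in this abstract is a standard contiguity statistic: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size. The sketch below is a minimal illustration of that definition. The function name `n50` and the toy lengths are hypothetical and are not taken from the paper or its software.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches at least half
    of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly: total length 300, so half is 150.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 100 + 80 = 180 >= 150, so this prints 80
```

A higher N50 for the same total size means more of the assembly sits in long scaffolds, which is why the value is reported alongside the genome size.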

Graph pangenome captures missing heritability and empowers tomato breeding
Yao Zhou, Zhiyang Zhang, Zhigui Bao, Hongbo Li +4 more
2022 · Nature · 510 citations · doi:10.1038/s41586-022-04808-9

Missing heritability in genome-wide association studies defines a major problem in genetic analyses of complex biological traits [1,2]. The solution to this problem is to identify all causal genetic variants and to measure their individual contributions [3,4]. Here we report a graph pangenome of tomato constructed by precisely cataloguing more than 19 million variants from 838 genomes, including 32 new reference-level genome assemblies. This graph pangenome was used for genome-wide association study analyses and heritability estimation of 20,323 gene-expression and metabolite traits. The average estimated trait heritability is 0.41 compared with 0.33 when using the single linear reference genome. This 24% increase in estimated heritability is largely due to resolving incomplete linkage disequilibrium through the inclusion of additional causal structural variants identified using the graph pangenome. Moreover, by resolving allelic and locus heterogeneity, structural variants improve the power to identify genetic factors underlying agronomically important traits leading to, for example, the identification of two new genes potentially contributing to soluble solid content. The newly identified structural variants will facilitate genetic improvement of tomato through both marker-assisted selection and genomic selection. Our study advances the understanding of the heritability of complex traits and demonstrates the power of the graph pangenome in crop breeding.
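The 24% figure in this abstract is a relative increase, not a difference in percentage points: it compares the gain in average estimated heritability (0.41 against 0.33) with the single-reference baseline. The short sketch below reproduces that arithmetic from the two rounded values given in the abstract. The helper name `relative_increase` is hypothetical.

```python
def relative_increase(baseline, improved):
    """Return the fractional gain of `improved` over `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline


# Average estimated heritabilities as rounded in the abstract.
linear_reference = 0.33
graph_pangenome = 0.41

gain = relative_increase(linear_reference, graph_pangenome)
print(f"{gain:.0%}")  # (0.41 - 0.33) / 0.33 is about 0.242, printed as 24%
```

The absolute gain is 0.08 heritability units. Dividing by the 0.33 baseline gives the roughly 24% reported.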

Biosynthesis, regulation, and domestication of bitterness in cucumber
Yi Shang, Yongshuo Ma, Yuan Zhou, Huimin Zhang +4 more
2014 · Science · 490 citations · doi:10.1126/science.1259215

Cucurbitacins are triterpenoids that confer a bitter taste in cucurbits such as cucumber, melon, watermelon, squash, and pumpkin. These compounds discourage most pests on the plant and have also been shown to have antitumor properties. With genomics and biochemistry, we identified nine cucumber genes in the pathway for biosynthesis of cucurbitacin C and elucidated four catalytic steps. We discovered transcription factors Bl (Bitter leaf) and Bt (Bitter fruit) that regulate this pathway in leaves and fruits, respectively. Traces in genomic signatures indicated that selection imposed on Bt during domestication led to derivation of nonbitter cucurbits from their bitter ancestors.

The Sinocyclocheilus cavefish genome provides insights into cave adaptation
Junxing Yang, Xiaoli Chen, Jie Bai, Dongming Fang +4 more
2016 · BMC Biology · 401 citations · doi:10.1186/s12915-015-0223-4

BACKGROUND: An emerging cavefish model, the cyprinid genus Sinocyclocheilus, is endemic to the massive southwestern karst area adjacent to the Qinghai-Tibetan Plateau of China. In order to understand whether orogeny influenced the evolution of these species, and how genomes change under isolation, especially in subterranean habitats, we performed whole-genome sequencing and comparative analyses of three species in this genus, S. grahami, S. rhinocerous and S. anshuiensis. These species are surface-dwelling, semi-cave-dwelling and cave-restricted, respectively. RESULTS: The assembled genome sizes of S. grahami, S. rhinocerous and S. anshuiensis are 1.75 Gb, 1.73 Gb and 1.68 Gb, respectively. Divergence time and population history analyses of these species reveal that their speciation and population dynamics are correlated with the different stages of uplifting of the Qinghai-Tibetan Plateau. We carried out comparative analyses of these genomes and found that many genetic changes, such as gene loss (e.g. opsin genes), pseudogenes (e.g. crystallin genes), mutations (e.g. melanogenesis-related genes), deletions (e.g. scale-related genes) and down-regulation (e.g. circadian rhythm pathway genes), are possibly associated with the regressive features (such as eye degeneration, albinism, rudimentary scales and lack of circadian rhythms), and that some gene expansion (e.g. taste-related transcription factor gene) may point to the constructive features (such as enhanced taste buds) which evolved in these cave fishes. CONCLUSION: As the first report on cavefish genomes among distinct species in Sinocyclocheilus, our work provides not only insights into genetic mechanisms of cave adaptation, but also represents a fundamental resource for a better understanding of cavefish biology.

Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole‐genome sequencing
Saulo Aflitos, Elio Schijlen, Hans de Jong, Dick de Ridder +4 more
2014 · The Plant Journal · 392 citations · doi:10.1111/tpj.12616

We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new reference genomes were reconstructed to support our comparative genome analyses. Comparative sequence alignment revealed group-, species- and accession-specific polymorphisms, explaining characteristic fruit traits and growth habits in the various cultivars. Using gene models from the annotated Heinz 1706 reference genome, we observed differences in the ratio between non-synonymous and synonymous SNPs (dN/dS) in fruit diversification and plant growth genes compared to a random set of genes, indicating positive selection and differences in selection pressure between crop accessions and wild species. In wild species, the number of single-nucleotide polymorphisms (SNPs) exceeds 10 million, i.e. 20-fold higher than found in most of the crop accessions, indicating dramatic genetic erosion of crop and heirloom tomatoes. In addition, the highest levels of heterozygosity were found for allogamous self-incompatible wild species, while facultative and autogamous self-compatible species display a lower heterozygosity level. Using whole-genome SNP information for maximum-likelihood analysis, we achieved complete tree resolution, whereas maximum-likelihood trees based on SNPs from ten fruit and growth genes show incomplete resolution for the crop accessions, partly due to the effect of heterozygous SNPs. Finally, results suggest that phylogenetic relationships are correlated with habitat, indicating the occurrence of geographical races within these groups, which is of practical importance for Solanum genome evolution studies.

Genome evolution and diversity of wild and cultivated potatoes
Dié Tang, Yuxin Jia, Jinzhe Zhang, Hongbo Li +4 more
2022 · Nature · 384 citations · doi:10.1038/s41586-022-04822-x

Potato (Solanum tuberosum L.) is the world’s most important non-cereal food crop, and the vast majority of commercially grown cultivars are highly heterozygous tetraploids. Advances in diploid hybrid breeding based on true seeds have the potential to revolutionize future potato breeding and production [1–4]. So far, relatively few studies have examined the genome evolution and diversity of wild and cultivated landrace potatoes, which limits the application of their diversity in potato breeding. Here we assemble 44 high-quality diploid potato genomes from 24 wild and 20 cultivated accessions that are representative of Solanum section Petota, the tuber-bearing clade, as well as 2 genomes from the neighbouring section, Etuberosum. Extensive discordance of phylogenomic relationships suggests the complexity of potato evolution. We find that the potato genome substantially expanded its repertoire of disease-resistance genes when compared with closely related seed-propagated solanaceous crops, indicative of the effect of tuber-based propagation strategies on the evolution of the potato genome. We discover a transcription factor that determines tuber identity and interacts with the mobile tuberization inductive signal SP6A. We also identify 561,433 high-confidence structural variants and construct a map of large inversions, which provides insights for improving inbred lines and precluding potential linkage drag, as exemplified by a 5.8-Mb inversion that is associated with carotenoid content in tubers. This study will accelerate hybrid potato breeding and enrich our understanding of the evolution and biology of potato as a global staple food crop.

Anthoceros genomes illuminate the origin of land plants and the unique biology of hornworts
Fay‐Wei Li, Tomoaki Nishiyama, Manuel Waller, Eftychios Frangedakis +4 more
2020 · Nature Plants · 381 citations · doi:10.1038/s41477-020-0618-2

Hornworts comprise a bryophyte lineage that diverged from other extant land plants >400 million years ago and bears unique biological features, including a distinct sporophyte architecture, cyanobacterial symbiosis and a pyrenoid-based carbon-concentrating mechanism (CCM). Here, we provide three high-quality genomes of Anthoceros hornworts. Phylogenomic analyses place hornworts as a sister clade to liverworts plus mosses with high support. The Anthoceros genomes lack repeat-dense centromeres as well as whole-genome duplication, and contain a limited transcription factor repertoire. Several genes involved in angiosperm meristem and stomatal function are conserved in Anthoceros and upregulated during sporophyte development, suggesting possible homologies at the genetic level. We identified candidate genes involved in cyanobacterial symbiosis and found that LCIB, a Chlamydomonas CCM gene, is present in hornworts but absent in other plant lineages, implying a possible conserved role in CCM function. We anticipate that these hornwort genomes will serve as essential references for future hornwort research and comparative studies across land plants.

Resequencing of 414 cultivated and wild watermelon accessions identifies selection for fruit quality traits
Shaogui Guo, Shengjie Zhao, Honghe Sun, Xin Wang +4 more
2019 · Nature Genetics · 372 citations · doi:10.1038/s41588-019-0518-4

Fruit characteristics of sweet watermelon are largely the result of human selection. Here we report an improved watermelon reference genome and whole-genome resequencing of 414 accessions representing all extant species in the Citrullus genus. Population genomic analyses reveal the evolutionary history of Citrullus, suggesting independent evolutions in Citrullus amarus and the lineage containing Citrullus lanatus and Citrullus mucosospermus. Our findings indicate that different loci affecting watermelon fruit size have been under selection during speciation, domestication and improvement. A non-bitter allele, arising in the progenitor of sweet watermelon, is largely fixed in C. lanatus. Selection for flesh sweetness started in the progenitor of C. lanatus and continues through modern breeding on loci controlling raffinose catabolism and sugar transport. Fruit flesh coloration and sugar accumulation might have co-evolved through shared genetic components including a sugar transporter gene. This study provides valuable genomic resources and sheds light on watermelon speciation and breeding history.

Haplotype-resolved genome assembly provides insights into evolutionary history of the tea plant Camellia sinensis
Xingtan Zhang, Shuai Chen, Longqing Shi, Daping Gong +4 more
2021 · Nature Genetics · 370 citations · doi:10.1038/s41588-021-00895-y

Tea is an important global beverage crop and is largely clonally propagated. Despite previous studies on the species, its genetic and evolutionary history deserves further research. Here, we present a haplotype-resolved assembly of an Oolong tea cultivar, Tieguanyin. Analysis of allele-specific expression suggests a potential mechanism in response to mutation load during long-term clonal propagation. Population genomic analysis using 190 Camellia accessions uncovered independent evolutionary histories and parallel domestication in two widely cultivated varieties, var. sinensis and var. assamica. It also revealed extensive intra- and interspecific introgressions contributing to genetic diversity in modern cultivars. Strong signatures of selection were associated with biosynthetic and metabolic pathways that contribute to flavor characteristics as well as genes likely involved in the Green Revolution in the tea industry. Our results offer genetic and molecular insights into the evolutionary history of Camellia sinensis and provide genomic resources to further facilitate gene editing to enhance desirable traits in tea crops.

NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads
Jiang Hu, Zhuo Wang, Zongyi Sun, Benxia Hu +4 more
2024 · Genome Biology · 369 citations · doi:10.1186/s13059-024-03252-4

Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.

The Dendrobium catenatum Lindl. genome sequence provides insights into polysaccharide synthase, floral development and adaptive evolution
Guoqiang Zhang, Qing Xu, Chao Bian, Wen‐Chieh Tsai +4 more
2016 · Scientific Reports · 368 citations · doi:10.1038/srep19029

Orchids make up about 10% of all seed plant species, have great economical value, and are of specific scientific interest because of their renowned flowers and ecological adaptations. Here, we report the first draft genome sequence of a lithophytic orchid, Dendrobium catenatum. We predict 28,910 protein-coding genes, and find evidence of a whole genome duplication shared with Phalaenopsis. We observed the expansion of many resistance-related genes, suggesting a powerful immune system responsible for adaptation to a wide range of ecological niches. We also discovered extensive duplication of genes involved in glucomannan synthase activities, likely related to the synthesis of medicinal polysaccharides. Expansion of MADS-box gene clades ANR1, StMADS11, and MIKC(*), involved in the regulation of development and growth, suggests that these expansions are associated with the astonishing diversity of plant architecture in the genus Dendrobium. On the contrary, members of the type I MADS box gene family are missing, which might explain the loss of the endospermous seed. The findings reported here will be important for future studies into polysaccharide synthesis, adaptations to diverse environments and flower architecture of Orchidaceae.

A super pan-genomic landscape of rice
Lianguang Shang, Xiaoxia Li, Huiying He, Qiaoling Yuan +4 more
2022 · Cell Research · 363 citations · doi:10.1038/s41422-022-00685-z

Pan-genomes from large natural populations can capture genetic diversity and reveal genomic complexity. Using de novo long-read assembly, we generated a graph-based super pan-genome of rice consisting of a 251-accession panel comprising both cultivated and wild species of Asian and African rice. Our pan-genome reveals extensive structural variations (SVs) and gene presence/absence variations. Additionally, our pan-genome enables the accurate identification of nucleotide-binding leucine-rich repeat genes and characterization of their inter- and intraspecific diversity. Moreover, we uncovered grain weight-associated SVs which specify traits by affecting the expression of their nearby genes. We characterized genetic variants associated with submergence tolerance, seed shattering and plant architecture and found independent selection for a common set of genes that drove adaptation and domestication in Asian and African rice. This super pan-genome facilitates pinpointing of lineage-specific haplotypes for trait-associated genes and provides insights into the evolutionary events that have shaped the genomic architecture of various rice species.

The chicken gut metagenome and the modulatory effects of plant-derived benzylisoquinoline alkaloids
Peng Huang, Yan Zhang, Kangpeng Xiao, Fan Jiang +4 more
2018 · Microbiome · 360 citations · doi:10.1186/s40168-018-0590-5

BACKGROUND: Sub-therapeutic antibiotics are widely used as growth promoters in the poultry industry; however, the resulting antibiotic resistance threatens public health. A plant-derived growth promoter, Macleaya cordata extract (MCE), with effective ingredients of benzylisoquinoline alkaloids, is a potential alternative to antibiotic growth promoters. Altered intestinal microbiota play important roles in growth promotion, but the underlying mechanism remains unknown. RESULTS: We generated 1.64 terabases of metagenomic data from 495 chicken intestinal digesta samples and constructed a comprehensive chicken gut microbial gene catalog (9.04 million genes), which is also the first gene catalog of an animal's gut microbiome that covers all intestinal compartments. Then, we identified the distinctive characteristics and temporal changes in the foregut and hindgut microbiota. Next, we assessed the impact of MCE on chickens and gut microbiota. Chickens fed with MCE had improved growth performance, and major microbial changes were confined to the foregut, with the predominant role of Lactobacillus being enhanced, and the amino acids, vitamins, and secondary bile acids biosynthesis pathways being upregulated, but lacked the accumulation of antibiotic-resistance genes. In comparison, treatment with chlortetracycline similarly enriched some biosynthesis pathways of nutrients in the foregut microbiota, but elicited an increase in antibiotic-producing bacteria and antibiotic-resistance genes. CONCLUSION: The reference gene catalog of the chicken gut microbiome is an important supplement to animal gut metagenomes. Metagenomic analysis provides insights into the growth-promoting mechanism of MCE, and underscored the importance of utilizing safe and effective growth promoters.

JCVI: A versatile toolkit for comparative genomics analysis
Haibao Tang, Vivek Krishnakumar, Xiaofei Zeng, Zhou-Geng Xu +4 more
2024 · iMeta · 358 citations · doi:10.1002/imt2.211

The life cycle of genome builds spans interlocking pillars of assembly, annotation, and comparative genomics to drive biological insights. While tools exist to address each pillar separately, there is a growing need for tools to integrate different pillars of a genome project holistically. For example, comparative approaches can provide quality control of assembly or annotation; genome assembly, in turn, can help to identify artifacts that may complicate the interpretation of genome comparisons. The JCVI library is a versatile Python-based library that offers a suite of tools that excel across these pillars. Featuring a modular design, the JCVI library provides high-level utilities for tasks such as format parsing, graphics generation, and manipulation of genome assemblies and annotations. Supporting genomics algorithms like MCscan and ALLMAPS are widely employed in building genome releases, producing publication-ready figures for quality assessment and evolutionary inference. Developed and maintained collaboratively, the JCVI library emphasizes quality and reusability.

DBG2OLC: Efficient Assembly of Large Genomes Using Long Erroneous Reads of the Third Generation Sequencing Technologies
Chengxi Ye, C. Hill, Shigang Wu, Jue Ruan +1 more
2016 · Scientific Reports · 351 citations · doi:10.1038/srep31900

The highly anticipated transition from next generation sequencing (NGS) to third generation sequencing (3GS) has been difficult primarily due to high error rates and excessive sequencing cost. The high error rates make the assembly of long erroneous reads of large genomes challenging because existing software solutions are often overwhelmed by error correction tasks. Here we report a hybrid assembly approach that simultaneously utilizes NGS and 3GS data to address both issues. We gain advantages from three general and basic design principles: (i) Compact representation of the long reads leads to efficient alignments. (ii) Base-level errors can be skipped; structural errors need to be detected and corrected. (iii) Structurally correct 3GS reads are assembled and polished. In our implementation, preassembled NGS contigs are used to derive the compact representation of the long reads, motivating an algorithmic conversion from a de Bruijn graph to an overlap graph, the two major assembly paradigms. Moreover, since NGS and 3GS data can compensate for each other, our hybrid assembly approach reduces both of their sequencing requirements. Experiments show that our software is able to assemble mammalian-sized genomes orders of magnitude more quickly than existing methods without consuming a lot of memory, while saving about half of the sequencing cost.

IPA1 functions as a downstream transcription factor repressed by D53 in strigolactone signaling in rice
Xiaoguang Song, Zefu Lu, Hong Yu, Gaoneng Shao +4 more
2017 · Cell Research · 300 citations · doi:10.1038/cr.2017.102

Strigolactones (SLs), a group of carotenoid derived terpenoid lactones, are root-to-shoot phytohormones suppressing shoot branching by inhibiting the outgrowth of axillary buds. DWARF 53 (D53), the key repressor of the SL signaling pathway, is speculated to regulate the downstream transcriptional network of the SL response. However, no downstream transcription factor targeted by D53 has yet been reported. Here we report that Ideal Plant Architecture 1 (IPA1), a key regulator of the plant architecture in rice, functions as a direct downstream component of D53 in regulating tiller number and SL-induced gene expression. We showed that D53 interacts with IPA1 in vivo and in vitro and suppresses the transcriptional activation activity of IPA1. We further showed that IPA1 could directly bind to the D53 promoter and plays a critical role in the feedback regulation of SL-induced D53 expression. These findings reveal that IPA1 is likely one of the long-speculated transcription factors that act with D53 to mediate the SL-regulated tiller development in rice.