NobleBlocks

Beijing Institute of Genomics

Facility · Beijing, China

Research output, citation impact, and the most-cited recent papers from Beijing Institute of Genomics (China). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works: 5.0K
Citations: 2.4M
h-index: 609
i10-index: 9.2K
Also known as: Beijing Institute of Genomics, 北京基因组研究所

Top-cited papers from Beijing Institute of Genomics

The Sequence Alignment/Map format and SAMtools
Heng Li, Bob Handsaker, Alec Wysoker, Tim Fennell +4 more
2009 · Bioinformatics · 67.0K citations · doi:10.1093/bioinformatics/btp352

SUMMARY: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments. AVAILABILITY: http://samtools.sourceforge.net.
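As a rough illustration of the alignment format the summary describes, the sketch below parses one tab-separated SAM alignment line into its eleven mandatory fields and checks two FLAG bits. The `SamRecord` class, the helper names, and the example read are hypothetical and written only for this illustration; they are not part of SAMtools.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SamRecord:
    """The eleven mandatory SAM alignment fields plus raw optional tags."""
    qname: str   # read (query) name
    flag: int    # bitwise FLAG
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # CIGAR string
    rnext: str   # reference name of the mate / next read
    pnext: int   # position of the mate / next read
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # ASCII-encoded base qualities
    tags: List[str] = field(default_factory=list)


def parse_sam_line(line: str) -> SamRecord:
    """Parse a single alignment line (not a header line starting with '@')."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(
        qname=fields[0], flag=int(fields[1]), rname=fields[2],
        pos=int(fields[3]), mapq=int(fields[4]), cigar=fields[5],
        rnext=fields[6], pnext=int(fields[7]), tlen=int(fields[8]),
        seq=fields[9], qual=fields[10], tags=fields[11:],
    )


def is_unmapped(rec: SamRecord) -> bool:
    """FLAG bit 0x4 marks an unmapped read."""
    return bool(rec.flag & 0x4)


def is_reverse(rec: SamRecord) -> bool:
    """FLAG bit 0x10 marks a read aligned to the reverse strand."""
    return bool(rec.flag & 0x10)


# Made-up example read, aligned to the reverse strand of "chr1".
example = "read1\t16\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII\tNM:i:0"
record = parse_sam_line(example)
```

For real data, an established parser is a safer choice than hand-rolled splitting; the sketch only shows the shape of a record.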

Initial sequencing and analysis of the human genome
Eric S. Lander, Lauren Linton, Bruce W. Birren, Chad Nusbaum +4 more
2001 · Nature · 24.5K citations · doi:10.1038/35057062

The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

A Draft Sequence of the Rice Genome (Oryza sativa L. ssp. indica)
Jun Yu, Songnian Hu, Jun Wang, Gane Ka‐Shu Wong +4 more
2002 · Science · 4.3K citations · doi:10.1126/science.1068037

The genome of the japonica subspecies of rice, an important cereal and model monocot, was sequenced and assembled by whole-genome shotgun sequencing. The assembled sequence covers 93% of the 420-megabase genome. Gene predictions on the assembled sequence suggest that the genome contains 32,000 to 50,000 genes. Homologs of 98% of the known maize, wheat, and barley proteins are found in rice. Synteny and gene homology between rice and the other cereal genomes are extensive, whereas synteny with Arabidopsis is limited. Assignment of candidate rice orthologs to Arabidopsis genes is possible in many cases. The rice genome sequence provides a foundation for the improvement of cereals, our most important crops.

Systematic Localization of Common Disease-Associated Variation in Regulatory DNA
Matthew T. Maurano, Richard Humbert, Eric Rynes, Robert E. Thurman +4 more
2012 · Science · 4.0K citations · doi:10.1126/science.1222794

Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure-related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn's disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.

The accessible chromatin landscape of the human genome
Robert E. Thurman, Eric Rynes, Richard Humbert, Jeff Vierstra +4 more
2012 · Nature · 2.9K citations · doi:10.1038/nature11232

DNase I hypersensitive sites (DHSs) are markers of regulatory DNA and have underpinned the discovery of all classes of cis-regulatory elements including enhancers, promoters, insulators, silencers and locus control regions. Here we present the first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types. We identify ∼2.9 million DHSs that encompass virtually all known experimentally validated cis-regulatory sequences and expose a vast trove of novel elements, most with highly cell-selective regulation. Annotating these elements using ENCODE data reveals novel relationships between chromatin accessibility, transcription, DNA methylation and regulatory factor occupancy patterns. We connect ∼580,000 distal DHSs with their target promoters, revealing systematic pairing of different classes of distal DHSs and specific promoter types. Patterning of chromatin accessibility at many regulatory regions is organized with dozens to hundreds of co-activated elements, and the transcellular DNase I sensitivity pattern at a given region can predict cell-type-specific functional behaviours. The DHS landscape shows signatures of recent functional evolutionary constraint. However, the DHS compartment in pluripotent and immortalized cells exhibits higher mutation rates than that in highly differentiated cells, exposing an unexpected link between chromatin accessibility, proliferative potential and patterns of human variation. An extensive map of human DNase I hypersensitive sites, markers of regulatory DNA, in 125 diverse cell and tissue types is described; integration of this information with other ENCODE-generated data sets identifies new relationships between chromatin accessibility, transcription, DNA methylation and regulatory factor occupancy patterns. This paper describes the first extensive map of human DNaseI hypersensitive sites — markers of regulatory DNA — in 125 diverse cell and tissue types. 
Integration of this information with other data sets generated by ENCODE (Encyclopedia of DNA Elements) identified new relationships between chromatin accessibility, transcription, DNA methylation and regulatory-factor occupancy patterns. Evolutionary-conservation analysis revealed signatures of recent functional constraint within DNaseI hypersensitive sites.

Mapping short DNA sequencing reads and calling variants using mapping quality scores
Heng Li, Jue Ruan, Richard Durbin
2008 · Genome Research · 2.7K citations · doi:10.1101/gr.078212.108

New sequencing technologies promise a new era in the use of DNA sequence. However, some of these technologies produce very short reads, typically of a few tens of base pairs, and to use these reads effectively requires new algorithms and software. In particular, there is a major issue in efficiently aligning short reads to a reference genome and handling ambiguity or lack of accuracy in this alignment. Here we introduce the concept of mapping quality, a measure of the confidence that a read actually comes from the position it is aligned to by the mapping algorithm. We describe the software MAQ that can build assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample. MAQ makes full use of mate-pair information and estimates the error probability of each read alignment. Error probabilities are also derived for the final genotype calls, using a Bayesian statistical model that incorporates the mapping qualities, error probabilities from the raw sequence quality scores, sampling of the two haplotypes, and an empirical model for correlated errors at a site. Both read mapping and genotype calling are evaluated on simulated data and real data. MAQ is accurate, efficient, versatile, and user-friendly. It is freely available at http://maq.sourceforge.net.
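The idea of a mapping quality can be sketched numerically. The example below assumes the usual Phred-style scaling, Q = -10 · log10(P), where P is the probability that the read is placed at the wrong position. The function names, the cap of 60, and the toy posterior over candidate positions are illustrative assumptions. This is a simplification for intuition, not the model implemented in MAQ.

```python
import math
from typing import Sequence


def phred_scale(p_wrong: float, max_q: float = 60.0) -> float:
    """Convert an error probability into a Phred-scaled quality.

    max_q is an arbitrary cap chosen for this sketch so that p_wrong == 0
    does not produce an infinite score.
    """
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability")
    if p_wrong == 0.0:
        return max_q
    return min(max_q, -10.0 * math.log10(p_wrong))


def error_from_phred(q: float) -> float:
    """Inverse of phred_scale (ignoring the cap): quality -> error probability."""
    return 10.0 ** (-q / 10.0)


def toy_mapping_quality(likelihoods: Sequence[float]) -> float:
    """Toy posterior: the best candidate position is wrong with probability
    1 - L_best / sum(L), assuming equal priors over all candidates."""
    total = sum(likelihoods)
    if total <= 0.0:
        raise ValueError("at least one candidate needs positive likelihood")
    p_wrong = 1.0 - max(likelihoods) / total
    # Clamp tiny negative values caused by floating-point rounding.
    return phred_scale(max(0.0, p_wrong))
```

Under this toy model, a read with one dominant candidate position gets a high score, while a read with two equally good positions gets a score near 3, because the chance of a wrong placement is about one half.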

Guidelines for the use and interpretation of assays for monitoring autophagy (4th edition)
Daniel J. Klionsky, Amal Kamal Abdel‐Aziz, Sara Abdelfatah, Mahmoud Abdellatif +4 more
2021 · Autophagy · 2.6K citations · doi:10.1080/15548627.2020.1797280

autophagic responses. Here, we critically discuss current methods of assessing autophagy and the information they can, or cannot, provide. Our ultimate goal is to encourage intellectual and technical innovation in the field.

Mammalian WTAP is a regulatory subunit of the RNA N6-methyladenosine methyltransferase
Xiaoli Ping, Baofa Sun, Lu Wang, Wen Xiao +4 more
2014 · Cell Research · 2.4K citations · doi:10.1038/cr.2014.3

The methyltransferase like 3 (METTL3)-containing methyltransferase complex catalyzes the N6-methyladenosine (m6A) formation, a novel epitranscriptomic marker; however, the nature of this complex remains largely unknown. Here we report two new components of the human m6A methyltransferase complex, Wilms' tumor 1-associating protein (WTAP) and methyltransferase like 14 (METTL14). WTAP interacts with METTL3 and METTL14, and is required for their localization into nuclear speckles enriched with pre-mRNA processing factors and for catalytic activity of the m6A methyltransferase in vivo. The majority of RNAs bound by WTAP and METTL3 in vivo represent mRNAs containing the consensus m6A motif. In the absence of WTAP, the RNA-binding capability of METTL3 is strongly reduced, suggesting that WTAP may function to regulate recruitment of the m6A methyltransferase complex to mRNA targets. Furthermore, transcriptomic analyses in combination with photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) illustrate that WTAP and METTL3 regulate expression and alternative splicing of genes involved in transcription and RNA processing. Morpholino-mediated knockdown targeting WTAP and/or METTL3 in zebrafish embryos caused tissue differentiation defects and increased apoptosis. These findings provide strong evidence that WTAP may function as a regulatory subunit in the m6A methyltransferase complex and play a critical role in epitranscriptomic regulation of RNA metabolism.

Exosome and Exosomal MicroRNA: Trafficking, Sorting, and Function
Jian Zhang, Sha Li, Lu Li, Meng Li +3 more
2015 · Genomics Proteomics & Bioinformatics · 2.2K citations · doi:10.1016/j.gpb.2015.02.001

Exosomes are 40-100 nm nano-sized vesicles that are released from many cell types into the extracellular space. Such vesicles are widely distributed in various body fluids. Recently, mRNAs and microRNAs (miRNAs) have been identified in exosomes, which can be taken up by neighboring or distant cells and subsequently modulate recipient cells. This suggests an active sorting mechanism of exosomal miRNAs, since the miRNA profiles of exosomes may differ from those of the parent cells. Exosomal miRNAs play an important role in disease progression, and can stimulate angiogenesis and facilitate metastasis in cancers. In this review, we will introduce the origin and the trafficking of exosomes between cells, display current research on the sorting mechanism of exosomal miRNAs, and briefly describe how exosomes and their miRNAs function in recipient cells. Finally, we will discuss the potential applications of these miRNA-containing vesicles in clinical settings.

KaKs_Calculator 2.0: A Toolkit Incorporating Gamma-Series Methods and Sliding Window Strategies
Dapeng Wang, Yubin Zhang, Zhang Zhang, Jiang Zhu +1 more
2010 · Genomics Proteomics & Bioinformatics · 2.2K citations · doi:10.1016/s1672-0229(10)60008-3

We present an integrated stand-alone software package named KaKs_Calculator 2.0 as an updated version. It incorporates 17 methods for the calculation of nonsynonymous and synonymous substitution rates; among them, we added our modified versions of several widely used methods as the gamma series including gamma-NG, gamma-LWL, gamma-MLWL, gamma-LPB, gamma-MLPB, gamma-YN and gamma-MYN, which have been demonstrated to perform better under certain conditions than their original forms and are not implemented in the previous version. The package is readily used for the identification of positively selected sites based on a sliding window across the sequences of interests in 5' to 3' direction of protein-coding sequences, and have improved the overall performance on sequence analysis for evolution studies. A toolbox, including C++ and Java source code and executable files on both Windows and Linux platforms together with a user instruction, is downloadable from the website for academic purpose at https://sourceforge.net/projects/kakscalculator2/.

The Genome Sequence Archive Family: Toward Explosive Data Growth and Diverse Data Types
Tingting Chen, Xu Chen, Sisi Zhang, Junwei Zhu +4 more
2021 · Genomics Proteomics & Bioinformatics · 1.9K citations · doi:10.1016/j.gpb.2021.08.001

The Genome Sequence Archive (GSA) is a data repository for archiving raw sequence data, which provides data storage and sharing services for worldwide scientific communities. Considering explosive data growth with diverse data types, here we present the GSA family by expanding into a set of resources for raw data archive with different purposes, namely, GSA (https://ngdc.cncb.ac.cn/gsa/), GSA for Human (GSA-Human, https://ngdc.cncb.ac.cn/gsa-human/), and Open Archive for Miscellaneous Data (OMIX, https://ngdc.cncb.ac.cn/omix/). Compared with the 2017 version, GSA has been significantly updated in data model, online functionalities, and web interfaces. GSA-Human, as a new partner of GSA, is a data repository specialized in human genetics-related data with controlled access and security. OMIX, as a critical complement to the two resources mentioned above, is an open archive for miscellaneous data. Together, all these resources form a family of resources dedicated to archiving explosive data with diverse types, accepting data submissions from all over the world, and providing free open access to all publicly available data in support of worldwide research activities.

Dynamic transcriptomic m6A decoration: writers, erasers, readers and functions in RNA metabolism
Ying Yang, Phillip J. Hsu, Yusheng Chen, Yun‐Gui Yang
2018 · Cell Research · 1.8K citations · doi:10.1038/s41422-018-0040-8

N6-methyladenosine (m6A) is a chemical modification present in multiple RNA species, being most abundant in mRNAs. Studies on enzymes or factors that catalyze, recognize, and remove m6A have revealed its comprehensive roles in almost every aspect of mRNA metabolism, as well as in a variety of physiological processes. This review describes the current understanding of the m6A modification, particularly the functions of its writers, erasers, readers in RNA metabolism, with an emphasis on its role in regulating the isoform dosage of mRNAs.

Gut microbiota dysbiosis contributes to the development of hypertension
Jing Li, Fangqing Zhao, Yidan Wang, Junru Chen +4 more
2017 · Microbiome · 1.7K citations · doi:10.1186/s40168-016-0222-x

BACKGROUND: Recently, the potential role of gut microbiome in metabolic diseases has been revealed, especially in cardiovascular diseases. Hypertension is one of the most prevalent cardiovascular diseases worldwide, yet whether gut microbiota dysbiosis participates in the development of hypertension remains largely unknown. To investigate this issue, we carried out comprehensive metagenomic and metabolomic analyses in a cohort of 41 healthy controls, 56 subjects with pre-hypertension, 99 individuals with primary hypertension, and performed fecal microbiota transplantation from patients to germ-free mice. RESULTS: Compared to the healthy controls, we found dramatically decreased microbial richness and diversity, Prevotella-dominated gut enterotype, distinct metagenomic composition with reduced bacteria associated with healthy status and overgrowth of bacteria such as Prevotella and Klebsiella, and disease-linked microbial function in both pre-hypertensive and hypertensive populations. Unexpectedly, the microbiome characteristic in pre-hypertension group was quite similar to that in hypertension. The metabolism changes of host with pre-hypertension or hypertension were identified to be closely linked to gut microbiome dysbiosis. And a disease classifier based on microbiota and metabolites was constructed to discriminate pre-hypertensive and hypertensive individuals from controls accurately. Furthermore, by fecal transplantation from hypertensive human donors to germ-free mice, elevated blood pressure was observed to be transferrable through microbiota, and the direct influence of gut microbiota on blood pressure of the host was demonstrated. CONCLUSIONS: Overall, our results describe a novel causal role of aberrant gut microbiota in contributing to the pathogenesis of hypertension. And the significance of early intervention for pre-hypertension was emphasized.

A map of rice genome variation reveals the origin of cultivated rice
Xuehui Huang, Nori Kurata, Xinghua Wei, Zi-Xuan Wang +4 more
2012 · Nature · 1.7K citations · doi:10.1038/nature11532

Crop domestications are long-term selection experiments that have greatly advanced human civilization. The domestication of cultivated rice (Oryza sativa L.) ranks as one of the most important developments in history. However, its origins and domestication processes are controversial and have long been debated. Here we generate genome sequences from 446 geographically diverse accessions of the wild rice species Oryza rufipogon, the immediate ancestral progenitor of cultivated rice, and from 1,083 cultivated indica and japonica varieties to construct a comprehensive map of rice genome variation. In the search for signatures of selection, we identify 55 selective sweeps that have occurred during domestication. In-depth analyses of the domestication sweeps and genome-wide patterns reveal that Oryza sativa japonica rice was first domesticated from a specific population of O. rufipogon around the middle area of the Pearl River in southern China, and that Oryza sativa indica rice was subsequently developed from crosses between japonica rice and local wild rice as the initial cultivars spread into South East and South Asia. The domestication-associated traits are analysed through high-resolution genetic mapping. This study provides an important resource for rice breeding and an effective genomics approach for crop domestication research. Whole-genome sequences of wild rice and cultivated rice varieties are used to produce a map of rice genome variation, and show that rice was probably first domesticated in southern China. Cultivated rice (Oryza sativa) is thought to have been domesticated from wild rice (Oryza rufipogon) thousands of years ago. This Chinese/Japanese collaboration reports whole-genome sequences from 446 wild rice isolates from across Asia and Oceania, and from more than 1,000 indica and japonica subspecies of cultivated rice. The resulting map of genome variation will be an important resource for rice breeding and for crop-domestication research.

Parts per Million Mass Accuracy on an Orbitrap Mass Spectrometer via Lock Mass Injection into a C-trap
Jesper V. Olsen, Lyris Martins Franco de Godoy, Guoqing Li, Boris Maček +4 more
2005 · Molecular & Cellular Proteomics · 1.5K citations · doi:10.1074/mcp.t500030-mcp200

Mass accuracy is a key parameter of mass spectrometric performance. TOF instruments can reach low parts per million, and FT-ICR instruments are capable of even greater accuracy provided ion numbers are well controlled. Here we demonstrate sub-ppm mass accuracy on a linear ion trap coupled via a radio frequency-only storage trap (C-trap) to the orbitrap mass spectrometer (LTQ Orbitrap). Prior to acquisition of a spectrum, a background ion originating from ambient air is first transferred to the C-trap. Ions forming the MS or MSn spectrum are then added to this species, and all ions are injected into the orbitrap for analysis. Real time recalibration on the “lock mass” by corrections of mass shift removes mass error associated with calibration of the mass scale. The remaining mass error is mainly due to imperfect peaks caused by weak signals and is addressed by averaging the mass measurement over the LC peak, weighted by signal intensity. For peptide database searches in proteomics, we introduce a variable mass tolerance and achieve average absolute mass deviations of 0.48 ppm (standard deviation 0.38 ppm) and maximal deviations of less than 2 ppm. For tandem mass spectra we demonstrate similarly high mass accuracy and discuss its impact on database searching. High and routine mass accuracy in a compact instrument will dramatically improve certainty of peptide and small molecule identification.
The data produced by a mass spectrometer are the mass and intensity of compounds and their fragments. The accuracy of mass measurement directly determines the usefulness of mass spectrometric experiments, and much effort in instrumentation development is directed at improving this key parameter. Mass accuracy and mass resolution are connected, and instruments introduced during the last decades radically improved in these two attributes. Traditionally accurate mass measurements, sufficient to determine the elemental composition of small molecules, were the province of magnetic sector instruments, but today TOF instruments equipped with energy correcting reflectrons can reach low ppm values. Triple quadrupole instruments or quadrupole ion traps, which are popular in proteomics research, however, have low resolution and mass uncertainties of typically half to several Da. At the other extreme, FT-ICR mass spectrometers reduce the mass measurement to a frequency measurement and are therefore potentially capable of exceedingly high mass accuracy. In practice, however, FT-ICR instruments have suffered from the requirement to precisely control the number of ions accumulated in the Penning trap. Over or under filling leads to mass shifts to high and low values, respectively. For example, Smith and co-workers (Belov et al., Anal. Chem. 2003; 75: 4195-4205) reported that in their measurements the mass determined over an LC peak varied by more than 10 ppm. The recent introduction of a linear ion trap-FT-ICR combination (Syka et al., J. Proteome Res. 2004; 3: 621-626) largely solved this problem through a prescan in the ion trap to estimate ion current (called automatic gain control), which allows filling of the ICR cell with a predetermined number of ions. Using automatic gain control and narrow mass ranges (selected ion monitoring, SIM, scans) we observed an average absolute mass error between 0.6 and 0.7 ppm in recent large scale proteomic analyses (Olsen et al., Mol. Cell Proteomics 2004; 3: 608-614; Andersen et al., Nature 2005; 433: 77-83; Gruhler et al., Mol. Cell Proteomics 2005; 4: 310-327).
In a typical proteomics experiment, protein mixtures are digested to peptide mixtures that are separated by reversed phase HPLC and analyzed on-line by MS and MS/MS (Aebersold and Mann, Nature 2003; 422: 198-207). The mass accuracy achieved in the instrument directly translates into the mass tolerances that can be specified in subsequent database searches of tandem mass spectra. Unambiguous protein identification in large data sets is by no means trivial (Steen and Mann, Nat. Rev. Mol. Cell. Biol. 2004; 5: 699-711), and any increase in achieved mass accuracy greatly aids the specificity of database searches in two ways (Jensen et al., Rapid Commun. Mass Spectrom. 1996; 10: 1371-1378; Clauser et al., Anal. Chem. 1999; 71: 2871-2882): High precursor mass accuracy in the MS spectra directly translates into fewer “candidate sequences” that need to be considered as possible matches. High mass accuracy in the MS/MS spectra leads to fewer measured fragment masses that match the calculated fragments of a candidate sequence by chance and therefore decreases the scores of false positives in database search algorithms.
In 1923, Kingdon (Phys. Rev. 1923; 21: 408-418) devised a method to capture ions by causing them to orbit around a central electrode. Since then, the physics community has used “Kingdon traps” in a variety of experiments, but always as a capturing device, not as a mass spectrometer.
The data that the mass can reduce the mass error to a few ppm on the of the mass measurement that is the for peptide for improve we of the that several mass measurements are of the precursor as it from the capillary and that these measurements typically are of intensity than the used for precursor the intensity of a yeast peptide as it from the can be in the the precursor was for and its mass determined it was less than of its maximal intensity. mass measurements of the precursor were and it is from that mass accuracy is much at the of the LC peak than it is at its we a to over the LC which the mass measurements weighted by signal intensity. the of these mass accuracy was improved and was a ppm absolute deviation from the calculated values. that all the have a maximal mass error of less than 2 ppm average mass accuracy of 0.48 ppm and a deviation of 0.38 the mass can be applied for MS/MS spectra. however, that the ion of in the during the time it to and fragment the precursor we used this ion for mass scale a and of two peptide spectra in time with the mass spectra were with a of and a resolution of at the mass accuracy to be in the MS/MS with the MS of the and resolution as well as the that a tandem mass spectrum was for all fragments 2 ppm of their calculated provided that the intensity was than with less than were ppm of the calculated in all that the elemental composition of low mass ions can be determined a mass accuracy of a that can be in of tandem mass for example, to peptide sequence M. of in by Peptide Chem. Scholar), sequence B. M. Mann M. Peptide by MALDI tandem mass Proteome Res. Scholar), data sequence of peptide MS/MS data and the of MS/MS Cell Proteomics. 2005; 4: Scholar, protein identification in Fourier mass Cell Proteomics. 2005; 4: Scholar), or in composition B. peptide composition and a employing accurate mass by transform ion cyclotron mass Mass Spectrom. 2004; Scholar). 
orbitrap MS/MS spectra with the database the achieved mass accuracy is much than can be specified as a search parameter. the not the for mass than or scores to fragment with high mass accuracy J.S. protein identification by sequence mass 1999; Scholar). the the in to the peptide was very large with these high mass accuracy tandem mass in for spectra no peptide sequence was at on the can be in and 2 search with and LC mass and with peptide with and mass Here we a for very high mass accuracy with an mass spectrometer. a background ion of composition into the we for in the over The of this that the mass scale of the that is the of the frequency and of the is at to than per that the remaining mass error mainly on the signal intensity of the peptide For any peaks that are not to the we a mass accuracy to ppm. for weak peaks the mass accuracy is a few ppm. increase the mass we mass measurements over the LC peak weighted by signal intensity. In this several mass measurements to the and the mass is not on a signal to the as is the the precursor mass is the as the which is the for peak for of achieved mass accuracy on signal or signal to is not to the in current proteomic practice, a is The precursor and fragment mass tolerances are to even the mass mass have by recalibration J.S. A. Mann M. Analysis of the proteome by mass Scholar) or by mass as a be to very mass tolerances for well peaks and mass tolerances for weak a be in a all have with a mass In this with signal but large deviation from the calculated masses be from or at a precursor mass measurement be by its as A. A.I. R. to estimate the accuracy of peptide by MS/MS and database Chem. Scholar) can in a The mass PCM, is in other background ions be used as and it is possible to more than mass in the mass of the storage provided by the C-trap. 
we have used the for a number of mass it can be used for other as For example, the be with several narrow mass ranges of or several MS/MS fragment ions from precursor ions high resolution of the accumulated ions in the In with the the is capable of mass as we achieve the high mass accuracy with the of a mass in the we used the and of a narrow mass into the ICR cell (3Olsen J.V. Ong S.E. Mann M. Trypsin cleaves exclusively C-terminal to arginine and lysine residues.Mol. Cell Proteomics. 2004; 3: 608-614Google Scholar). of the mass is that no time to be on the is and with a high ion at the of its space charge to then mass accuracy is an of than we demonstrate for the is the of the high mass accuracy P. B. for Peptide Characterization by Mass Chem. 1996; Scholar) have that a mass accuracy of ppm peptide to a few and greatly peptide identification. Smith and co-workers added a time and that mass and time be sufficient for peptide identification E.F. Tang K. Smith R.D. Proteome analyses accurate mass and time peptide with capillary LC mass Mass Spectrom. 2003; Scholar). has to mass achieved to have for example, M.E. Smith R.D. A proteomic of the Proteome an accurate mass and time 2005; 5: Scholar). with the accuracy reported we not that the mass is sufficient to in typical proteomic the introduced by and peptide for as of the number of candidate peptide will be very and even low accuracy tandem mass spectra can then the candidate peptide is this mass accuracy will be in the of which a problem of the caused of S.E. Mann M. and in by 2004; Scholar). In we have that a compact mass the is capable of very high mass accuracy a mass High mass accuracy is in the MS and MS/MS and or in combination with strategies J.V. Mann M. peptide identification in proteomics by two of mass spectrometric A. 2004; Scholar) to the problem of false positive peptide identification in proteomics and to much more than in the for and Cell as well as at the for with

On the classification of long non-coding RNAs
Lina Ma, Vladimir B. Bajić, Zhang Zhang
2013 · RNA Biology · 1.3K citations · doi:10.4161/rna.24604

Long non-coding RNAs (lncRNAs) have been found to perform various functions in a wide variety of important biological processes. To ease interpretation of lncRNA functionality and enable deep mining of these transcribed sequences, it is convenient to classify lncRNAs into different groups. Here, we summarize classification methods of lncRNAs according to their four major features, namely, genomic location and context, effect exerted on DNA sequences, mechanism of functioning and their targeting mechanism. In combination with the presently available function annotations, we explore potential relationships between different classification categories, and generalize and compare biological features of different lncRNAs within each category. Finally, we present our view on potential further studies. We believe that the classifications of lncRNAs as indicated above are of fundamental importance for lncRNA studies, helpful for further investigation of specific lncRNAs, for formulation of new hypotheses based on different features of lncRNA and for exploration of the underlying lncRNA functional mechanisms.

KaKs_Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging
Zhang Zhang, Jun Li, Xiaoqian Zhao, Jun Wang +2 more
2006 · Genomics Proteomics & Bioinformatics · 1.3K citations · doi:10.1016/s1672-0229(07)60007-2

KaKs_Calculator is a software package that calculates nonsynonymous (Ka) and synonymous (Ks) substitution rates through model selection and model averaging. Since existing methods for this estimation adopt their specific mutation (substitution) models that consider different evolutionary features, leading to diverse estimates, KaKs_Calculator implements a set of candidate models in a maximum likelihood framework and adopts the Akaike information criterion to measure fitness between models and data, aiming to include as many features as needed for accurately capturing evolutionary information in protein-coding sequences. In addition, several existing methods for calculating Ka and Ks are also incorporated into this software. KaKs_Calculator, including source codes, compiled executables, and documentation, is freely available for academic use at http://evolution.genomics.org.cn/software.htm.
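
The model selection and model averaging described in this abstract can be illustrated with a minimal Python sketch. It is not the KaKs_Calculator implementation: the model names, log-likelihoods, parameter counts and Ka values below are hypothetical, and plain AIC is assumed (the software itself may apply a small-sample correction). The sketch scores each candidate model by AIC, converts the scores to Akaike weights, and averages the per-model estimates with those weights.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def akaike_weights(aic_values):
    """Turn AIC scores into normalized weights that sum to 1."""
    best = min(aic_values)
    # Relative likelihood of each model with respect to the best one.
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


def model_average(estimates, weights):
    """Weighted average of per-model estimates."""
    return sum(e * w for e, w in zip(estimates, weights))


# Hypothetical candidates: (name, maximized log-likelihood, free parameters, Ka estimate)
candidates = [
    ("model_A", -1250.0, 1, 0.100),
    ("model_B", -1245.0, 2, 0.112),
    ("model_C", -1244.5, 5, 0.115),
]

scores = [aic(ll, k) for _, ll, k, _ in candidates]
weights = akaike_weights(scores)

# Model selection: pick the candidate with the lowest AIC.
selected = candidates[scores.index(min(scores))][0]

# Model averaging: combine all Ka estimates using the Akaike weights.
ka_averaged = model_average([ka for _, _, _, ka in candidates], weights)

print("selected model:", selected)
print("model-averaged Ka:", round(ka_averaged, 4))
```

With these made-up numbers the extra parameters of the third model are penalized, so the second model is selected even though the third has a slightly higher likelihood.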

FTO-dependent demethylation of N6-methyladenosine regulates mRNA splicing and is required for adipogenesis
Xu Zhao, Ying Yang, Baofa Sun, Yue Shi +4 more
2014 · Cell Research · 1.2K citations · doi:10.1038/cr.2014.151

The role of Fat Mass and Obesity-associated protein (FTO) and its substrate N6-methyladenosine (m6A) in mRNA processing and adipogenesis remains largely unknown. We show that FTO expression and m6A levels are inversely correlated during adipogenesis. FTO depletion blocks differentiation and only catalytically active FTO restores adipogenesis. Transcriptome analyses in combination with m6A-seq revealed that gene expression and mRNA splicing of grouped genes are regulated by FTO. m6A is enriched in exonic regions flanking 5'- and 3'-splice sites, spatially overlapping with mRNA splicing regulatory serine/arginine-rich (SR) protein exonic splicing enhancer binding regions. Enhanced levels of m6A in response to FTO depletion promote the RNA binding ability of SRSF2 protein, leading to increased inclusion of target exons. FTO controls exonic splicing of adipogenic regulatory factor RUNX1T1 by regulating m6A levels around splice sites and thereby modulates differentiation. These findings provide compelling evidence that FTO-dependent m6A demethylation functions as a novel regulatory mechanism of RNA processing and plays a critical role in the regulation of adipogenesis.

A Draft Sequence for the Genome of the Domesticated Silkworm (Bombyx mori)
Biology analysis group, Qingyou Xia, Zeyang Zhou, Cheng Lu +4 more
2004 · Science · 1.2K citations · doi:10.1126/science.1102210

We report a draft sequence for the genome of the domesticated silkworm (Bombyx mori), covering 90.9% of all known silkworm genes. Our estimated gene count is 18,510, which exceeds the 13,379 genes reported for Drosophila melanogaster. Comparative analyses to fruitfly, mosquito, spider, and butterfly reveal both similarities and differences in gene content.

CIRI: an efficient and unbiased algorithm for de novo circular RNA identification
Yuan Gao, Jinfeng Wang, Fangqing Zhao
2015 · Genome Biology · 1.2K citations · doi:10.1186/s13059-014-0571-3

Recent studies reveal that circular RNAs (circRNAs) are a novel class of abundant, stable and ubiquitous noncoding RNA molecules in animals. Comprehensive detection of circRNAs from high-throughput transcriptome data is an initial and crucial step to study their biogenesis and function. Here, we present a novel chiastic clipping signal-based algorithm, CIRI, to unbiasedly and accurately detect circRNAs from transcriptome data by employing multiple filtration strategies. By applying CIRI to ENCODE RNA-seq data, we for the first time identify and experimentally validate the prevalence of intronic/intergenic circRNAs as well as fragments specific to them in the human transcriptome.