NobleBlocks

Genome Institute of Singapore

Facility · Singapore, Singapore

Research output, citation impact, and the most-cited recent papers from Genome Institute of Singapore (Singapore). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works: 5.6K
Citations: 1.7M
h-index: 565
i10-index: 9.0K
Also known as: Genome Institute of Singapore

Top-cited papers from Genome Institute of Singapore

Fast, scalable generation of high‐quality protein multiple sequence alignments using Clustal Omega
Fabian Sievers, Andreas Wilm, David Dineen, Toby J. Gibson +4 more
2011 · Molecular Systems Biology · 16.2K citations · doi:10.1038/msb.2011.75

Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of the size of many thousands of sequences. Some methods allow computation of larger data sets while sacrificing quality, and others produce high-quality alignments, but scale badly with the number of sequences. In this paper, we describe a new program called Clustal Omega, which can align virtually any number of protein sequences quickly and that delivers accurate alignments. The accuracy of the package on smaller test cases is similar to that of the high-quality aligners. On larger data sets, Clustal Omega outperforms other packages in terms of execution time and quality. Clustal Omega also has powerful features for adding sequences to and exploiting information in existing alignments, making use of the vast amount of precomputed information in public databases like Pfam.

Minimal information for studies of extracellular vesicles 2018 (MISEV2018): a position statement of the International Society for Extracellular Vesicles and update of the MISEV2014 guidelines
Clotilde Théry, Kenneth W. Witwer, Elena Aïkawa, María José Alcaraz +4 more
2018 · Journal of Extracellular Vesicles · 11.0K citations · doi:10.1080/20013078.2018.1535750

The last decade has seen a sharp increase in the number of scientific publications describing physiological and pathological functions of extracellular vesicles (EVs), a collective term covering various subtypes of cell-released, membranous structures, called exosomes, microvesicles, microparticles, ectosomes, oncosomes, apoptotic bodies, and many other names. However, specific issues arise when working with these entities, whose size and amount often make them difficult to obtain as relatively pure preparations, and to characterize properly. The International Society for Extracellular Vesicles (ISEV) proposed Minimal Information for Studies of Extracellular Vesicles ("MISEV") guidelines for the field in 2014. We now update these "MISEV2014" guidelines based on evolution of the collective knowledge in the last four years. An important point to consider is that ascribing a specific function to EVs in general, or to subtypes of EVs, requires reporting of specific information beyond mere description of function in a crude, potentially contaminated, and heterogeneous preparation. For example, claims that exosomes are endowed with exquisite and specific activities remain difficult to support experimentally, given our still limited knowledge of their specific molecular machineries of biogenesis and release, as compared with other biophysically similar EVs. The MISEV2018 guidelines include tables and outlines of suggested protocols and steps to follow to document specific EV-associated functional activities. Finally, a checklist is provided with summaries of key points.

Landscape of transcription in human cells
Sarah Djebali, Carrie Davis, Angelika Merkel, Alexander Dobin +4 more
2012 · Nature · 5.4K citations · doi:10.1038/nature11233

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell’s regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

The repertoire of mutational signatures in human cancer
Ludmil B. Alexandrov, Jaegil Kim, Nicholas J. Haradhvala, Mi Ni Huang +4 more
2020 · Nature · 3.7K citations · doi:10.1038/s41586-020-1943-3

Somatic mutations in cancer genomes are caused by multiple mutational processes, each of which generates a characteristic mutational signature [1]. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [2] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we characterized mutational signatures using 84,729,690 somatic mutations from 4,645 whole-genome and 19,184 exome sequences that encompass most types of cancer. We identified 49 single-base-substitution, 11 doublet-base-substitution, 4 clustered-base-substitution and 17 small insertion-and-deletion signatures. The substantial size of our dataset, compared with previous analyses [3–15], enabled the discovery of new signatures, the separation of overlapping signatures and the decomposition of signatures into components that may represent associated (but distinct) DNA damage, repair and/or replication mechanisms. By estimating the contribution of each signature to the mutational catalogues of individual cancer genomes, we revealed associations of signatures to exogenous or endogenous exposures, as well as to defective DNA-maintenance processes. However, many signatures are of unknown cause. This analysis provides a systematic perspective on the repertoire of mutational processes that contribute to the development of human cancer.

Fast and accurate de novo genome assembly from long uncorrected reads
Robert Vaser, Ivan Sović, Niranjan Nagarajan, Mile Šikić
2017 · Genome Research · 3.4K citations · doi:10.1101/gr.214270.116

The assembly of long reads from Pacific Biosciences and Oxford Nanopore Technologies typically requires resource-intensive error-correction and consensus-generation steps to obtain high-quality assemblies. We show that the error-correction step can be omitted and that high-quality consensus sequences can be generated efficiently with a SIMD-accelerated, partial-order alignment-based, stand-alone consensus module called Racon. Based on tests with PacBio and Oxford Nanopore data sets, we show that Racon coupled with miniasm enables consensus genomes with similar or better quality than state-of-the-art methods while being an order of magnitude faster.

Pan-cancer analysis of whole genomes
Lauri A. Aaltonen, Federico Abascal, Adam Abeshouse, Hiroyuki Aburatani +4 more
2020 · Nature · 3.3K citations · doi:10.1038/s41586-020-1969-6

Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale [1–3]. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4–5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter [4]; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation [5,6]; analyses timings and patterns of tumour evolution [7]; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity [8,9]; and evaluates a range of more-specialized features of cancer genomes [8,10–18].

Normal gut microbiota modulates brain development and behavior
Rochellys Diaz Heijtz, Shugui Wang, Farhana Anuar, Yu Qian +4 more
2011 · Proceedings of the National Academy of Sciences · 3.2K citations · doi:10.1073/pnas.1010529108

Microbial colonization of mammals is an evolution-driven process that modulates aspects of host physiology, many of which are associated with immunity and nutrient intake. Here, we report that colonization by gut microbiota impacts mammalian brain development and subsequent adult behavior. Using measures of motor activity and anxiety-like behavior, we demonstrate that germ free (GF) mice display increased motor activity and reduced anxiety, compared with specific pathogen free (SPF) mice with a normal gut microbiota. This behavioral phenotype is associated with altered expression of genes known to be involved in second messenger pathways and synaptic long-term potentiation in brain regions implicated in motor control and anxiety-like behavior. GF mice exposed to gut microbiota early in life display characteristics similar to those of SPF mice, including reduced expression of PSD-95 and synaptophysin in the striatum. Hence, our results suggest that the microbial colonization process initiates signaling mechanisms that affect neuronal circuits involved in motor control and anxiety behavior.

Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals
James J. Lee, Robbee Wedow, Aysu Okbay, Edward Kong +4 more
2018 · Nature Genetics · 2.8K citations · doi:10.1038/s41588-018-0147-3

Here we conducted a large-scale genetic association analysis of educational attainment in a sample of approximately 1.1 million individuals and identify 1,271 independent genome-wide-significant SNPs. For the SNPs taken together, we found evidence of heterogeneous effects across environments. The SNPs implicate genes involved in brain-development processes and neuron-to-neuron communication. In a separate analysis of the X chromosome, we identify 10 independent genome-wide-significant SNPs and estimate a SNP heritability of around 0.3% in both men and women, consistent with partial dosage compensation. A joint (multi-phenotype) analysis of educational attainment and three related cognitive phenotypes generates polygenic scores that explain 11-13% of the variance in educational attainment and 7-10% of the variance in cognitive performance. This prediction accuracy substantially increases the utility of polygenic scores as tools in research.

Mapping genomic loci implicates genes and synaptic biology in schizophrenia
Vassily Trubetskoy, Antonio F. Pardiñas, Ting Qi, Georgia Panagiotaropoulou +4 more
2022 · Nature · 2.7K citations · doi:10.1038/s41586-022-04434-5

Schizophrenia is highly heritable, and much of that heritability is attributable to common risk alleles. Here, in a two-stage genome-wide association study of up to 76,755 individuals with schizophrenia and 243,649 control individuals, we report common variant associations at 287 distinct genomic loci. Associations were concentrated in genes that are expressed in excitatory and inhibitory neurons of the central nervous system, but not in other tissues or cell types. Using fine-mapping and functional genomic data, we identify 120 genes (106 protein-coding) that are likely to underpin associations at some of these loci, including 16 genes with credible causal non-synonymous or untranslated region variation. We also implicate fundamental processes related to neuronal function, including synaptic organization, differentiation and transmission. Fine-mapped candidates were enriched for genes associated with rare disruptive coding variants in people with schizophrenia, including the glutamate receptor subunit GRIN2A and transcription factor SP4, and were also enriched for genes implicated by such variants in neurodevelopmental disorders. We identify biological processes relevant to schizophrenia pathophysiology; show convergence of common and rare variant associations in schizophrenia and neurodevelopmental disorders; and provide a resource of prioritized genes and variants to advance mechanistic studies.

Comprehensive Characterization of Cancer Driver Genes and Mutations
Matthew H. Bailey, Collin Tokheim, Eduard Porta‐Pardo, Sohini Sengupta +4 more
2018 · Cell · 2.5K citations · doi:10.1016/j.cell.2018.02.060

Identifying molecular cancer drivers is critical for precision oncology. Multiple advanced algorithms to identify drivers now exist, but systematic attempts to combine and optimize them on large datasets are few. We report a PanCancer and PanSoftware analysis spanning 9,423 tumor exomes (comprising all 33 of The Cancer Genome Atlas projects) and using 26 computational tools to catalog driver genes and mutations. We identify 299 driver genes with implications regarding their anatomical sites and cancer/cell types. Sequence- and structure-based analyses identified >3,400 putative missense driver mutations supported by multiple lines of evidence. Experimental validation confirmed 60%-85% of predicted mutations as likely drivers. We found that >300 MSI tumors are associated with high PD-1/PD-L1, and 57% of tumors analyzed harbor putative clinically actionable events. Our study represents the most comprehensive discovery of cancer genes and mutations to date and will serve as a blueprint for future biological and clinical endeavors.

SIFT web server: predicting effects of amino acid substitutions on proteins
Ngak-Leng Sim, P. Naresh Kumar, Jing Hu, Steven Henikoff +2 more
2012 · Nucleic Acids Research · 2.5K citations · doi:10.1093/nar/gks539

The Sorting Intolerant from Tolerant (SIFT) algorithm predicts the effect of coding variants on protein function. It was first introduced in 2001, with a corresponding website that provides users with predictions on their variants. Since its release, SIFT has become one of the standard tools for characterizing missense variation. We have updated SIFT's genome-wide prediction tool since our last publication in 2009, and added new features to the insertion/deletion (indel) tool. We also show accuracy metrics on independent data sets. The original developers have hosted the SIFT web server at FHCRC, JCVI and the web server is currently located at BII. The URL is http://sift-dna.org (24 May 2012, date last accessed).

LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets
Andreas Wilm, Pauline Aw, Denis Bertrand, Grace Hui Ting Yeo +4 more
2012 · Nucleic Acids Research · 1.5K citations · doi:10.1093/nar/gks918

The study of cell-population heterogeneity in a range of biological systems, from viruses to bacterial isolates to tumor samples, has been transformed by recent advances in sequencing throughput. While the high-coverage afforded can be used, in principle, to identify very rare variants in a population, existing ad hoc approaches frequently fail to distinguish true variants from sequencing errors. We report a method (LoFreq) that models sequencing run-specific error rates to accurately call variants occurring in <0.05% of a population. Using simulated and real datasets (viral, bacterial and human), we show that LoFreq has near-perfect specificity, with significantly improved sensitivity compared with existing methods and can efficiently analyze deep Illumina sequencing datasets without resorting to approximations or heuristics. We also present experimental validation for LoFreq on two different platforms (Fluidigm and Sequenom) and its application to call rare somatic variants from exome sequencing datasets for gastric cancer. Source code and executables for LoFreq are freely available at http://sourceforge.net/projects/lofreq/.

Ceritinib in ALK-Rearranged Non–Small-Cell Lung Cancer
Alice T. Shaw, Dong-Wan Kim, Ranee Mehra, Daniel S.W. Tan +4 more
2014 · New England Journal of Medicine · 1.5K citations · doi:10.1056/nejmoa1311107

BACKGROUND: Non-small-cell lung cancer (NSCLC) harboring the anaplastic lymphoma kinase gene (ALK) rearrangement is sensitive to the ALK inhibitor crizotinib, but resistance invariably develops. Ceritinib (LDK378) is a new ALK inhibitor that has shown greater antitumor potency than crizotinib in preclinical studies. METHODS: In this phase 1 study, we administered oral ceritinib in doses of 50 to 750 mg once daily to patients with advanced cancers harboring genetic alterations in ALK. In an expansion phase of the study, patients received the maximum tolerated dose. Patients were assessed to determine the safety, pharmacokinetic properties, and antitumor activity of ceritinib. Tumor biopsies were performed before ceritinib treatment to identify resistance mutations in ALK in a group of patients with NSCLC who had had disease progression during treatment with crizotinib. RESULTS: A total of 59 patients were enrolled in the dose-escalation phase. The maximum tolerated dose of ceritinib was 750 mg once daily; dose-limiting toxic events included diarrhea, vomiting, dehydration, elevated aminotransferase levels, and hypophosphatemia. This phase was followed by an expansion phase, in which an additional 71 patients were treated, for a total of 130 patients overall. Among 114 patients with NSCLC who received at least 400 mg of ceritinib per day, the overall response rate was 58% (95% confidence interval [CI], 48 to 67). Among 80 patients who had received crizotinib previously, the response rate was 56% (95% CI, 45 to 67). Responses were observed in patients with various resistance mutations in ALK and in patients without detectable mutations. Among patients with NSCLC who received at least 400 mg of ceritinib per day, the median progression-free survival was 7.0 months (95% CI, 5.6 to 9.5). 
CONCLUSIONS: Ceritinib was highly active in patients with advanced, ALK-rearranged NSCLC, including those who had had disease progression during crizotinib treatment, regardless of the presence of resistance mutations in ALK. (Funded by Novartis Pharmaceuticals and others; ClinicalTrials.gov number, NCT01283516.)

An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival
Lance D. Miller, Johanna Smeds, Joshy George, Vinsensius B. Vega +4 more
2005 · Proceedings of the National Academy of Sciences · 1.3K citations · doi:10.1073/pnas.0506230102

Perturbations of the p53 pathway are associated with more aggressive and therapeutically refractory tumors. However, molecular assessments of p53 status by sequence analysis and immunohistochemistry are incomplete assessors of p53 functional effects. We posited that the transcriptional fingerprint is a more definitive downstream indicator of p53 function. Herein, we analyzed transcript profiles of 251 p53-sequenced primary breast tumors and identified a clinically embedded 32-gene expression signature that distinguishes p53-mutant and wild-type tumors of different histologies and outperforms sequence-based assessments of p53 in predicting prognosis and therapeutic response. Moreover, the p53 signature identified a subset of aggressive tumors absent of sequence mutations in p53 yet exhibiting expression characteristics consistent with p53 deficiency because of attenuated p53 transcript levels. Our results show the primary importance of p53 functional status in predicting clinical breast cancer behavior.

Novel genetic associations for blood pressure identified via gene-alcohol interaction in up to 570K individuals across multiple ancestries
Mary F. Feitosa, Aldi T. Kraja, Daniel I. Chasman, Yun J. Sung +4 more
2018 · PLoS ONE · 1.2K citations · doi:10.1371/journal.pone.0198166

Heavy alcohol consumption is an established risk factor for hypertension; the mechanism by which alcohol consumption impacts blood pressure (BP) regulation remains unknown. We hypothesized that a genome-wide association study accounting for gene-alcohol consumption interaction for BP might identify additional BP loci and contribute to the understanding of alcohol-related BP regulation. We conducted a large two-stage investigation incorporating joint testing of main genetic effects and single nucleotide variant (SNV)-alcohol consumption interactions. In Stage 1, genome-wide discovery meta-analyses in ≈131K individuals across several ancestry groups yielded 3,514 SNVs (245 loci) with suggestive evidence of association (P < 1.0 × 10⁻⁵). In Stage 2, these SNVs were tested for independent external replication in ≈440K individuals across multiple ancestries. We identified and replicated (at Bonferroni correction threshold) five novel BP loci (380 SNVs in 21 genes) and 49 previously reported BP loci (2,159 SNVs in 109 genes) in European ancestry, and in multi-ancestry meta-analyses (P < 5.0 × 10⁻⁸). For African ancestry samples, we detected 18 potentially novel BP loci (P < 5.0 × 10⁻⁸) in Stage 1 that warrant further replication. Additionally, correlated meta-analysis identified eight novel BP loci (11 genes). Several genes in these loci (e.g., PINX1, GATA4, BLK, FTO and GABBR2) have been previously reported to be associated with alcohol consumption. These findings provide insights into the role of alcohol consumption in the genetic architecture of hypertension.

Transcriptional Regulation of Nanog by OCT4 and SOX2
David J. Rodda, Joon-Lin Chew, Leng-Hiong Lim, Yuin‐Han Loh +3 more
2005 · Journal of Biological Chemistry · 1.2K citations · doi:10.1074/jbc.m502573200

Nanog, Sox2, and Oct4 are transcription factors all essential to maintaining the pluripotent embryonic stem cell phenotype. Through a cooperative interaction, Sox2 and Oct4 have previously been described to drive pluripotent-specific expression of a number of genes. We now extend the list of Sox2-Oct4 target genes to include Nanog. Within the Nanog proximal promoter, we identify a composite sox-oct cis-regulatory element essential for Nanog pluripotent transcription. This element is conserved over 250 million years of cumulative evolution within the eutherian mammals. A Nanog proximal promoter-EGFP (enhanced green fluorescent protein) reporter transgene recapitulates endogenous Nanog mRNA expression in embryonic stem cells and their differentiated derivatives. Sox2 and Oct4 interaction with the Nanog promoter was confirmed through mutagenesis and in vitro binding assays. Electrophoretic mobility shift assays indicate that the Sox2-Oct4 heterodimer forms more efficiently on the composite element within Nanog than the similar element within Fgf4. Using chromatin immunoprecipitation, we show that Oct4 and Sox2 bind to the Nanog promoter in living mouse and human embryonic stem cells. Furthermore, by specific knockdown of Oct4 and Sox2 mRNA by RNA interference in embryonic stem cells, we provide genetic evidence for a link between Oct4, Sox2, and the Nanog promoter. These studies extend the understanding of the pluripotent genetic regulatory network, within which the Sox2-Oct4 complex is at the top of the regulatory hierarchy.

GISAID’s Role in Pandemic Response
Shruti Khare, Céline Gurry, Lucas Freitas +4 more
2021 · China CDC Weekly · 1.1K citations · doi:10.46234/ccdcw2021.255

GISAID is a global data science initiative and the primary source of genomic and associated metadata of all influenza viruses, Respiratory Syncytial Virus (RSV) and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the pandemic coronavirus causing coronavirus disease 2019 (COVID-19). GISAID's publicly accessible data sharing platform enables collaboration of over 42,000 participating researchers from 198 nations and data generators from over 3,500 institutions across the globe. Since the first whole-genome sequences were made available by China CDC through GISAID on January 10, 2020, over 5 million genetic sequences of SARS-CoV-2 from 194 countries and territories have been made publicly available through GISAID's EpiCoV database as of November 9, 2021. This high-quality, curated data enabled the rapid development of diagnostic and prophylactic measures against SARS-CoV-2, including the first diagnostic tests and the first vaccines to combat COVID-19, as well as continuous monitoring of emerging variants in near real-time.

The evolutionary history of 2,658 cancers
Moritz Gerstung, Clemency Jolly, Ignaty Leshchiner, Stefan C. Dentro +4 more
2020 · Nature · 1.1K citations · doi:10.1038/s41586-019-1907-7

Cancer develops through a process of somatic evolution [1,2]. Sequencing data from a single biopsy represent a snapshot of this process that can reveal the timing of specific genomic aberrations and the changing influence of mutational processes [3]. Here, by whole-genome sequencing analysis of 2,658 cancers as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) [4], we reconstruct the life history and evolution of mutational processes and driver mutation sequences of 38 types of cancer. Early oncogenesis is characterized by mutations in a constrained set of driver genes, and specific copy number gains, such as trisomy 7 in glioblastoma and isochromosome 17q in medulloblastoma. The mutational spectrum changes significantly throughout tumour evolution in 40% of samples. A nearly fourfold diversification of driver genes and increased genomic instability are features of later stages. Copy number alterations often occur in mitotic crises, and lead to simultaneous gains of chromosomal segments. Timing analyses suggest that driver mutations often precede diagnosis by many years, if not decades. Together, these results determine the evolutionary trajectories of cancer, and highlight opportunities for early cancer detection.

Genomic Loss of microRNA-101 Leads to Overexpression of Histone Methyltransferase EZH2 in Cancer
Sooryanarayana Varambally, Qi Cao, Ram S. Mani, Sunita Shankar +4 more
2008 · Science · 1.1K citations · doi:10.1126/science.1165395

Enhancer of zeste homolog 2 (EZH2) is a mammalian histone methyltransferase that contributes to the epigenetic silencing of target genes and regulates the survival and metastasis of cancer cells. EZH2 is overexpressed in aggressive solid tumors by mechanisms that remain unclear. Here we show that the expression and function of EZH2 in cancer cell lines are inhibited by microRNA-101 (miR-101). Analysis of human prostate tumors revealed that miR-101 expression decreases during cancer progression, paralleling an increase in EZH2 expression. One or both of the two genomic loci encoding miR-101 were somatically lost in 37.5% of clinically localized prostate cancer cells (6 of 16) and 66.7% of metastatic disease cells (22 of 33). We propose that the genomic loss of miR-101 in cancer leads to overexpression of EZH2 and concomitant dysregulation of epigenetic pathways, resulting in cancer progression.

The draft genome of sweet orange (Citrus sinensis)
Qiang Xu, Ling-Ling Chen, Xiaoan Ruan, Dijun Chen +4 more
2012 · Nature Genetics · 1.0K citations · doi:10.1038/ng.2472

Oranges are an important nutritional source for human health and have immense economic value. Here we present a comprehensive analysis of the draft genome of sweet orange (Citrus sinensis). The assembled sequence covers 87.3% of the estimated orange genome, which is relatively compact, as 20% is composed of repetitive elements. We predicted 29,445 protein-coding genes, half of which are in the heterozygous state. With additional sequencing of two more citrus species and comparative analyses of seven citrus genomes, we present evidence to suggest that sweet orange originated from a backcross hybrid between pummelo and mandarin. Focused analysis on genes involved in vitamin C metabolism showed that GalUR, encoding the rate-limiting enzyme of the galacturonate pathway, is significantly upregulated in orange fruit, and the recent expansion of this gene family may provide a genomic basis. This draft genome represents a valuable resource for understanding and improving many important citrus traits in the future.