
Johns Hopkins Medicine
Hospital / health system
Baltimore, Maryland, United States
Research output, citation impact, and the most-cited recent papers from Johns Hopkins Medicine (United States). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Johns Hopkins Medicine
Human mesenchymal stem cells are thought to be multipotent cells, which are present in adult marrow, that can replicate as undifferentiated cells and that have the potential to differentiate to lineages of mesenchymal tissues, including bone, cartilage, fat, tendon, muscle, and marrow stroma. Cells that have the characteristics of human mesenchymal stem cells were isolated from marrow aspirates of volunteer donors. These cells displayed a stable phenotype and remained as a monolayer in vitro. These adult stem cells could be induced to differentiate exclusively into the adipocytic, chondrocytic, or osteocytic lineages. Individual stem cells were identified that, when expanded to colonies, retained their multilineage potential.
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies. Results for the final phase of the 1000 Genomes Project are presented including whole-genome sequencing, targeted exome sequencing, and genotyping on high-density SNP arrays for 2,504 individuals across 26 populations, providing a global reference data set to support biomedical genetics. The 1000 Genomes Project has sought to comprehensively catalogue human genetic variation across populations, providing a valuable public genomic resource. The data obtained so far have found applications ranging from association studies and fine mapping studies to the filtering of likely neutral variants in rare-disease cohorts. The authors now report on the final phase of the project, phase 3, which covers previously uncharacterized areas of human genetic diversity in terms of the populations sampled and categories of characterized variation. The sample now includes more than 2,500 individuals from 26 global populations, with low coverage whole-genome and deep exome sequencing, as well as dense microarray genotyping. 
They find that while most common variants are shared across populations, rarer variants are often restricted to closely related populations. The authors also demonstrate the use of the phase 3 dataset as a reference panel for imputation to improve the resolution in genetic association studies.
The National Institute on Aging and the Alzheimer's Association charged a workgroup with the task of revising the 1984 criteria for Alzheimer's disease (AD) dementia. The workgroup sought to ensure that the revised criteria would be flexible enough to be used by both general healthcare providers without access to neuropsychological testing, advanced imaging, and cerebrospinal fluid measures, and specialized investigators involved in research or in clinical trial studies who would have these tools available. We present criteria for all-cause dementia and for AD dementia. We retained the general framework of probable AD dementia from the 1984 criteria. On the basis of the past 27 years of experience, we made several changes in the clinical criteria for the diagnosis. We also retained the term possible AD dementia, but redefined it in a manner more focused than before. Biomarker evidence was also integrated into the diagnostic formulations for probable and possible AD dementia for use in research settings. The core clinical criteria for AD dementia will continue to be the cornerstone of the diagnosis in clinical practice, but biomarker evidence is expected to enhance the pathophysiological specificity of the diagnosis of AD dementia. Much work lies ahead for validating the biomarker diagnosis of AD dementia.
MOTIVATION: Next-generation sequencing technologies generate very large numbers of short reads. Even with very deep genome coverage, short read lengths cause problems in de novo assemblies. The use of paired-end libraries with a fragment size shorter than twice the read length provides an opportunity to generate much longer reads by overlapping and merging read pairs before assembling a genome. RESULTS: We present FLASH, a fast computational tool to extend the length of short reads by overlapping paired-end reads from fragment libraries that are sufficiently short. We tested the correctness of the tool on one million simulated read pairs, and we then applied it as a pre-processor for genome assemblies of Illumina reads from the bacterium Staphylococcus aureus and human chromosome 14. FLASH correctly extended and merged reads >99% of the time on simulated reads with an error rate of <1%. With adequately set parameters, FLASH correctly merged reads over 90% of the time even when the reads contained up to 5% errors. When FLASH was used to extend reads prior to assembly, the resulting assemblies had substantially greater N50 lengths for both contigs and scaffolds. AVAILABILITY AND IMPLEMENTATION: The FLASH system is implemented in C and is freely available as open-source code at http://www.cbcb.umd.edu/software/flash. CONTACT: t.magoc@gmail.com.
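The overlap-and-merge idea described in the FLASH abstract can be illustrated with a minimal Python sketch. This is not the FLASH implementation (which is written in C): the function names, the `min_overlap` and `max_mismatch_ratio` parameters and their default values are assumptions chosen for illustration, and base quality scores are ignored.

```python
from typing import Optional

# Complement table for reverse-complementing the second read of a pair.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(read1: str, read2: str, min_overlap: int = 10,
               max_mismatch_ratio: float = 0.25) -> Optional[str]:
    """Merge a read pair whose fragment is shorter than twice the read length.

    Tries every overlap length between the 3' end of read1 and the start of
    the reverse-complemented read2, and keeps the overlap with the lowest
    mismatch ratio (ties go to the longer overlap). Returns None when no
    overlap satisfies the (hypothetical) thresholds.
    """
    mate = reverse_complement(read2)
    best = None  # (mismatch_ratio, overlap_length)
    for overlap in range(min_overlap, min(len(read1), len(mate)) + 1):
        tail, head = read1[-overlap:], mate[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        ratio = mismatches / overlap
        if ratio > max_mismatch_ratio:
            continue
        if best is None or ratio < best[0] or (ratio == best[0] and overlap > best[1]):
            best = (ratio, overlap)
    if best is None:
        return None
    # Without quality scores, read1's bases are kept inside the overlap.
    return read1 + mate[best[1]:]


fragment = "ACGTTGCAAGGCTTACCGATAGGCTCATGA"   # 30 bp fragment
r1 = fragment[:20]                            # forward read, 20 bp
r2 = reverse_complement(fragment[10:])        # reverse read, 20 bp
merged = merge_pair(r1, r2)                   # recovers the full fragment
```

Here two 20 bp reads from a 30 bp fragment overlap by 10 bp, so the merged sequence is longer than either read, which is the effect the abstract reports as benefiting downstream assembly.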
TopHat is a popular spliced aligner for RNA-sequence (RNA-seq) experiments. In this paper, we describe TopHat2, which incorporates many significant enhancements to TopHat. TopHat2 can align reads of various lengths produced by the latest sequencing technologies, while allowing for variable-length indels with respect to the reference genome. In addition to de novo spliced alignment, TopHat2 can align reads across fusion breaks, which can occur after genomic translocations. TopHat2 combines the ability to identify novel splice sites with direct mapping to known transcripts, producing sensitive and accurate alignments, even for highly repetitive genomes or in the presence of pseudogenes. TopHat2 is available at http://ccb.jhu.edu/software/tophat.
Studies of the human microbiome have revealed that even healthy individuals differ remarkably in the microbes that occupy habitats such as the gut, skin and vagina. Much of this diversity remains unexplained, although diet, environment, host genetics and early microbial exposure have all been implicated. Accordingly, to characterize the ecology of human-associated microbial communities, the Human Microbiome Project has analysed the largest cohort and set of distinct, clinically relevant body habitats so far. We found the diversity and abundance of each habitat’s signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals. The project encountered an estimated 81–99% of the genera, enzyme families and community configurations occupied by the healthy Western microbiome. Metagenomic carriage of metabolic pathways was stable among individuals despite variation in community structure, and ethnic/racial background proved to be one of the strongest associations of both pathways and microbes with clinical metadata. These results thus delineate the range of structural and functional configurations normal in the microbial communities of a healthy population, enabling future characterization of the epidemiology, ecology and translational applications of the human microbiome. The Human Microbiome Project Consortium reports the first results of their analysis of microbial communities from distinct, clinically relevant body habitats in a human cohort; the insights into the microbial communities of a healthy population lay foundations for future exploration of the epidemiology, ecology and translational applications of the human microbiome. The Human Microbiome Project (HMP), supported by the National Institutes of Health Common Fund, has the goal of characterizing the microbial communities that inhabit and interact with the human body in sickness and in health. 
In two Articles in this issue of Nature, the HMP Consortium presents the first population-scale details of the organismal and functional composition of the microbiota across five areas of the body. An associated News & Views discusses the initial results — which, along with those of a series of co-publications, already constitute the most extensive catalogue of organisms and genes related to the human microbiome yet published — and highlights some of the major questions that the project will tackle in the next few years.
The last decade has seen a sharp increase in the number of scientific publications describing physiological and pathological functions of extracellular vesicles (EVs), a collective term covering various subtypes of cell-released, membranous structures, called exosomes, microvesicles, microparticles, ectosomes, oncosomes, apoptotic bodies, and many other names. However, specific issues arise when working with these entities, whose size and amount often make them difficult to obtain as relatively pure preparations, and to characterize properly. The International Society for Extracellular Vesicles (ISEV) proposed Minimal Information for Studies of Extracellular Vesicles ("MISEV") guidelines for the field in 2014. We now update these "MISEV2014" guidelines based on evolution of the collective knowledge in the last four years. An important point to consider is that ascribing a specific function to EVs in general, or to subtypes of EVs, requires reporting of specific information beyond mere description of function in a crude, potentially contaminated, and heterogeneous preparation. For example, claims that exosomes are endowed with exquisite and specific activities remain difficult to support experimentally, given our still limited knowledge of their specific molecular machineries of biogenesis and release, as compared with other biophysically similar EVs. The MISEV2018 guidelines include tables and outlines of suggested protocols and steps to follow to document specific EV-associated functional activities. Finally, a checklist is provided with summaries of key points.
The National Institute on Aging and the Alzheimer's Association charged a workgroup with the task of developing criteria for the symptomatic predementia phase of Alzheimer's disease (AD), referred to in this article as mild cognitive impairment due to AD. The workgroup developed the following two sets of criteria: (1) core clinical criteria that could be used by healthcare providers without access to advanced imaging techniques or cerebrospinal fluid analysis, and (2) research criteria that could be used in clinical research settings, including clinical trials. The second set of criteria incorporate the use of biomarkers based on imaging and cerebrospinal fluid measures. The final set of criteria for mild cognitive impairment due to AD has four levels of certainty, depending on the presence and nature of the biomarker findings. Considerable work is needed to validate the criteria that use biomarkers and to standardize biomarker analysis for use in community settings.
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
BACKGROUND: Somatic mutations have the potential to encode "non-self" immunogenic antigens. We hypothesized that tumors with a large number of somatic mutations due to mismatch-repair defects may be susceptible to immune checkpoint blockade. METHODS: We conducted a phase 2 study to evaluate the clinical activity of pembrolizumab, an anti-programmed death 1 immune checkpoint inhibitor, in 41 patients with progressive metastatic carcinoma with or without mismatch-repair deficiency. Pembrolizumab was administered intravenously at a dose of 10 mg per kilogram of body weight every 14 days in patients with mismatch repair-deficient colorectal cancers, patients with mismatch repair-proficient colorectal cancers, and patients with mismatch repair-deficient cancers that were not colorectal. The coprimary end points were the immune-related objective response rate and the 20-week immune-related progression-free survival rate. RESULTS: The immune-related objective response rate and immune-related progression-free survival rate were 40% (4 of 10 patients) and 78% (7 of 9 patients), respectively, for mismatch repair-deficient colorectal cancers and 0% (0 of 18 patients) and 11% (2 of 18 patients) for mismatch repair-proficient colorectal cancers. The median progression-free survival and overall survival were not reached in the cohort with mismatch repair-deficient colorectal cancer but were 2.2 and 5.0 months, respectively, in the cohort with mismatch repair-proficient colorectal cancer (hazard ratio for disease progression or death, 0.10 [P<0.001], and hazard ratio for death, 0.22 [P=0.05]). Patients with mismatch repair-deficient noncolorectal cancer had responses similar to those of patients with mismatch repair-deficient colorectal cancer (immune-related objective response rate, 71% [5 of 7 patients]; immune-related progression-free survival rate, 67% [4 of 6 patients]). 
Whole-exome sequencing revealed a mean of 1782 somatic mutations per tumor in mismatch repair-deficient tumors, as compared with 73 in mismatch repair-proficient tumors (P=0.007), and high somatic mutation loads were associated with prolonged progression-free survival (P=0.02). CONCLUSIONS: This study showed that mismatch-repair status predicted clinical benefit of immune checkpoint blockade with pembrolizumab. (Funded by Johns Hopkins University and others; ClinicalTrials.gov number, NCT01876511.)
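The percentages in the RESULTS of this abstract follow directly from the patient counts it reports. The short Python sketch below recomputes them as a consistency check; the dictionary layout and the names `cohorts` and `rate_percent` are illustrative assumptions, and only counts stated in the abstract are used.

```python
# (events, evaluable patients) exactly as reported in the abstract.
cohorts = {
    "MMR-deficient colorectal": {"response": (4, 10), "pfs_20wk": (7, 9)},
    "MMR-proficient colorectal": {"response": (0, 18), "pfs_20wk": (2, 18)},
    "MMR-deficient noncolorectal": {"response": (5, 7), "pfs_20wk": (4, 6)},
}


def rate_percent(events: int, total: int) -> int:
    """Return events/total as a percentage rounded to the nearest integer."""
    return round(100 * events / total)


# Recompute each immune-related objective response rate and 20-week
# immune-related progression-free survival rate from the raw counts.
rates = {
    name: {endpoint: rate_percent(*counts) for endpoint, counts in data.items()}
    for name, data in cohorts.items()
}

for name, values in rates.items():
    print(f"{name}: response {values['response']}%, 20-week PFS {values['pfs_20wk']}%")
```

The recomputed values match the reported 40% and 78%, 0% and 11%, and 71% and 67%.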
A series of yeast shuttle vectors and host strains has been created to allow more efficient manipulation of DNA in Saccharomyces cerevisiae. Transplacement vectors were constructed and used to derive yeast strains containing nonreverting his3, trp1, leu2 and ura3 mutations. A set of YCp and YIp vectors (pRS series) was then made based on the backbone of the multipurpose plasmid pBLUESCRIPT. These pRS vectors are all uniform in structure and differ only in the yeast selectable marker gene used (HIS3, TRP1, LEU2 and URA3). They possess all of the attributes of pBLUESCRIPT and several yeast-specific features as well. Using a pRS vector, one can perform most standard DNA manipulations in the same plasmid that is introduced into yeast.
Information about the basal ganglia has accumulated at a prodigious pace over the past decade, necessitating major revisions in our concepts of the structural and functional organization of these nuclei. From earlier data it had appeared that the basal ganglia served primarily to integrate diverse inputs from the entire cerebral cortex and to funnel these influences, via the ventrolateral thalamus, to the motor cortex (Allen & Tsukahara 1974, Evarts & Thach 1969, Kemp & Powell 1971). In particular, the basal
By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations. This report from the 1000 Genomes Project describes the genomes of 1,092 individuals from 14 human populations, providing a resource for common and low-frequency variant analysis in individuals from diverse populations; hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites, can be found in each individual. 
This report by the 1000 Genomes Project describes the genomes of 1,092 individuals from 14 human populations, providing a resource for common and low-frequency variant analysis in individuals from diverse populations. Integrative analyses reveal profiles of rare and common variants in different populations. The frequencies of rare variants vary across biological pathways, and hundreds of rare, non-coding variants at conserved sites — such as changes disrupting transcription-factor motifs — can be established for each individual.
A catalogue of molecular aberrations that cause ovarian cancer is critical for developing and deploying therapies that will improve patients’ lives. The Cancer Genome Atlas project has analysed messenger RNA expression, microRNA expression, promoter methylation and DNA copy number in 489 high-grade serous ovarian adenocarcinomas and the DNA sequences of exons from coding genes in 316 of these tumours. Here we report that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three microRNA subtypes, four promoter methylation subtypes and a transcriptional signature associated with survival duration, and shed new light on the impact that tumours with BRCA1/2 (BRCA1 or BRCA2) and CCNE1 aberrations have on survival. Pathway analyses suggested that homologous recombination is defective in about half of the tumours analysed, and that NOTCH and FOXM1 signalling are involved in serous ovarian cancer pathophysiology. The Cancer Genome Atlas (TCGA) project reports here its analysis of messenger RNA and microRNA expression, promoter methylation, DNA copy number and exome sequences in 489 high-grade serous ovarian adenocarcinomas. The analyses help establish new tumour subtypes. Among other insights is the finding that while the gene encoding p53 tumour suppressor is mutated in almost all tumours, nine other loci including NF1, BRCA1, BRCA2, RB1 and CDK12 carry recurrent albeit low-prevalence mutations. Homologous recombination is defective in about half of the tumours studied, and Notch and FOXM1 signalling are involved in the pathophysiology.
Mutations in the evolutionarily conserved codons of the p53 tumor suppressor gene are common in diverse types of human cancer. The p53 mutational spectrum differs among cancers of the colon, lung, esophagus, breast, liver, brain, reticuloendothelial tissues, and hemopoietic tissues. Analysis of these mutations can provide clues to the etiology of these diverse tumors and to the function of specific regions of p53. Transitions predominate in colon, brain, and lymphoid malignancies, whereas G:C to T:A transversions are the most frequent substitutions observed in cancers of the lung and liver. Mutations at A:T base pairs are seen more frequently in esophageal carcinomas than in other solid tumors. Most transitions in colorectal carcinomas, brain tumors, leukemias, and lymphomas are at CpG dinucleotide mutational hot spots. G to T transversions in lung, breast, and esophageal carcinomas are dispersed among numerous codons. In liver tumors in persons from geographic areas in which both aflatoxin B1 and hepatitis B virus are cancer risk factors, most mutations are at one nucleotide pair of codon 249. These differences may reflect the etiological contributions of both exogenous and endogenous factors to human carcinogenesis.
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother–father–child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research. This issue of Nature contains the first publication from The 1000 Genomes Project, an international collaboration that will produce an extensive public catalogue of human genetic variation. 
The plan, in fact, is to sequence about 2,000 unidentified individuals from 20 populations around the world. This first paper presents the results from the project's pilot phase, testing three different strategies for genome-wide sequencing with high-throughput platforms: low-coverage whole-genome sequencing of 179 individuals in three population groups, high-coverage sequencing of two mother–father–child trios, and exon-targeted sequencing of 697 individuals from seven populations. The goal of the 1000 Genomes Project is to provide in-depth information on variation in human genome sequences. In the pilot phase reported here, different strategies for genome-wide sequencing, using high-throughput sequencing platforms, were developed and compared. The resulting data set includes more than 95% of the currently accessible variants found in any individual, and can be used to inform association and functional studies.
BACKGROUND: Childhood obesity increases the risk of obesity in adulthood, but how parental obesity affects the chances of a child's becoming an obese adult is unknown. We investigated the risk of obesity in young adulthood associated with both obesity in childhood and obesity in one or both parents. METHODS: Height and weight measurements were abstracted from the records of 854 subjects born at a health maintenance organization in Washington State between 1965 and 1971. Their parents' medical records were also reviewed. Childhood obesity was defined as a body-mass index at or above the 85th percentile for age and sex, and obesity in adulthood as a mean body-mass index at or above 27.8 for men and 27.3 for women. RESULTS: In young adulthood (defined as 21 to 29 years of age), 135 subjects (16 percent) were obese. Among those who were obese during childhood, the chance of obesity in adulthood ranged from 8 percent for 1- or 2-year-olds without obese parents to 79 percent for 10-to-14-year-olds with at least one obese parent. After adjustment for parental obesity, the odds ratios for obesity in adulthood associated with childhood obesity ranged from 1.3 (95 percent confidence interval, 0.6 to 3.0) for obesity at 1 or 2 years of age to 17.5 (7.7 to 39.5) for obesity at 15 to 17 years of age. After adjustment for the child's obesity status, the odds ratios for obesity in adulthood associated with having one obese parent ranged from 2.2 (95 percent confidence interval, 1.1 to 4.3) at 15 to 17 years of age to 3.2 (1.8 to 5.7) at 1 or 2 years of age. CONCLUSIONS: Obese children under three years of age without obese parents are at low risk for obesity in adulthood, but among older children, obesity is an increasingly important predictor of adult obesity, regardless of whether the parents are obese. Parental obesity more than doubles the risk of adult obesity among both obese and nonobese children under 10 years of age.
Although Kraken's k-mer-based approach provides a fast taxonomic classification of metagenomic sequence data, its large memory requirements can be limiting for some applications. Kraken 2 improves upon Kraken 1 by reducing memory usage by 85%, allowing greater amounts of reference genomic data to be used, while maintaining high accuracy and increasing speed fivefold. Kraken 2 also introduces a translated search mode, providing increased sensitivity in viral metagenomics analysis.
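As a rough illustration of what k-mer-based taxonomic classification means, the Python sketch below builds a k-mer lookup table from reference sequences and classifies a read by majority vote. It is a deliberately simplified toy, not Kraken's or Kraken 2's actual data structure or scoring: the function names, the tiny reference sequences, and the rule of discarding k-mers shared between taxa are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Optional


def kmers(seq: str, k: int) -> List[str]:
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references: Dict[str, str], k: int) -> Dict[str, Optional[str]]:
    """Map each k-mer to the single taxon that contains it.

    K-mers found in more than one taxon are marked None (ambiguous) in this
    toy version and contribute no vote during classification.
    """
    index: Dict[str, Optional[str]] = {}
    for taxon, genome in references.items():
        for kmer in set(kmers(genome, k)):
            if kmer in index and index[kmer] != taxon:
                index[kmer] = None
            else:
                index[kmer] = taxon
    return index


def classify(read: str, index: Dict[str, Optional[str]], k: int) -> Optional[str]:
    """Assign the read to the taxon hit by the most of its k-mers."""
    votes: Counter = Counter()
    for kmer in kmers(read, k):
        taxon = index.get(kmer)
        if taxon is not None:
            votes[taxon] += 1
    if not votes:
        return None  # unclassified: no informative k-mer matched
    return votes.most_common(1)[0][0]


# Hypothetical miniature "reference genomes" for two made-up taxa.
references = {"taxon_A": "ACGTACGTTGCA", "taxon_B": "TTGGCCAATTGG"}
index = build_index(references, k=4)
label = classify("CGTACGTT", index, k=4)
```

In this sketch the lookup table grows with the number of distinct reference k-mers, which gives a feel for why memory use is the limiting factor the abstract mentions.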
BACKGROUND: Antiretroviral therapy that reduces viral replication could limit the transmission of human immunodeficiency virus type 1 (HIV-1) in serodiscordant couples. METHODS: In nine countries, we enrolled 1763 couples in which one partner was HIV-1-positive and the other was HIV-1-negative; 54% of the subjects were from Africa, and 50% of infected partners were men. HIV-1-infected subjects with CD4 counts between 350 and 550 cells per cubic millimeter were randomly assigned in a 1:1 ratio to receive antiretroviral therapy either immediately (early therapy) or after a decline in the CD4 count or the onset of HIV-1-related symptoms (delayed therapy). The primary prevention end point was linked HIV-1 transmission in HIV-1-negative partners. The primary clinical end point was the earliest occurrence of pulmonary tuberculosis, severe bacterial infection, a World Health Organization stage 4 event, or death. RESULTS: As of February 21, 2011, a total of 39 HIV-1 transmissions were observed (incidence rate, 1.2 per 100 person-years; 95% confidence interval [CI], 0.9 to 1.7); of these, 28 were virologically linked to the infected partner (incidence rate, 0.9 per 100 person-years, 95% CI, 0.6 to 1.3). Of the 28 linked transmissions, only 1 occurred in the early-therapy group (hazard ratio, 0.04; 95% CI, 0.01 to 0.27; P<0.001). Subjects receiving early therapy had fewer treatment end points (hazard ratio, 0.59; 95% CI, 0.40 to 0.88; P=0.01). CONCLUSIONS: The early initiation of antiretroviral therapy reduced rates of sexual transmission of HIV-1 and clinical events, indicating both personal and public health benefits from such therapy. (Funded by the National Institute of Allergy and Infectious Diseases and others; HPTN 052 ClinicalTrials.gov number, NCT00074581.)
Because most colorectal carcinomas appear to arise from adenomas, studies of different stages of colorectal neoplasia may shed light on the genetic alterations involved in tumor progression. We looked for four genetic alterations (ras-gene mutations and allelic deletions of chromosomes 5, 17, and 18) in 172 colorectal-tumor specimens representing various stages of neoplastic development. The specimens consisted of 40 predominantly early-stage adenomas from 7 patients with familial adenomatous polyposis, 40 adenomas (19 without associated foci of carcinoma and 21 with such foci) from 33 patients without familial polyposis, and 92 carcinomas resected from 89 patients. We found that ras-gene mutations occurred in 58 percent of adenomas larger than 1 cm and in 47 percent of carcinomas. However, ras mutations were found in only 9 percent of adenomas under 1 cm in size. Sequences on chromosome 5 that are linked to the gene for familial adenomatous polyposis were not lost in adenomas from the patients with polyposis but were lost in 29 to 35 percent of adenomas and carcinomas, respectively, from other patients. A specific region of chromosome 18 was deleted frequently in carcinomas (73 percent) and in advanced adenomas (47 percent) but only occasionally in earlier-stage adenomas (11 to 13 percent). Chromosome 17p sequences were usually lost only in carcinomas (75 percent). The four molecular alterations accumulated in a fashion that paralleled the clinical progression of tumors. These results are consistent with a model of colorectal tumorigenesis in which the steps required for the development of cancer often involve the mutational activation of an oncogene coupled with the loss of several genes that normally suppress tumorigenesis.