University of California San Diego

University · San Diego, California, United States

Research output, citation impact, and the most-cited recent papers from University of California San Diego (United States). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works: 350.5K
Citations: 49.3M
h-index: 1723
i10-index: 453.2K
Also known as: UC San Diego; University of California San Diego; University of California, San Diego

Top-cited papers from University of California San Diego

Self-Consistent Equations Including Exchange and Correlation Effects
W. Kohn, L. J. Sham
1965 · Physical Review · 62.9K citations · doi:10.1103/physrev.140.a1133

From a theory of Hohenberg and Kohn, approximation methods for treating an inhomogeneous system of interacting electrons are developed. These methods are exact for systems of slowly varying or high density. For the ground state, they lead to self-consistent equations analogous to the Hartree and Hartree-Fock equations, respectively. In these equations the exchange and correlation portions of the chemical potential of a uniform electron gas appear as additional effective potentials. (The exchange portion of our effective potential differs from that due to Slater by a factor of $\frac{2}{3}$.) Electronic systems at finite temperatures and in magnetic fields are also treated by similar methods. An appendix deals with a further correction for systems with short-wavelength density oscillations.
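For orientation, the self-consistent equations this abstract refers to are commonly written in the form below. This is a sketch in atomic units; the symbols $v$, $\varphi$, $\mu_{xc}$, $\psi_i$, and $\epsilon_i$ are notation assumed here and do not appear on this page.

```latex
\begin{aligned}
\left\{ -\tfrac{1}{2}\nabla^2 + \varphi(\mathbf{r}) + \mu_{xc}\bigl(n(\mathbf{r})\bigr) \right\} \psi_i(\mathbf{r})
  &= \epsilon_i \, \psi_i(\mathbf{r}), \\
\varphi(\mathbf{r}) &= v(\mathbf{r}) + \int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \, d\mathbf{r}', \\
n(\mathbf{r}) &= \sum_{i=1}^{N} |\psi_i(\mathbf{r})|^2 .
\end{aligned}
```

The equations are self-consistent because the density $n$ that enters the effective potential is itself built from the orbitals $\psi_i$ that the first equation produces, so in practice the system is iterated until $n$ stops changing.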

Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks
Paul Shannon, Andrew Markiel, Owen Ozier, Nitin S. Baliga +4 more
2003 · Genome Research · 53.9K citations · doi:10.1101/gr.1239303

Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

Co-Integration and Error Correction: Representation, Estimation, and Testing
Robert F. Engle, C. W. J. Granger
1987 · Econometrica · 31.9K citations · doi:10.2307/1913236

The relationship between co-integration and error correction models, first suggested in Granger (1981), is here extended and used to develop estimation procedures, tests, and empirical examples. If each element of a vector of time series x first achieves stationarity after differencing, but a linear combination a'x is already stationary, the time series x are said to be co-integrated with co-integrating vector a. There may be several such co-integrating vectors so that a becomes a matrix. Interpreting a'x = 0 as a long run equilibrium, co-integration implies that deviations from equilibrium are stationary, with finite variance, even though the series themselves are nonstationary and have infinite variance. The paper presents a representation theorem based on Granger (1983), which connects the moving average, autoregressive, and error correction representations for co-integrated systems. A vector autoregression in differenced variables is incompatible with these representations. Estimation of these models is discussed and a simple but asymptotically efficient two-step estimator is proposed. Testing for co-integration combines the problems of unit root tests and tests with parameters unidentified under the null. Seven statistics are formulated and analyzed. The critical values of these statistics are calculated based on a Monte Carlo simulation. Using these critical values, the power properties of the tests are examined and one test procedure is recommended for application. In a series of examples it is found that consumption and income are co-integrated, wages and prices are not, short and long interest rates are, and nominal GNP is co-integrated with M2, but not M1, M3, or aggregate liquid assets.
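The two-step idea mentioned in this abstract can be illustrated with a minimal sketch. It follows a common textbook reading (a levels regression, then an error correction regression on the lagged residual) and is not the paper's exact specification: there are no extra lags and no co-integration test, and the simulated data, the function names, and the true slope of 2.0 are all assumptions made for illustration.

```python
import random

def ols_simple(y, x):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def ols_two_regressors(y, u, v):
    """OLS of y on u and v (no intercept) via the 2x2 normal equations."""
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suy = sum(a * c for a, c in zip(u, y))
    svy = sum(b * c for b, c in zip(v, y))
    det = suu * svv - suv * suv
    coef_u = (suy * svv - svy * suv) / det
    coef_v = (svy * suu - suy * suv) / det
    return coef_u, coef_v

def engle_granger_two_step(y, x):
    # Step 1: levels regression y_t = c + beta * x_t + z_t estimates the
    # co-integrating relation; z_t is the deviation from equilibrium.
    c, beta = ols_simple(y, x)
    z = [yi - c - beta * xi for yi, xi in zip(y, x)]
    # Step 2: error correction regression
    # dy_t = gamma * z_{t-1} + delta * dx_t + e_t.
    dy = [b - a for a, b in zip(y, y[1:])]
    dx = [b - a for a, b in zip(x, x[1:])]
    gamma, delta = ols_two_regressors(dy, z[:-1], dx)
    return beta, gamma, delta

# Simulated (hypothetical) data: x is a random walk, y = 2x + stationary noise.
rng = random.Random(0)
x = [0.0]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0, 1))
y = [2.0 * xi + rng.gauss(0, 1) for xi in x]

beta, gamma, delta = engle_granger_two_step(y, x)
print(f"beta={beta:.3f} gamma={gamma:.3f} delta={delta:.3f}")
```

A negative `gamma` means deviations from the long-run relation are corrected in later periods. Deciding whether the series are co-integrated at all requires a residual-based test with the special critical values the abstract describes, which this sketch omits.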

MrBayes 3: Bayesian phylogenetic inference under mixed models
Fredrik Ronquist, John P. Huelsenbeck
2003 · Bioinformatics · 29.3K citations · doi:10.1093/bioinformatics/btg180

Summary: MrBayes 3 performs Bayesian phylogenetic analysis combining information from different data partitions or subsets evolving under different stochastic evolutionary models. This allows the user to analyze heterogeneous data sets consisting of different data types (e.g. morphological, nucleotide, and protein) and to explore a wide variety of structured models mixing partition-unique and shared parameters. The program employs MPI to parallelize Metropolis coupling on Macintosh or UNIX clusters. Availability: http://morphbank.ebc.uu.se/mrbayes Contact: fredrik.ronquist@ebc.uu.se

SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing
Anton Bankevich, Sergey Nurk, Dmitry Antipov, Alexey Gurevich +4 more
2012 · Journal of Computational Biology · 26.9K citations · doi:10.1089/cmb.2012.0021

The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility
Garrett M. Morris, Ruth Huey, William Lindstrom, Michel F. Sanner +3 more
2009 · Journal of Computational Chemistry · 24.6K citations · doi:10.1002/jcc.21256

We describe the testing and release of AutoDock4 and the accompanying graphical user interface AutoDockTools. AutoDock4 incorporates limited flexibility in the receptor. Several tests are reported here, including a redocking experiment with 188 diverse ligand-protein complexes and a cross-docking experiment using flexible sidechains in 87 HIV protease complexes. We also report its utility in analysis of covalently bound ligands, using both a grid-based docking method and a modification of the flexible sidechain technique.

Investigating Causal Relations by Econometric Models and Cross-spectral Methods
Clive W. J. Granger
1969 · Econometrica · 22.8K citations · doi:10.2307/1912791

There occurs on some occasions a difficulty in deciding the direction of causality between two related variables and also whether or not feedback is occurring. Testable definitions of causality and feedback are proposed and illustrated by use of simple two-variable models. The important problem of apparent instantaneous causality is discussed and it is suggested that the problem often arises due to slowness in recording information or because a sufficiently wide class of possible causal variables has not been used. It can be shown that the cross spectrum between two variables can be decomposed into two parts, each relating to a single causal arm of a feedback situation. Measures of causal lag and causal strength can then be constructed. A generalisation of this result with the partial cross spectrum is suggested.

A global reference for human genetic variation
Adam Auton, Gonçalo R. Abecasis, David M. Altshuler +4 more
2015 · Nature · 19.8K citations · doi:10.1038/nature15393

The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies. Results for the final phase of the 1000 Genomes Project are presented including whole-genome sequencing, targeted exome sequencing, and genotyping on high-density SNP arrays for 2,504 individuals across 26 populations, providing a global reference data set to support biomedical genetics. The 1000 Genomes Project has sought to comprehensively catalogue human genetic variation across populations, providing a valuable public genomic resource. The data obtained so far have found applications ranging from association studies and fine mapping studies to the filtering of likely neutral variants in rare-disease cohorts. The authors now report on the final phase of the project, phase 3, which covers previously uncharacterized areas of human genetic diversity in terms of the populations sampled and categories of characterized variation. The sample now includes more than 2,500 individuals from 26 global populations, with low coverage whole-genome and deep exome sequencing, as well as dense microarray genotyping. 
They find that while most common variants are shared across populations, rarer variants are often restricted to closely related populations. The authors also demonstrate the use of the phase 3 dataset as a reference panel for imputation to improve the resolution in genetic association studies.

The FAIR Guiding Principles for scientific data management and stewardship
Mark D. Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton +4 more
2016 · Scientific Data · 17.4K citations · doi:10.1038/sdata.2016.18

There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.

Metascape provides a biologist-oriented resource for the analysis of systems-level datasets
Yingyao Zhou, Bin Zhou, Lars Pache, Max W. Chang +4 more
2019 · Nature Communications · 15.5K citations · doi:10.1038/s41467-019-09234-6

A critical component in the interpretation of systems-level studies is the inference of enriched biological pathways and protein complexes contained within OMICs datasets. Successful analysis requires the integration of a broad set of current biological databases and the application of a robust analytical pipeline to produce readily interpretable results. Metascape is a web-based portal designed to provide a comprehensive gene list annotation and analysis resource for experimental biologists. In terms of design features, Metascape combines functional enrichment, interactome analysis, gene annotation, and membership search to leverage over 40 independent knowledgebases within one integrated portal. Additionally, it facilitates comparative analyses of datasets across multiple independent and orthogonal experiments. Metascape provides a significantly simplified user experience through a one-click Express Analysis interface to generate interpretable outputs. Taken together, Metascape is an effective and efficient tool for experimental biologists to comprehensively analyze and interpret OMICs-based studies in the big data era.

Planck 2018 results
N. Aghanim, Y. Akrami, M. Ashdown, J. Aumont +4 more
2020 · Astronomy and Astrophysics · 13.8K citations · doi:10.1051/0004-6361/201833910

We present cosmological parameter results from the final full-mission Planck measurements of the cosmic microwave background (CMB) anisotropies, combining information from the temperature and polarization maps and the lensing reconstruction. Compared to the 2015 results, improved measurements of large-scale polarization allow the reionization optical depth to be measured with higher precision, leading to significant gains in the precision of other correlated parameters. Improved modelling of the small-scale polarization leads to more robust constraints on many parameters, with residual modelling uncertainties estimated to affect them only at the $0.5\sigma$ level. We find good consistency with the standard spatially-flat 6-parameter ΛCDM cosmology having a power-law spectrum of adiabatic scalar perturbations (denoted “base ΛCDM” in this paper), from polarization, temperature, and lensing, separately and in combination. A combined analysis gives dark matter density $\Omega_c h^2 = 0.120 \pm 0.001$, baryon density $\Omega_b h^2 = 0.0224 \pm 0.0001$, scalar spectral index $n_s = 0.965 \pm 0.004$, and optical depth $\tau = 0.054 \pm 0.007$ (in this abstract we quote 68% confidence regions on measured parameters and 95% on upper limits). The angular acoustic scale is measured to 0.03% precision, with $100\theta_* = 1.0411 \pm 0.0003$. These results are only weakly dependent on the cosmological model and remain stable, with somewhat increased errors, in many commonly considered extensions. Assuming the base-ΛCDM cosmology, the inferred (model-dependent) late-Universe parameters are: Hubble constant $H_0 = (67.4 \pm 0.5)$ km s$^{-1}$ Mpc$^{-1}$; matter density parameter $\Omega_m = 0.315 \pm 0.007$; and matter fluctuation amplitude $\sigma_8 = 0.811 \pm 0.006$. We find no compelling evidence for extensions to the base-ΛCDM model. 
Combining with baryon acoustic oscillation (BAO) measurements (and considering single-parameter extensions) we constrain the effective extra relativistic degrees of freedom to be $N_{\mathrm{eff}} = 2.99 \pm 0.17$, in agreement with the Standard Model prediction $N_{\mathrm{eff}} = 3.046$, and find that the neutrino mass is tightly constrained to $\sum m_\nu < 0.12$ eV. The CMB spectra continue to prefer higher lensing amplitudes than predicted in base ΛCDM at over $2\sigma$, which pulls some parameters that affect the lensing amplitude away from the ΛCDM model; however, this is not supported by the lensing reconstruction or (in models that also change the background geometry) BAO data. The joint constraint with BAO measurements on spatial curvature is consistent with a flat universe, $\Omega_K = 0.001 \pm 0.002$. Also combining with Type Ia supernovae (SNe), the dark-energy equation of state parameter is measured to be $w_0 = -1.03 \pm 0.03$, consistent with a cosmological constant. We find no evidence for deviations from a purely power-law primordial spectrum, and combining with data from BAO, BICEP2, and Keck Array data, we place a limit on the tensor-to-scalar ratio $r_{0.002} < 0.06$. Standard big-bang nucleosynthesis predictions for the helium and deuterium abundances for the base-ΛCDM cosmology are in excellent agreement with observations. The Planck base-ΛCDM results are in good agreement with BAO, SNe, and some galaxy lensing observations, but in slight tension with the Dark Energy Survey’s combined-probe results including galaxy clustering (which prefers lower fluctuation amplitudes or matter density parameters), and in significant, $3.6\sigma$, tension with local measurements of the Hubble constant (which prefer a higher value). Simple model extensions that can partially resolve these tensions are not favoured by the Planck data.

CD-HIT: accelerated for clustering the next-generation sequencing data
LiMin Fu, Beifang Niu, Zhengwei Zhu, Sitao Wu +1 more
2012 · Bioinformatics · 11.7K citations · doi:10.1093/bioinformatics/bts565

SUMMARY: CD-HIT is a widely used program for clustering biological sequences to reduce sequence redundancy and improve the performance of other sequence analyses. In response to the rapid increase in the amount of sequencing data produced by the next-generation sequencing technologies, we have developed a new CD-HIT program accelerated with a novel parallelization strategy and some other techniques to allow efficient clustering of such datasets. Our tests demonstrated very good speedup derived from the parallelization for up to ∼24 cores and a quasi-linear speedup for up to ∼8 cores. The enhanced CD-HIT is capable of handling very large datasets in much shorter time than previous versions. AVAILABILITY: http://cd-hit.org. CONTACT: liwz@sdsc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

An index to quantify an individual's scientific research output
J. E. Hirsch
2005 · Proceedings of the National Academy of Sciences · 11.5K citations · doi:10.1073/pnas.0507655102

I propose the index h, defined as the number of papers with citation number ≥ h, as a useful index to characterize the scientific output of a researcher.
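The definition above translates directly into a short computation. The sketch below is illustrative only: the function names and the sample citation counts are made up, and `i10_index` uses the usual reading of the i10-index (works with at least 10 citations), which this page reports but does not define.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # counts only decrease from here, so h cannot grow
    return h

def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)

# Hypothetical citation counts for six papers.
sample = [25, 8, 5, 3, 3, 0]
print(h_index(sample), i10_index(sample))
```

Sorting first keeps the check simple: once a paper's count falls below its rank, no later paper can satisfy the condition.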

QUAST: quality assessment tool for genome assemblies
Alexey Gurevich, Vladislav Saveliev, Nikolay Vyahhi, Glenn Tesler
2013 · Bioinformatics · 11.3K citations · doi:10.1093/bioinformatics/btt086

SUMMARY: Limitations of genome sequencing techniques have led to dozens of assembly algorithms, none of which is perfect. A number of methods for comparing assemblers have been developed, but none is yet a recognized benchmark. Further, most existing methods for comparing assemblies are only applicable to new assemblies of finished genomes; the problem of evaluating assemblies of previously unsequenced species has not been adequately considered. Here, we present QUAST, a quality assessment tool for evaluating and comparing genome assemblies. This tool improves on leading assembly comparison software with new ideas and quality metrics. QUAST can evaluate assemblies both with a reference genome, as well as without a reference. QUAST produces many reports, summary tables and plots to help scientists in their research and in their publications. In this study, we used QUAST to compare several genome assemblers on three datasets. QUAST tables and plots for all of them are available in the Supplementary Material, and interactive versions of these reports are on the QUAST website. AVAILABILITY: http://bioinf.spbau.ru/quast. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Minimal information for studies of extracellular vesicles 2018 (MISEV2018): a position statement of the International Society for Extracellular Vesicles and update of the MISEV2014 guidelines
Clotilde Théry, Kenneth W. Witwer, Elena Aïkawa, María José Alcaraz +4 more
2018 · Journal of Extracellular Vesicles · 11.0K citations · doi:10.1080/20013078.2018.1535750

The last decade has seen a sharp increase in the number of scientific publications describing physiological and pathological functions of extracellular vesicles (EVs), a collective term covering various subtypes of cell-released, membranous structures, called exosomes, microvesicles, microparticles, ectosomes, oncosomes, apoptotic bodies, and many other names. However, specific issues arise when working with these entities, whose size and amount often make them difficult to obtain as relatively pure preparations, and to characterize properly. The International Society for Extracellular Vesicles (ISEV) proposed Minimal Information for Studies of Extracellular Vesicles ("MISEV") guidelines for the field in 2014. We now update these "MISEV2014" guidelines based on evolution of the collective knowledge in the last four years. An important point to consider is that ascribing a specific function to EVs in general, or to subtypes of EVs, requires reporting of specific information beyond mere description of function in a crude, potentially contaminated, and heterogeneous preparation. For example, claims that exosomes are endowed with exquisite and specific activities remain difficult to support experimentally, given our still limited knowledge of their specific molecular machineries of biogenesis and release, as compared with other biophysically similar EVs. The MISEV2018 guidelines include tables and outlines of suggested protocols and steps to follow to document specific EV-associated functional activities. Finally, a checklist is provided with summaries of key points.

Global Burden of Cardiovascular Diseases and Risk Factors, 1990–2019
Gregory A. Roth, George A. Mensah, Catherine O. Johnson, Giovanni Addolorato +4 more
2020 · Journal of the American College of Cardiology · 10.9K citations · doi:10.1016/j.jacc.2020.11.010

Cardiovascular diseases (CVDs), principally ischemic heart disease (IHD) and stroke, are the leading cause of global mortality and a major contributor to disability. This paper reviews the magnitude of total CVD burden, including 13 underlying causes of cardiovascular death and 9 related risk factors, using estimates from the Global Burden of Disease (GBD) Study 2019. GBD, an ongoing multinational collaboration to provide comparable and consistent estimates of population health over time, used all available population-level data sources on incidence, prevalence, case fatality, mortality, and health risks to produce estimates for 204 countries and territories from 1990 to 2019. Prevalent cases of total CVD nearly doubled from 271 million (95% uncertainty interval [UI]: 257 to 285 million) in 1990 to 523 million (95% UI: 497 to 550 million) in 2019, and the number of CVD deaths steadily increased from 12.1 million (95% UI: 11.4 to 12.6 million) in 1990, reaching 18.6 million (95% UI: 17.1 to 19.7 million) in 2019. The global trends for disability-adjusted life years (DALYs) and years of life lost also increased significantly, and years lived with disability doubled from 17.7 million (95% UI: 12.9 to 22.5 million) to 34.4 million (95% UI: 24.9 to 43.6 million) over that period. The total number of DALYs due to IHD has risen steadily since 1990, reaching 182 million (95% UI: 170 to 194 million) DALYs, 9.14 million (95% UI: 8.40 to 9.74 million) deaths in the year 2019, and 197 million (95% UI: 178 to 220 million) prevalent cases of IHD in 2019. The total number of DALYs due to stroke has risen steadily since 1990, reaching 143 million (95% UI: 133 to 153 million) DALYs, 6.55 million (95% UI: 6.00 to 7.02 million) deaths in the year 2019, and 101 million (95% UI: 93.2 to 111 million) prevalent cases of stroke in 2019. Cardiovascular diseases remain the leading cause of disease burden in the world. 
CVD burden continues its decades-long rise for almost all countries outside high-income countries, and alarmingly, the age-standardized rate of CVD has begun to rise in some locations where it was previously declining in high-income countries. There is an urgent need to focus on implementing existing cost-effective policies and interventions if the world is to meet the targets for Sustainable Development Goal 3 and achieve a 30% reduction in premature mortality due to noncommunicable diseases.

Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function
Garrett M. Morris, David S. Goodsell, Robert S. Halliday, Ruth Huey +3 more
1998 · Journal of Computational Chemistry · 10.8K citations · doi:10.1002/(sici)1096-987x(19981115)19:14<1639::aid-jcc10>3.0.co;2-b

A novel and robust automated docking method that predicts the bound conformations of flexible ligands to macromolecular targets has been developed and tested, in combination with a new scoring function that estimates the free energy change upon binding. Interestingly, this method applies a Lamarckian model of genetics, in which environmental adaptations of an individual's phenotype are reverse transcribed into its genotype and become heritable traits (sic). We consider three search methods, Monte Carlo simulated annealing, a traditional genetic algorithm, and the Lamarckian genetic algorithm, and compare their performance in dockings of seven protein–ligand test systems having known three-dimensional structure. We show that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three. The empirical free energy function was calibrated using a set of 30 structurally known protein–ligand complexes with experimentally determined binding constants. Linear regression analysis of the observed binding constants in terms of a wide variety of structure-derived molecular properties was performed. The final model had a residual standard error of 9.11 kJ mol−1 (2.177 kcal mol−1) and was chosen as the new energy function. The new search methods and empirical free energy function are available in AUTODOCK, version 3.0. © 1998 John Wiley & Sons, Inc. J Comput Chem 19: 1639–1662, 1998

Finding Structure in Time
Jeffrey L. Elman
1990 · Cognitive Science · 10.8K citations · doi:10.1207/s15516709cog1402_1

Time underlies many interesting human behaviors. Thus, the question of how to represent time in connectionist models is very important. One approach is to represent time implicitly by its effects on processing rather than explicitly (as in a spatial representation). The current report develops a proposal along these lines first described by Jordan (1986) which involves the use of recurrent links in order to provide networks with a dynamic memory. In this approach, hidden unit patterns are fed back to themselves: the internal representations which develop thus reflect task demands in the context of prior internal states. A set of simulations is reported which range from relatively simple problems (temporal version of XOR) to discovering syntactic/semantic features for words. The networks are able to learn interesting internal representations which incorporate task demands with memory demands: indeed, in this approach the notion of memory is inextricably bound up with task processing. These representations reveal a rich structure, which allows them to be highly context‐dependent, while also expressing generalizations across classes of items. These representations suggest a method for representing lexical categories and the type/token distinction.
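The recurrent idea in this abstract, hidden unit patterns fed back as context for the next time step, can be sketched as a forward pass. This is a minimal illustration under assumptions: the weights are arbitrary hypothetical values, `tanh` is an assumed activation, and no training or output layer is shown.

```python
import math

def elman_step(x, context, w_in, w_ctx, bias):
    """One time step: hidden state from the current input and the previous hidden state."""
    hidden = []
    for j in range(len(bias)):
        total = bias[j]
        total += sum(w_in[j][i] * x[i] for i in range(len(x)))
        # Recurrent links: the previous hidden pattern acts as context.
        total += sum(w_ctx[j][k] * context[k] for k in range(len(context)))
        hidden.append(math.tanh(total))
    return hidden

def run_sequence(inputs, w_in, w_ctx, bias):
    """Feed a sequence through the network, carrying the hidden state forward."""
    context = [0.0] * len(bias)  # empty memory before the first input
    states = []
    for x in inputs:
        context = elman_step(x, context, w_in, w_ctx, bias)
        states.append(context)
    return states

# Hypothetical weights: 1 input unit, 2 hidden units.
W_IN = [[0.5], [-0.3]]
W_CTX = [[0.1, 0.4], [-0.2, 0.3]]
BIAS = [0.0, 0.1]

states = run_sequence([[1.0], [0.0], [1.0]], W_IN, W_CTX, BIAS)
for t, state in enumerate(states):
    print(t, [round(v, 4) for v in state])
```

The first and third inputs are identical, yet their hidden states differ because the context differs. That dependence on prior internal state is what gives the network a dynamic memory.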

Design and validation of a histological scoring system for nonalcoholic fatty liver disease
David E. Kleiner, Elizabeth M. Brunt, Mark L. Van Natta, Cynthia Behling +4 more
2005 · Hepatology · 10.6K citations · doi:10.1002/hep.20701

Nonalcoholic fatty liver disease (NAFLD) is characterized by hepatic steatosis in the absence of a history of significant alcohol use or other known liver disease. Nonalcoholic steatohepatitis (NASH) is the progressive form of NAFLD. The Pathology Committee of the NASH Clinical Research Network designed and validated a histological feature scoring system that addresses the full spectrum of lesions of NAFLD and proposed a NAFLD activity score (NAS) for use in clinical trials. The scoring system comprised 14 histological features, 4 of which were evaluated semi-quantitatively: steatosis (0-3), lobular inflammation (0-2), hepatocellular ballooning (0-2), and fibrosis (0-4). Another nine features were recorded as present or absent. An anonymized study set of 50 cases (32 from adult hepatology services, 18 from pediatric hepatology services) was assembled, coded, and circulated. For the validation study, agreement on scoring and a diagnostic categorization ("NASH," "borderline," or "not NASH") were evaluated by using weighted kappa statistics. Inter-rater agreement on adult cases was: 0.84 for fibrosis, 0.79 for steatosis, 0.56 for injury, and 0.45 for lobular inflammation. Agreement on diagnostic category was 0.61. Using multiple logistic regression, five features were independently associated with the diagnosis of NASH in adult biopsies: steatosis (P = .009), hepatocellular ballooning (P = .0001), lobular inflammation (P = .0001), fibrosis (P = .0001), and the absence of lipogranulomas (P = .001). The proposed NAS is the unweighted sum of steatosis, lobular inflammation, and hepatocellular ballooning scores. In conclusion, we present a strong scoring system and NAS for NAFLD and NASH with reasonable inter-rater reproducibility that should be useful for studies of both adults and children with any degree of NAFLD. NAS of ≥5 correlated with a diagnosis of NASH, and biopsies with scores of less than 3 were diagnosed as "not NASH."
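The arithmetic of the score as described in this abstract can be written out as below. It is a sketch only: the function names are hypothetical, the component ranges are deliberately not enforced, and the thresholds simply restate the correlations reported in the abstract. It is not a diagnostic rule.

```python
def nafld_activity_score(steatosis, lobular_inflammation, ballooning):
    """Unweighted sum of the three component scores (fibrosis is not part of the NAS)."""
    components = {
        "steatosis": steatosis,
        "lobular_inflammation": lobular_inflammation,
        "ballooning": ballooning,
    }
    for name, value in components.items():
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
    return sum(components.values())

def reported_correlation(nas):
    """Restates what the abstract reports for a given NAS."""
    if nas >= 5:
        return "correlated with NASH"
    if nas < 3:
        return "not NASH"
    return "not stated in the abstract"  # scores of 3 and 4

# Hypothetical biopsy scores.
score = nafld_activity_score(steatosis=3, lobular_inflammation=1, ballooning=2)
print(score, reported_correlation(score))
```

Scores of 3 and 4 are left unclassified here because the abstract gives no statement for that range.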

Analysis of protein-coding genetic variation in 60,706 humans
Monkol Lek, Konrad J. Karczewski, Eric Vallabh Minikel, Kaitlin E. Samocha +4 more
2016 · Nature · 10.3K citations · doi:10.1038/nature19057

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.