Carnegie Mellon University
University
Pittsburgh, Pennsylvania, United States
Research output, citation impact, and the most-cited recent papers from Carnegie Mellon University (United States). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Carnegie Mellon University
Wesley M. Cohen, Daniel A. Levinthal, Absorptive Capacity: A New Perspective on Learning and Innovation, Administrative Science Quarterly, Vol. 35, No. 1, Special Issue: Technology, Organizations, and Innovation (Mar., 1990), pp. 128-152
This paper presents evidence from three samples, two of college students and one of participants in a community smoking-cessation program, for the reliability and validity of a 14-item instrument, the Perceived Stress Scale (PSS), designed to measure the degree to which situations in one's life are appraised as stressful. The PSS showed adequate reliability and, as predicted, was correlated with life-event scores, depressive and physical symptomatology, utilization of health services, social anxiety, and smoking-reduction maintenance. In all comparisons, the PSS was a better predictor of the outcome in question than were life-event scores. When compared to a depressive symptomatology scale, the PSS was found to measure a different and independently predictive construct. Additional data indicate adequate reliability and validity of a four-item version of the PSS for telephone interviews. The PSS is suggested for examining the role of nonspecific appraised stress in the etiology of disease and behavioral disorders and as an outcome measure of experienced levels of stress.
A contracted Gaussian basis set (6-311G**) is developed by optimizing exponents and coefficients at the Møller–Plesset (MP) second-order level for the ground states of first-row atoms. This has a triple split in the valence s and p shells together with a single set of uncontracted polarization functions on each atom. The basis is tested by computing structures and energies for some simple molecules at various levels of MP theory and comparing with experiment.
Examines whether the positive association between social support and well-being is attributable more to an overall beneficial effect of support (main- or direct-effect model) or to a process of support protecting persons from potentially adverse effects of stressful events (buffering model). The review of studies is organized according to (1) whether a measure assesses support structure (the existence of relationships) or function (the extent to which one's interpersonal relationships provide particular resources) and (2) the degree of specificity (vs globality) of the scale. Special attention is given to methodological characteristics that are requisite for a fair comparison of the models. It is concluded that there is evidence consistent with both models. Evidence for the buffering model is found when the social support measure assesses the perceived availability of interpersonal resources that are responsive to the needs elicited by stressful events. Evidence for a main effect model is found when the support measure assesses a person's degree of integration in a large social network. Both conceptualizations of social support are correct in some respects, but each represents a different process through which social support may affect well-being. Implications for theories of social support processes and for the design of preventive interventions are discussed.
In management contexts, mathematical programming is usually used to evaluate a collection of possible alternative courses of action en route to selecting one which is best. In this capacity, mathematical programming serves as a planning aid to management. Data Envelopment Analysis reverses this role and employs mathematical programming to obtain ex post facto evaluations of the relative efficiency of management accomplishments, however they may have been planned or executed. Mathematical programming is thereby extended for use as a tool for control and evaluation of past accomplishments as well as a tool to aid in planning future activities. The CCR ratio form introduced by Charnes, Cooper and Rhodes, as part of their Data Envelopment Analysis approach, comprehends both technical and scale inefficiencies via the optimal value of the ratio form, as obtained directly from the data without requiring a priori specification of weights and/or explicit delineation of assumed functional forms of relations between inputs and outputs. A separation into technical and scale efficiencies is accomplished by the methods developed in this paper without altering the latter conditions for use of DEA directly on observational data. Technical inefficiencies are identified with failures to achieve best possible output levels and/or usage of excessive amounts of inputs. Methods for identifying and correcting the magnitudes of these inefficiencies, as supplied in prior work, are illustrated. In the present paper, a new separate variable is introduced which makes it possible to determine whether operations were conducted in regions of increasing, constant or decreasing returns to scale (in multiple input and multiple output situations). The results are discussed and related not only to classical (single output) economics but also to more modern versions of economics which are identified with “contestable market theories.”
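As a rough illustration of the ratio form mentioned in this abstract, the CCR model for an evaluated unit (indexed 0 here) can be sketched as below. The symbols are assumptions chosen for this note, not taken from the listing: y_{rj} and x_{ij} are the observed outputs and inputs of unit j, and u_r, v_i are the weights chosen by the optimization.

```latex
\max_{u,v}\; h_0 = \frac{\sum_{r} u_r\, y_{r0}}{\sum_{i} v_i\, x_{i0}}
\quad\text{subject to}\quad
\frac{\sum_{r} u_r\, y_{rj}}{\sum_{i} v_i\, x_{ij}} \le 1 \quad (j = 1,\dots,n),
\qquad u_r \ge 0,\; v_i \ge 0 .
```

One common way of writing the extension described in the abstract adds a single free scalar u_0 to the numerator, giving (\sum_r u_r y_{r0} - u_0) / \sum_i v_i x_{i0}. How the sign of u_0 maps onto increasing, constant, or decreasing returns to scale depends on the sign convention used, so it is not spelled out here.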
Two extended basis sets (termed 5–31G and 6–31G) consisting of atomic orbitals expressed as fixed linear combinations of Gaussian functions are presented for the first row atoms carbon to fluorine. These basis functions are similar to the 4–31G set [J. Chem. Phys. 54, 724 (1971)] in that each valence shell is split into inner and outer parts described by three and one Gaussian function, respectively. Inner shells are represented by a single basis function taken as a sum of five (5–31G) or six (6–31G) Gaussians. Studies with a number of polyatomic molecules indicate a substantial lowering of calculated total energies over the 4–31G set. Calculated relative energies and equilibrium geometries do not appear to be altered significantly.
We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.
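To make the normalized cut criterion concrete, the sketch below evaluates it for a given two-way partition of a small weighted graph. It only scores a partition; it does not perform the eigenvalue-based optimization the abstract refers to. The affinity matrix `W` and the function name `ncut` are illustrative assumptions.

```python
def ncut(W, A):
    """Normalized cut of the partition (A, rest) for a symmetric affinity matrix W."""
    n = len(W)
    A = set(A)
    B = set(range(n)) - A
    # Total weight of edges crossing the partition.
    cut = sum(W[i][j] for i in A for j in B)
    # Total connection from each side to all nodes in the graph.
    assoc_a = sum(W[i][j] for i in A for j in range(n))
    assoc_b = sum(W[i][j] for i in B for j in range(n))
    return cut / assoc_a + cut / assoc_b


# Hypothetical 4-node graph: two tight pairs {0, 1} and {2, 3} joined by weak edges.
W = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]
good = ncut(W, [0, 1])  # splits along the weak edges
bad = ncut(W, [0, 2])   # splits through the strong edges
print(good, bad)
```

The partition along the weak edges scores much lower than the one that separates strongly connected nodes, which is the behaviour the criterion is designed to reward.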
Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs (30,000 auditory nerve fibers or 10^6 optic nerve fibers) a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.
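The idea of recovering global geometry from local distances can be sketched as follows: connect each point to its nearest neighbours, then take shortest-path lengths through that graph as estimates of distance along the manifold. This is a minimal sketch of those two steps only, under assumed names (`geodesic_distances`, `k`); the final embedding step is omitted, and the semicircle data set is made up for illustration.

```python
import math


def geodesic_distances(points, k):
    """Approximate along-manifold distances via a k-nearest-neighbour graph."""
    n = len(points)
    inf = float("inf")
    d = [[math.dist(p, q) for q in points] for p in points]
    g = [[inf] * n for _ in range(n)]
    for i in range(n):
        g[i][i] = 0.0
        # Skip index 0 of the sorted list: it is the point itself.
        nearest = sorted(range(n), key=lambda j: d[i][j])[1:k + 1]
        for j in nearest:
            g[i][j] = g[j][i] = d[i][j]
    # Floyd-Warshall all-pairs shortest paths over the neighbourhood graph.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][m] + g[m][j] < g[i][j]:
                    g[i][j] = g[i][m] + g[m][j]
    return g


# Hypothetical data: 21 points on a unit semicircle (a 1-D manifold in 2-D).
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
g = geodesic_distances(points, k=2)
euclid = math.dist(points[0], points[-1])  # straight-line distance between the ends
geo = g[0][-1]                             # graph estimate of the arc length
print(euclid, geo)
```

The straight-line distance between the two ends is 2, while the graph estimate comes out close to the true arc length of pi, which is the kind of nonlinear structure a linear method such as PCA would miss.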
We present conditional random fields, a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid a fundamental limitation of maximum entropy Markov models (MEMMs) and other discriminative Markov models based on directed graphical models, which can be biased towards states with few successor states. We present iterative parameter estimation algorithms for conditional random fields and compare the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
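A linear-chain model of this kind normalizes over whole label sequences rather than per state, and that normalizer can be computed with a forward recursion. The sketch below assumes the per-position scores (`emit`) and transition scores (`trans`) have already been computed from the input; both tables and all function names are hypothetical, and the recursion is written for readability rather than numerical stability.

```python
import math
from itertools import product


def score(y, emit, trans):
    """Unnormalized log-score of a label sequence y."""
    s = sum(emit[t][y[t]] for t in range(len(y)))
    s += sum(trans[y[t - 1]][y[t]] for t in range(1, len(y)))
    return s


def log_partition(emit, trans):
    """log Z computed with the forward recursion in log space."""
    n_labels = len(trans)
    alpha = list(emit[0])
    for t in range(1, len(emit)):
        alpha = [
            emit[t][j]
            + math.log(sum(math.exp(alpha[i] + trans[i][j]) for i in range(n_labels)))
            for j in range(n_labels)
        ]
    return math.log(sum(math.exp(a) for a in alpha))


# Hypothetical scores for a length-3 sequence with 2 labels.
emit = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.9]]
trans = [[0.2, -0.1], [-0.3, 0.4]]

log_z = log_partition(emit, trans)
# Conditional probability of every possible labelling; these should sum to 1.
probs = {y: math.exp(score(y, emit, trans) - log_z) for y in product(range(2), repeat=3)}
total = sum(probs.values())
brute_log_z = math.log(sum(math.exp(score(y, emit, trans)) for y in product(range(2), repeat=3)))
print(total)
```

Because the normalization is global, a state with few successors gains no automatic advantage, which is the bias in per-state normalized models that the abstract describes.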
In a 1935 paper and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P-values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this article we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology, and psychology. We emphasize the following points:
• From Jeffreys' Bayesian viewpoint, the purpose of hypothesis testing is to evaluate the evidence in favor of a scientific theory.
• Bayes factors offer a way of evaluating evidence in favor of a null hypothesis.
• Bayes factors provide a way of incorporating external information into the evaluation of evidence about a hypothesis.
• Bayes factors are very general and do not require alternative models to be nested.
• Several techniques are available for computing Bayes factors, including asymptotic approximations that are easy to compute using the output from standard packages that maximize likelihoods.
• In “nonstandard” statistical models that do not satisfy common regularity conditions, it can be technically simpler to calculate Bayes factors than to derive non-Bayesian significance tests.
• The Schwarz criterion (or BIC) gives a rough approximation to the logarithm of the Bayes factor, which is easy to use and does not require evaluation of prior distributions.
• When one is interested in estimation or prediction, Bayes factors may be converted to weights to be attached to various models so that a composite estimate or prediction may be obtained that takes account of structural or model uncertainty.
• Algorithms have been proposed that allow model uncertainty to be taken into account when the class of models initially considered is very large.
• Bayes factors are useful for guiding an evolutionary model-building process.
• It is important, and feasible, to assess the sensitivity of conclusions to the prior distributions used.
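The point about the Schwarz criterion can be illustrated with a short calculation. The sketch below uses the common convention BIC = -2 log L + d log n, under which half the BIC difference between two models roughly approximates the log Bayes factor. The function names and the fitted log-likelihoods are made up for illustration.

```python
import math


def bic(log_lik, n_params, n_obs):
    """Schwarz criterion in the convention -2 log L + d log n (lower is better)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def approx_log_bayes_factor(log_lik_1, d1, log_lik_0, d0, n_obs):
    """Rough approximation to log B_10 (evidence for model 1 over model 0)."""
    return -0.5 * (bic(log_lik_1, d1, n_obs) - bic(log_lik_0, d0, n_obs))


# Hypothetical fits: model 1 has one extra parameter and a higher likelihood.
log_b10 = approx_log_bayes_factor(log_lik_1=-120.0, d1=3, log_lik_0=-125.0, d0=2, n_obs=100)
print(log_b10)
```

The likelihood gain of 5 log units is offset by a penalty of half a log(100) for the extra parameter, and no prior distribution enters the calculation, which is why the approximation is described as easy to use.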
Image registration finds a variety of applications in computer vision. Unfortunately, traditional image registration techniques tend to be costly. We present a new image registration technique that makes use of the spatial intensity gradient of the images to find a good match using a type of Newton-Raphson iteration. Our technique is faster because it examines far fewer potential matches between the images than existing techniques. Furthermore, this registration technique can be generalized to handle rotation, scaling and shearing. We show how our technique can be adapted for use in a stereo vision system. 1. Introduction Image registration finds a variety of applications in computer vision, such as image matching for stereo vision, pattern recognition, and motion analysis. Unfortunately, existing techniques for image registration tend to be costly. Moreover, they generally fail to deal with rotation or other distortions of the images. In this paper we present a new image registratio...
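The gradient-based iteration described in this abstract can be sketched in one dimension: given two signals related by an unknown shift, each step uses the local gradient to update the shift estimate instead of searching over candidate shifts. This is a simplified illustration under assumed names (`register_shift`, `F`, `G`), using continuous test functions and a finite-difference gradient rather than sampled images.

```python
import math


def register_shift(F, G, xs, h=0.0, iters=20, eps=1e-4):
    """Estimate h such that F(x + h) matches G(x) over the sample points xs."""
    for _ in range(iters):
        num = den = 0.0
        for x in xs:
            # Central-difference estimate of the gradient of F at x + h.
            grad = (F(x + h + eps) - F(x + h - eps)) / (2 * eps)
            num += grad * (G(x) - F(x + h))
            den += grad * grad
        if den == 0.0:
            break
        # Newton-Raphson style update from the linearized matching error.
        h += num / den
    return h


def F(x):
    return math.exp(-x * x)


true_shift = 0.3


def G(x):
    # G is F displaced by the (normally unknown) shift.
    return F(x + true_shift)


xs = [i / 10 - 3 for i in range(61)]
h = register_shift(F, G, xs)
print(h)
```

Each iteration costs one pass over the samples, so the number of candidate alignments examined is the number of iterations rather than the size of the search range.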
Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time. In this paper, we present non-local operations as a generic family of building blocks for capturing long-range dependencies. Inspired by the classical non-local means method in computer vision, our non-local operation computes the response at a position as a weighted sum of the features at all positions. This building block can be plugged into many computer vision architectures. On the task of video classification, even without any bells and whistles, our non-local models can compete or outperform current competition winners on both Kinetics and Charades datasets. In static image recognition, our non-local models improve object detection/segmentation and pose estimation on the COCO suite of tasks. Code is available at https://github.com/facebookresearch/video-nonlocal-net .
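The weighted-sum formulation in this abstract can be written out directly. The sketch below is a simplification with assumed choices: the pairwise weight is the exponential of a dot product, normalized over all positions, and the features are used as they are with no learned embeddings. The function name `non_local` and the input are illustrative only.

```python
import math


def non_local(x):
    """x: list of feature vectors, one per position. Returns one output per position."""
    out = []
    for xi in x:
        # Pairwise weight between this position and every position (including itself).
        weights = [math.exp(sum(a * b for a, b in zip(xi, xj))) for xj in x]
        norm = sum(weights)
        dim = len(xi)
        # Response = normalized weighted sum of the features at all positions.
        out.append([sum(w * xj[d] for w, xj in zip(weights, x)) / norm for d in range(dim)])
    return out


# Hypothetical input: three positions with 2-D features.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = non_local(x)
print(y)
```

Every output position depends on every input position in a single step, in contrast to a convolution, which would need several stacked layers to relate distant positions.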
An extended basis set of atomic functions expressed as fixed linear combinations of Gaussian functions is presented for hydrogen and the first-row atoms carbon to fluorine. In this set, described as 4–31G, each inner shell is represented by a single basis function taken as a sum of four Gaussians and each valence orbital is split into inner and outer parts described by three and one Gaussian function, respectively. The expansion coefficients and Gaussian exponents are determined by minimizing the total calculated energy of the atomic ground state. This basis set is then used in single-determinant molecular-orbital studies of a group of small polyatomic molecules. Optimization of valence-shell scaling factors shows that considerable rescaling of atomic functions occurs in molecules, the largest effects being observed for hydrogen and carbon. However, the range of optimum scale factors for each atom is small enough to allow the selection of a standard molecular set. The use of this standard basis gives theoretical equilibrium geometries in reasonable agreement with experiment.
The Sloan Digital Sky Survey (SDSS) will provide the data to support detailed investigations of the distribution of luminous and non-luminous matter in the Universe: a photometrically and astrometrically calibrated digital imaging survey of pi steradians above about Galactic latitude 30 degrees in five broad optical bands to a depth of g' about 23 magnitudes, and a spectroscopic survey of the approximately one million brightest galaxies and 10^5 brightest quasars found in the photometric object catalog produced by the imaging survey. This paper summarizes the observational parameters and data products of the SDSS, and serves as an introduction to extensive technical on-line documentation.
The article discusses trust theory, multidisciplinary research, and trust between organizations. The analysis of trust is based on four questions: whether scholars can agree on the meaning of trust; if researchers are viewing trust statistically; if the status of trust--cause, effect, or interaction--changes across disciplines; and whether the levels of analysis also change. The “bandwidth” of trust--where trust and distrust are differentiated--can vary over time in the same relationship or coexist at the same time. Bandwidth types are deterrence-based trust, calculus-based trust, relational trust, and institution-based trust. Two conditions of trust are risk and interdependence. Three phases are building, stability, and dissolution. Several studies are mentioned.
Machine learning addresses the question of how to build computers that improve automatically through experience. It is one of today's most rapidly growing technical fields, lying at the intersection of computer science and statistics, and at the core of artificial intelligence and data science. Recent progress in machine learning has been driven both by the development of new learning algorithms and theory and by the ongoing explosion in the availability of online data and low-cost computation. The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing.
In this paper we study both market risks and nonmarket risks, without complete markets assumption, and discuss methods of measurement of these risks. We present and justify a set of four desirable properties for measures of risk, and call the measures satisfying these properties “coherent.” We examine the measures of risk provided and the related actions required by SPAN, by the SEC/NASD rules, and by quantile‐based methods. We demonstrate the universality of scenario‐based methods for providing coherent measures. We offer suggestions concerning the SEC method. We also suggest a method to repair the failure of subadditivity of quantile‐based methods.
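The failure of subadditivity for quantile-based measures can be seen in a small discrete example. The sketch below uses a made-up pair of independent bonds, each losing 100 with probability 0.04, and a simple quantile definition of value at risk. All names and numbers are illustrative assumptions.

```python
def value_at_risk(dist, alpha):
    """Smallest loss l with P(loss <= l) >= alpha, for a discrete loss distribution."""
    cum = 0.0
    for loss, p in sorted(dist.items()):
        cum += p
        if cum >= alpha - 1e-12:  # small tolerance for floating-point sums
            return loss
    return max(dist)


def sum_independent(d1, d2):
    """Loss distribution of a portfolio holding two independent positions."""
    out = {}
    for l1, p1 in d1.items():
        for l2, p2 in d2.items():
            out[l1 + l2] = out.get(l1 + l2, 0.0) + p1 * p2
    return out


# Hypothetical bond: loses 100 with probability 0.04, otherwise loses nothing.
bond = {0.0: 0.96, 100.0: 0.04}
portfolio = sum_independent(bond, bond)

var_single = value_at_risk(bond, 0.95)
var_portfolio = value_at_risk(portfolio, 0.95)
print(var_single, var_portfolio)
```

Each bond on its own has a 95% quantile loss of 0, but the combined portfolio has a 95% quantile loss of 100, so the measure reports diversification as adding risk. A subadditive measure would never rate the portfolio above the sum of its parts.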
In this paper we present a new data structure for representing Boolean functions and an associated set of manipulation algorithms. Functions are represented by directed, acyclic graphs in a manner similar to the representations introduced by Lee [1] and Akers [2], but with further restrictions on the ordering of decision variables in the graph. Although a function requires, in the worst case, a graph of size exponential in the number of arguments, many of the functions encountered in typical applications have a more reasonable representation. Our algorithms have time complexity proportional to the sizes of the graphs being operated on, and hence are quite efficient as long as the graphs do not grow too large. We present experimental results from applying these algorithms to problems in logic design verification that demonstrate the practicality of our approach.
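A compact way to see how an ordered decision-graph representation works is the sketch below: nodes are shared through a lookup table, redundant tests are dropped, and binary operations recurse on the top variable with memoization. The class and method names (`BDD`, `mk`, `apply`) are assumptions for this illustration, not the paper's code.

```python
import operator


class BDD:
    def __init__(self, n_vars):
        self.n_vars = n_vars
        self.nodes = {0: None, 1: None}  # ids 0 and 1 are the terminal nodes
        self.unique = {}                 # (var, low, high) -> node id
        self.cache = {}                  # memo table for apply

    def mk(self, var, low, high):
        if low == high:                  # redundant test: skip the node
            return low
        key = (var, low, high)
        if key not in self.unique:       # share structurally identical nodes
            node_id = len(self.nodes)
            self.nodes[node_id] = key
            self.unique[key] = node_id
        return self.unique[key]

    def var(self, i):
        return self.mk(i, 0, 1)

    def top(self, u):
        # Terminals sort after every variable in the fixed ordering.
        return self.nodes[u][0] if u > 1 else self.n_vars

    def apply(self, op, u, v):
        if u <= 1 and v <= 1:
            return int(op(bool(u), bool(v)))
        key = (op, u, v)
        if key in self.cache:
            return self.cache[key]
        i = min(self.top(u), self.top(v))
        u0, u1 = self.nodes[u][1:] if self.top(u) == i else (u, u)
        v0, v1 = self.nodes[v][1:] if self.top(v) == i else (v, v)
        r = self.mk(i, self.apply(op, u0, v0), self.apply(op, u1, v1))
        self.cache[key] = r
        return r


bdd = BDD(3)
a, b, c = bdd.var(0), bdd.var(1), bdd.var(2)
f = bdd.apply(operator.or_, bdd.apply(operator.and_, a, b), c)  # (a and b) or c
g = bdd.apply(operator.or_, c, bdd.apply(operator.and_, b, a))  # c or (b and a)
h = bdd.apply(operator.and_, a, c)                              # a and c
print(f == g, f == h)
```

Because the variable order is fixed and nodes are shared, two equivalent formulas built in different ways end up as the same node, so an equivalence check reduces to comparing two integers. The memo table keeps each pair of subgraphs from being processed more than once per operation.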
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word “reinforcement.” The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
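Several of the issues listed in this abstract (trial-and-error interaction, delayed reward, and the exploration and exploitation trade-off) show up even in a tiny tabular example. The sketch below uses Q-learning with an epsilon-greedy policy on a made-up five-state corridor where only the last step is rewarded. The environment, constants, and function names are all illustrative assumptions.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical corridor: reward 1 only on reaching the right-hand end."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = 0 if rng.random() < 0.5 else 1  # explore (also breaks ties)
            else:
                a = 0 if q[s][0] > q[s][1] else 1   # exploit current estimates
            s2, r, done = step(s, a)
            # One-step target: immediate reward plus discounted best future value.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q


q = q_learning()
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]
print(policy)
```

The reward arrives only at the final step, yet the update rule propagates its value backward one state at a time, so the learned greedy policy moves right from every state.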
Context. We present the second Gaia data release, Gaia DR2, consisting of astrometry, photometry, radial velocities, and information on astrophysical parameters and variability, for sources brighter than magnitude 21. In addition epoch astrometry and photometry are provided for a modest sample of minor planets in the solar system. Aims. A summary of the contents of Gaia DR2 is presented, accompanied by a discussion on the differences with respect to Gaia DR1 and an overview of the main limitations which are still present in the survey. Recommendations are made on the responsible use of Gaia DR2 results. Methods. The raw data collected with the Gaia instruments during the first 22 months of the mission have been processed by the Gaia Data Processing and Analysis Consortium (DPAC) and turned into this second data release, which represents a major advance with respect to Gaia DR1 in terms of completeness, performance, and richness of the data products. Results. Gaia DR2 contains celestial positions and the apparent brightness in G for approximately 1.7 billion sources. For 1.3 billion of those sources, parallaxes and proper motions are in addition available. The sample of sources for which variability information is provided is expanded to 0.5 million stars. This data release contains four new elements: broad-band colour information in the form of the apparent brightness in the G_BP (330–680 nm) and G_RP (630–1050 nm) bands is available for 1.4 billion sources; median radial velocities for some 7 million sources are presented; for between 77 and 161 million sources estimates are provided of the stellar effective temperature, extinction, reddening, and radius and luminosity; and for a pre-selected list of 14 000 minor planets in the solar system epoch astrometry and photometry are presented. Finally, Gaia DR2 also represents a new materialisation of the celestial reference frame in the optical, the Gaia-CRF2, which is the first optical reference frame based solely on extragalactic sources. There are notable changes in the photometric system and the catalogue source list with respect to Gaia DR1, and we stress the need to consider the two data releases as independent. Conclusions. Gaia DR2 represents a major achievement for the Gaia mission, delivering on the long-standing promise to provide parallaxes and proper motions for over 1 billion stars, and representing a first step in the availability of complementary radial velocity and source astrophysical information for a sample of stars in the Gaia survey which covers a very substantial fraction of the volume of our galaxy.