Vienna University of Economics and Business
University, Vienna, Austria
Research output, citation impact, and the most-cited recent papers from Vienna University of Economics and Business (Austria). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Vienna University of Economics and Business
Recursive binary partitioning is a popular tool for regression analysis. Two fundamental problems of exhaustive search procedures usually applied to fit such models have been known for a long time: overfitting and a selection bias towards covariates with many possible splits or missing values. While pruning procedures are able to solve the overfitting problem, the variable selection bias still seriously affects the interpretability of tree-structured regression models. For some special cases unbiased procedures have been suggested, however lacking a common theoretical foundation. We propose a unified framework for recursive partitioning which embeds tree-structured regression models into a well defined theory of conditional inference procedures. Stopping criteria based on multiple test procedures are implemented and it is shown that the predictive performance of the resulting trees is as good as the performance of established exhaustive search procedures. It turns out that the partitions and therefore the models induced by both approaches are structurally different, confirming the need for an unbiased variable selection. Moreover, it is shown that the prediction accuracy of trees with early stopping is equivalent to the prediction accuracy of pruned trees with unbiased variable selection. The methodology presented here is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored as well as multivariate response variables and arbitrary measurement scales of the covariates. Data from studies on glaucoma classification, node positive breast cancer survival and mammography experience are re-analyzed.
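The separation of variable selection from split search described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the association statistic (absolute covariance), the Monte Carlo permutation test, the Bonferroni adjustment and all function names are assumptions chosen for brevity. The idea shown is only that a covariate is chosen by a test of association with the response, and that partitioning stops when no adjusted p-value falls below a significance level.

```python
import random


def abs_cov(x, y):
    """Absolute sample covariance, used here as a simple association statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n)


def permutation_p_value(x, y, n_perm, rng):
    """Monte Carlo permutation p-value for association between x and y."""
    observed = abs_cov(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any link between x and y
        if abs_cov(x, y_perm) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself.
    return (hits + 1) / (n_perm + 1)


def select_split_variable(covariates, y, alpha=0.05, n_perm=500, seed=0):
    """Return (selected_name_or_None, adjusted_p_values).

    covariates maps a name to a list of numeric values.
    None means "stop splitting": no covariate is significantly associated
    with the response after a Bonferroni adjustment (an assumption here).
    """
    rng = random.Random(seed)
    m = len(covariates)
    adjusted = {
        name: min(1.0, m * permutation_p_value(x, y, n_perm, rng))
        for name, x in covariates.items()
    }
    best = min(adjusted, key=adjusted.get)
    return (best if adjusted[best] <= alpha else None), adjusted


if __name__ == "__main__":
    covs = {
        "x1": list(range(40)),
        "x2": [(7 * i) % 5 for i in range(40)],
    }
    response = [2.0 * v for v in range(40)]
    print(select_split_variable(covs, response))
```

Because every covariate is judged on the p-value scale rather than by its best achievable split, a covariate with many possible cutpoints gains no automatic advantage in this sketch; the cutpoint itself would be chosen in a separate step after selection.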
BACKGROUND: Variable importance measures for random forests have been receiving increased attention as a means of variable selection in many classification tasks in bioinformatics and related scientific fields, for instance to select a subset of genetic markers relevant for the prediction of a certain disease. We show that random forest variable importance measures are a sensible means for variable selection in many applications, but are not reliable in situations where potential predictor variables vary in their scale of measurement or their number of categories. This is particularly important in genomics and computational biology, where predictors often include variables of different types, for example when predictors include both sequence data and continuous variables such as folding energy, or when amino acid sequence data show different numbers of categories. RESULTS: Simulation studies are presented illustrating that, when random forest variable importance measures are used with data of varying types, the results are misleading because suboptimal predictor variables may be artificially preferred in variable selection. The two mechanisms underlying this deficiency are biased variable selection in the individual classification trees used to build the random forest on one hand, and effects induced by bootstrap sampling with replacement on the other hand. CONCLUSION: We propose to employ an alternative implementation of random forests, that provides unbiased variable selection in the individual classification trees. When this method is applied using subsampling without replacement, the resulting variable importance measures can be used reliably for variable selection even in situations where the potential predictor variables vary in their scale of measurement or their number of categories. 
The usage of both random forest algorithms and their variable importance measures in the R system for statistical computing is illustrated and documented thoroughly in an application re-analyzing data from a study on RNA editing. Therefore the suggested method can be applied straightforwardly by scientists in bioinformatics research.
A comparison is undertaken between scale development and index construction procedures to trace the implications of adopting a reflective versus formative perspective when creating multi‐item measures for organizational research. Focusing on export coordination as an illustrative construct of interest, the results show that the choice of measurement perspective impacts on the content, parsimony and criterion validity of the derived coordination measures. Implications for practising researchers seeking to develop multi‐item measures of organizational constructs are considered.
BACKGROUND: Random forests are becoming increasingly popular in many scientific fields because they can cope with "small n large p" problems, complex interactions and even highly correlated predictor variables. Their variable importance measures have recently been suggested as screening tools for, e.g., gene expression studies. However, these variable importance measures show a bias towards correlated predictor variables. RESULTS: We identify two mechanisms responsible for this finding: (i) A preference for the selection of correlated predictors in the tree building process and (ii) an additional advantage for correlated predictor variables induced by the unconditional permutation scheme that is employed in the computation of the variable importance measure. Based on these considerations we develop a new, conditional permutation scheme for the computation of the variable importance measure. CONCLUSION: The resulting conditional variable importance reflects the true impact of each predictor variable more reliably than the original marginal approach.
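The difference between the unconditional and the conditional permutation scheme can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: the function name and the way strata are formed are hypothetical. An unconditional scheme shuffles a predictor across all observations; a conditional scheme shuffles it only within groups of observations that share similar values of the correlated predictors, so the dependence between predictors is roughly preserved.

```python
import random
from collections import defaultdict


def permute_within_strata(values, strata, seed=0):
    """Shuffle `values` only among observations that share a stratum label.

    values: predictor values, one per observation.
    strata: stratum labels, one per observation (for example, cells of a
            grid built from cutpoints on correlated predictors).
    Returns a new list; the inputs are left unchanged.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for index, label in enumerate(strata):
        groups[label].append(index)

    permuted = list(values)
    for indices in groups.values():
        shuffled = [values[i] for i in indices]
        rng.shuffle(shuffled)
        for i, v in zip(indices, shuffled):
            permuted[i] = v
    return permuted


if __name__ == "__main__":
    x = [0.1, 0.4, 0.3, 2.1, 2.5, 2.2]
    z = [0.0, 0.2, 0.1, 1.9, 2.4, 2.0]        # a predictor correlated with x
    labels = [value > 1.0 for value in z]     # hypothetical single cutpoint on z
    print(permute_within_strata(x, labels, seed=1))
```

An importance measure would then compare prediction error before and after this restricted shuffle. Passing a single constant label for all observations reduces the function to the unconditional scheme.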
The classical Poisson, geometric and negative binomial regression models for count data belong to the family of generalized linear models and are available at the core of the statistics toolbox in the R system for statistical computing. After reviewing the conceptual and computational features of these methods, a new implementation of hurdle and zero-inflated regression models in the functions hurdle() and zeroinfl() from the package pscl is introduced. It re-uses design and functionality of the basic R functions just as the underlying conceptual tools extend the classical models. Both hurdle and zero-inflated models are able to incorporate over-dispersion and excess zeros (two problems that typically occur in count data sets in economics and the social sciences) better than their classical counterparts. Using cross-section data on the demand for medical care, it is illustrated how the classical as well as the zero-augmented models can be fitted, inspected and tested in practice.
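To make the two zero-augmented model classes concrete, the following sketch writes out their probability mass functions for a Poisson count component without covariates. It is a standalone illustration in Python, not the pscl interface; the function and parameter names are assumptions. A zero-inflated model mixes a point mass at zero with an ordinary count distribution, while a hurdle model uses one probability for zero and a zero-truncated count distribution for the positive counts.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zero_inflated_poisson_pmf(k, lam, pi):
    """Mixture: with probability pi a structural zero, otherwise Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_poisson_pmf(k, lam, p_zero):
    """Two-part model: P(0) = p_zero; positive counts follow a
    zero-truncated Poisson(lam) scaled by (1 - p_zero)."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))


if __name__ == "__main__":
    for k in range(4):
        print(
            k,
            poisson_pmf(k, 2.0),
            zero_inflated_poisson_pmf(k, 2.0, 0.3),
            hurdle_poisson_pmf(k, 2.0, 0.4),
        )
```

In a regression setting, lam and the zero probability would each depend on covariates through a link function; the sketch fixes them as constants to keep the structure of the two distributions visible.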
The SAGE Handbook of Organizational Institutionalism brings together extensive coverage of aspects of Institutional Theory and an array of top academic contributors. Now in its Second Edition, the book has been thoroughly revised and reorganized, with all chapters updated to maintain a mix of theory, how to conduct institutional organizational analysis, and contemporary empirical work. New chapters on Translation, Networks and Institutional Pluralism are included to reflect new directions in the field. The Second Edition has also been reorganized into six parts
This article provides researchers with knowledge of how to design a high quality mixed methods research study. To design a mixed study, researchers must understand and carefully consider each of the dimensions of mixed methods design, and always keep an eye on the issue of validity. We explain the seven major design dimensions: purpose, theoretical drive, timing (simultaneity and dependency), point of integration, typological versus interactive design approaches, planned versus emergent design, and design complexity. There also are multiple secondary dimensions that need to be considered during the design process. We explain ten secondary dimensions of design to be considered for each research study. We also provide two case studies showing how the mixed designs were constructed.
Establishing predictive validity of measures is a major concern in marketing research. This paper investigates the conditions favoring the use of single items versus multi-item scales in terms of predictive validity. A series of complementary studies reveals that the predictive validity of single items varies considerably across different (concrete) constructs and stimuli objects. In an attempt to explain the observed instability, a comprehensive simulation study is conducted aimed at identifying the influence of different factors on the predictive validity of single versus multi-item measures. These include the average inter-item correlations in the predictor and criterion constructs, the number of items measuring these constructs, as well as the correlation patterns of multiple and single items between the predictor and criterion constructs. The simulation results show that, under most conditions typically encountered in practical applications, multi-item scales clearly outperform single items in terms of predictive validity. Only under very specific conditions do single items perform equally well as multi-item scales. Therefore, the use of single-item measures in empirical research should be approached with caution, and the use of such measures should be limited to special circumstances.
In this article, we provide a comprehensive introduction to knowledge graphs, which have recently garnered significant attention from both industry and academia in scenarios that require exploiting diverse, dynamic, large-scale collections of data. After some opening remarks, we motivate and contrast various graph-based data models, as well as languages used to query and validate knowledge graphs. We explain how knowledge can be represented and extracted using a combination of deductive and inductive techniques. We conclude with high-level future research directions for knowledge graphs.
This analysis demonstrates the relevance and robustness of the theory of planned behavior in the prediction of business start-up intentions and subsequent behavior based on longitudinal survey data (2011 and 2012; n = 969) from the adult population in Austria and Finland. By doing so, the study addresses two weaknesses in current research: the limited scope of samples used in the majority of prior studies and the scarcity of investigations studying the translation of entrepreneurial intentions into behavior. The paper discusses conceptual and methodological issues related to studying the intention–behavior relationship and outlines avenues for future research.
Ever since the Internet boom of the mid-1990s, firms have been experimenting with new ways of doing business and achieving their goals, which has led to a branching of the scholarly literature on business models. Three interpretations of the meaning and function of “business models” have emerged from the management literature: (1) business models as attributes of real firms, (2) business models as cognitive/linguistic schemas, and (3) business models as formal conceptual representations of how a business functions. Relatedly, a provocative debate about the relationship between business models and strategy has fascinated many scholars. We offer a critical review of this now vast business model literature with the goal of organizing the literature and achieving greater understanding of the larger picture in this increasingly important research area. In addition to complementing and extending prior reviews, we also aim at a second and more important contribution: We aim at identifying the reasons behind the apparent lack of agreement in the interpretation of business models, and the relationship between business models and strategy. Whether strategy scholars consider business model research a new field may be due to the fact that the business model perspective may be challenging the assumptions of traditional theories of value creation and capture by focusing on value creation on the demand side and supply side, rather than focusing on value creation on the supply side only as these theories have done. We conclude by discussing how the business model perspective can contribute to research in different fields, offering future research directions.
In recent years, food waste has received growing interest from local, national and European policymakers, international organisations, NGOs as well as academics from various disciplinary fields. Increasing concerns about food security and environmental impacts, such as resource depletion and greenhouse gas emissions attributed to food waste, have intensified attention to the topic. While food waste occurs in all stages of the food supply chain, private households have been identified as key actors in food waste generation. However, the evidence on why food waste occurs remains scattered. This paper maps the still small but expanding academic territory of consumer food waste by systematically reviewing empirical studies on food waste practices as well as distilling factors that foster and impede the generation of food waste on the household level. Moreover, we briefly discuss the contributions of different social ontologies, more particularly psychology-related approaches and social practice theory. The analysis reveals food waste as a complex and multi-faceted issue that cannot be attributed to single variables; this also calls for a stronger integration of different disciplinary perspectives. Mapping the determinants of waste generation deepens the understanding of household practices and helps design food waste prevention strategies. Finally, we link the identified factors with a set of policy, business, and retailer options.
During the last decade text mining has become a widely used discipline utilizing statistical and machine learning methods. We present the tm package which provides a framework for text mining applications within R. We give a survey on text mining facilities in R and explain how typical application tasks can be carried out using our framework. We present techniques for count-based analysis methods, text clustering, text classification and string kernels.
Over the past decades, the financial investment of non-financial businesses has been rising, and the accumulation of capital goods has been declining. The first part of the paper offers a novel theory to explain this phenomenon. Financialisation, the shareholder revolution and the development of a market for corporate control have shifted power to shareholders and thus changed management priorities, leading to a reduction in the desired growth rate. In the second part, the link between accumulation and financialisation is tested econometrically by means of a time series analysis of aggregate business investment for the USA, the UK, France and Germany. Extensive tests of robustness are performed. For the first three countries, evidence supporting the negative effect of financialisation on accumulation is found. Copyright 2004, Oxford University Press.
In recent years, food safety has become a pressing problem in China. Since the traditional agri-food logistics pattern can no longer meet the demands of the market, building an agri-food supply chain traceability system is becoming increasingly urgent. In this paper, we first study the utilization and development of RFID (Radio-Frequency IDentification) and blockchain technology, then analyze the advantages and disadvantages of using RFID and blockchain technology in building the agri-food supply chain traceability system, and finally demonstrate the building process of this system. By gathering, transferring and sharing authentic agri-food data across the production, processing, warehousing, distribution and selling links, the system can realize traceability with trusted information throughout the entire agri-food supply chain, which would effectively guarantee food safety.
Environmentally extended multiregional input-output (EE MRIO) tables have emerged as a key framework to provide a comprehensive description of the global economy and analyze its effects on the environment. Of the available EE MRIO databases, EXIOBASE stands out as a database compatible with the System of Environmental-Economic Accounting (SEEA) with a high sectorial detail matched with multiple social and environmental satellite accounts. In this paper, we present the latest developments realized with EXIOBASE 3, a time series of EE MRIO tables ranging from 1995 to 2011 for 44 countries (28 EU members plus 16 major economies) and five rest-of-the-world regions. EXIOBASE 3 builds upon the previous versions of EXIOBASE by using rectangular supply-use tables (SUTs) in a 163 industry by 200 products classification as the main building blocks. In order to capture structural changes, economic developments, as reported by national statistical agencies, were imposed on the available, disaggregated SUTs from EXIOBASE 2. These initial estimates were further refined by incorporating detailed data on energy, agricultural production, resource extraction, and bilateral trade. EXIOBASE 3 inherits the high level of environmental stressor detail from its precursor, with further improvement in the level of detail for resource extraction. To account for the expansion of the European Union (EU), EXIOBASE 3 was developed with the full EU28 country set (including the new member state Croatia). EXIOBASE 3 provides a unique tool for analyzing the dynamics of environmental pressures of economic activities over time.
This paper aims at exploring and discussing the possibilities of applying qualitative content analysis as a (text) interpretation method in case study research. First, case study research as a research strategy within qualitative social research is briefly presented. Then, a basic introduction to (qualitative) content analysis as an interpretation method for qualitative interviews and other data material is given. Finally, the use of qualitative content analysis for developing case studies is examined and evaluated. The author argues in favor of both case study research as a research strategy and qualitative content analysis as a method of examination of data material and seeks to encourage the integration of qualitative content analysis into the data analysis in case study research. URN: urn:nbn:de:0114-fqs0601211
In response to the 2013 Update of the European Strategy for Particle Physics, the Future Circular Collider (FCC) study was launched, as an international collaboration hosted by CERN. This study covers a highest-luminosity high-energy lepton collider (FCC-ee) and an energy-frontier hadron collider (FCC-hh), which could, successively, be installed in the same 100 km tunnel. The scientific capabilities of the integrated FCC programme would serve the worldwide community throughout the 21st century. The FCC study also investigates an LHC energy upgrade, using FCC-hh technology. This document constitutes the second volume of the FCC Conceptual Design Report, devoted to the electron-positron collider FCC-ee. After summarizing the physics discovery opportunities, it presents the accelerator design, performance reach, a staged operation scenario, the underlying technologies, civil engineering, technical infrastructure, and an implementation plan. FCC-ee can be built with today's technology. Most of the FCC-ee infrastructure could be reused for FCC-hh. Combining concepts from past and present lepton colliders and adding a few novel elements, the FCC-ee design promises outstandingly high luminosity. This will make the FCC-ee a unique precision instrument to study the heaviest known particles (Z, W and H bosons and the top quark), offering great direct and indirect sensitivity to new physics.
On online social networks such as Facebook, massive self-disclosure by users has attracted the attention of industry players and policymakers worldwide. Despite the impressive scope of this phenomenon, very little is understood about what motivates users to disclose personal information. Integrating focus group results into a theoretical privacy calculus framework, we develop and empirically test a Structural Equation Model of self-disclosure with 259 subjects. We find that users are primarily motivated to disclose information because of the convenience of maintaining and developing relationships and platform enjoyment. Countervailing these benefits, privacy risks represent a critical barrier to information disclosure. However, users’ perception of risk can be mitigated by their trust in the network provider and availability of control options. Based on these findings, we offer recommendations for network providers.
We define a subclass of dynamic linear models with unknown hyperparameter called d-inverse-gamma models. We then approximate the marginal probability density functions of the hyperparameter and the state vector by the data augmentation algorithm of Tanner and Wong. We prove that the regularity conditions for convergence hold. For practical implementation a forward-filtering-backward-sampling algorithm is suggested, and the relation to Gibbs sampling is discussed in detail.