Beijing Institute of Big Data Research

Facility · Beijing, China

Research output, citation impact, and the most-cited recent papers from Beijing Institute of Big Data Research (China). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works: 5.2K

Citations: 176.8K

h-index: 156

i10-index: 3.2K

Also known as

Beijing Institute of Big Data Research, 北京大数据研究院

Top-cited papers from Beijing Institute of Big Data Research

Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting

Bing Yu, Haoteng Yin, Zhanxing Zhu

2018 · 3.2K citations · doi:10.24963/ijcai.2018/505

Timely accurate traffic forecast is crucial for urban traffic control and guidance. Due to the high nonlinearity and complexity of traffic flow, traditional methods cannot satisfy the requirements of mid-and-long term prediction tasks and often neglect spatial and temporal dependencies. In this paper, we propose a novel deep learning framework, Spatio-Temporal Graph Convolutional Networks (STGCN), to tackle the time series prediction problem in traffic domain. Instead of applying regular convolutional and recurrent units, we formulate the problem on graphs and build the model with complete convolutional structures, which enable much faster training speed with fewer parameters. Experiments show that our model STGCN effectively captures comprehensive spatio-temporal correlations through modeling multi-scale traffic networks and consistently outperforms state-of-the-art baselines on various real-world traffic datasets.

Deep Potential Molecular Dynamics: A Scalable Model with the Accuracy of Quantum Mechanics

Linfeng Zhang, Jiequn Han, Handong Wang, Roberto Car +1 more

2018 · Physical Review Letters · 2.2K citations · doi:10.1103/physrevlett.120.143001

We introduce a scheme for molecular simulations, the deep potential molecular dynamics (DPMD) method, based on a many-body potential and interatomic forces generated by a carefully crafted deep neural network trained with ab initio data. The neural network model preserves all the natural symmetries in the problem. It is first-principles based in the sense that there are no ad hoc components aside from the network model. We show that the proposed scheme provides an efficient and accurate protocol in a variety of systems, including bulk materials and molecules. In all these cases, DPMD gives results that are essentially indistinguishable from the original data, at a cost that scales linearly with system size.

Solving high-dimensional partial differential equations using deep learning

Jiequn Han, Arnulf Jentzen, E Weinan

2018 · Proceedings of the National Academy of Sciences · 1.7K citations · doi:10.1073/pnas.1718942115

Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality." This paper introduces a deep learning-based approach that can handle general high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using backward stochastic differential equations and the gradient of the unknown solution is approximated by neural networks, very much in the spirit of deep reinforcement learning with the gradient acting as the policy function. Numerical results on examples including the nonlinear Black-Scholes equation, the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation suggest that the proposed algorithm is quite effective in high dimensions, in terms of both accuracy and cost. This opens up possibilities in economics, finance, operational research, and physics, by considering all participating agents, assets, resources, or particles together at the same time, instead of making ad hoc assumptions on their interrelationships.

Technologies and perspectives for achieving carbon neutrality

Fang Wang, Jean Damascene Harindintwali, Zhizhang Yuan, Min Wang +4 more

2021 · The Innovation · 1.3K citations · doi:10.1016/j.xinn.2021.100180

Global development has been heavily reliant on the overexploitation of natural resources since the Industrial Revolution. With the extensive use of fossil fuels, deforestation, and other forms of land-use change, anthropogenic activities have contributed to the ever-increasing concentrations of greenhouse gases (GHGs) in the atmosphere, causing global climate change. In response to the worsening global climate change, achieving carbon neutrality by 2050 is the most pressing task on the planet. To this end, it is of utmost importance and a significant challenge to reform the current production systems to reduce GHG emissions and promote the capture of CO2 from the atmosphere. Herein, we review innovative technologies that offer solutions achieving carbon (C) neutrality and sustainable development, including those for renewable energy production, food system transformation, waste valorization, C sink conservation, and C-negative manufacturing. The wealth of knowledge disseminated in this review could inspire the global community and drive the further development of innovative technologies to mitigate climate change and sustainably support human activities.

Cascaded Partial Decoder for Fast and Accurate Salient Object Detection

Zhe Wu, Li Su, Qingming Huang

2019 · 1.1K citations · doi:10.1109/cvpr.2019.00403

Existing state-of-the-art salient object detection networks rely on aggregating multi-level features of pre-trained convolutional neural networks (CNNs). However, compared to high-level features, low-level features contribute less to performance. Meanwhile, they raise more computational cost because of their larger spatial resolutions. In this paper, we propose a novel Cascaded Partial Decoder (CPD) framework for fast and accurate salient object detection. On the one hand, the framework constructs partial decoder which discards larger resolution features of shallow layers for acceleration. On the other hand, we observe that integrating features of deep layers will obtain relatively precise saliency map. Therefore we directly utilize generated saliency map to recurrently optimize features of deep layers. This strategy efficiently suppresses distractors in the features and significantly improves their representation ability. Experiments conducted on five benchmark datasets exhibit that the proposed model not only achieves state-of-the-art but also runs much faster than existing models. Besides, we apply the proposed framework to optimize existing multi-level feature aggregation models and significantly improve their efficiency and accuracy.

Contemporary status of insecticide resistance in the major Aedes vectors of arboviruses infecting humans

Catherine L. Moyes, John Vontas, Ademir Jesus Martins, Lee Ching Ng +4 more

2017 · PLoS Neglected Tropical Diseases · 954 citations · doi:10.1371/journal.pntd.0005625

Both Aedes aegypti and Ae. albopictus are major vectors of 5 important arboviruses (namely chikungunya virus, dengue virus, Rift Valley fever virus, yellow fever virus, and Zika virus), making these mosquitoes an important factor in the worldwide burden of infectious disease. Vector control using insecticides coupled with larval source reduction is critical to control the transmission of these viruses to humans but is threatened by the emergence of insecticide resistance. Here, we review the available evidence for the geographical distribution of insecticide resistance in these 2 major vectors worldwide and map the data collated for the 4 main classes of neurotoxic insecticide (carbamates, organochlorines, organophosphates, and pyrethroids). Emerging resistance to all 4 of these insecticide classes has been detected in the Americas, Africa, and Asia. Target-site mutations and increased insecticide detoxification have both been linked to resistance in Ae. aegypti and Ae. albopictus but more work is required to further elucidate metabolic mechanisms and develop robust diagnostic assays. Geographical distributions are provided for the mechanisms that have been shown to be important to date. Estimating insecticide resistance in unsampled locations is hampered by a lack of standardisation in the diagnostic tools used and by a lack of data in a number of regions for both resistance phenotypes and genotypes. The need for increased sampling using standard methods is critical to tackle the issue of emerging insecticide resistance threatening human health. Specifically, diagnostic doses and well-characterised susceptible strains are needed for the full range of insecticides used to control Ae. aegypti and Ae. albopictus to standardise measurement of the resistant phenotype, and calibrated diagnostic assays are needed for the major mechanisms of resistance.

VerifyNet: Secure and Verifiable Federated Learning

Guowen Xu, Hongwei Li, Sen Liu, Kan Yang +1 more

2019 · IEEE Transactions on Information Forensics and Security · 784 citations · doi:10.1109/tifs.2019.2929409

As an emerging training model with neural networks, federated learning has received widespread attention due to its ability to update parameters without collecting users' raw data. However, since adversaries can track and derive participants' privacy from the shared gradients, federated learning is still exposed to various security and privacy threats. In this paper, we consider two major issues in the training process over deep neural networks (DNNs): 1) how to protect user's privacy (i.e., local gradients) in the training process and 2) how to verify the integrity (or correctness) of the aggregated results returned from the server. To solve the above problems, several approaches focusing on secure or privacy-preserving federated learning have been proposed and applied in diverse scenarios. However, it is still an open problem enabling clients to verify whether the cloud server is operating correctly, while guaranteeing user's privacy in the training process. In this paper, we propose VerifyNet, the first privacy-preserving and verifiable federated learning framework. In specific, we first propose a double-masking protocol to guarantee the confidentiality of users' local gradients during the federated learning. Then, the cloud server is required to provide the Proof about the correctness of its aggregated results to each user. We claim that it is impossible that an adversary can deceive users by forging Proof, unless it can solve the NP-hard problem adopted in our model. In addition, VerifyNet is also supportive of users dropping out during the training process. The extensive experiments conducted on real-world data also demonstrate the practical performance of our proposed scheme.

A Survey on Feature Selection

Jianyu Miao, Lingfeng Niu

2016 · Procedia Computer Science · 552 citations · doi:10.1016/j.procs.2016.07.111

Feature selection, as a dimensionality reduction technique, aims to choose a small subset of the relevant features from the original features by removing irrelevant, redundant or noisy features. Feature selection usually can lead to better learning performance, i.e., higher learning accuracy, lower computational cost, and better model interpretability. Recently, researchers from computer vision, text mining and so on have proposed a variety of feature selection algorithms and in terms of theory and experiment, show the effectiveness of their works. This paper is aimed at reviewing the state of the art on these techniques. Furthermore, a thorough experiment is conducted to check if the use of feature selection can improve the performance of learning, considering some of the approaches mentioned in the literature. The experimental results show that unsupervised feature selection algorithms benefit machine learning tasks, improving the performance of clustering.

Multi-grained Attention Network for Aspect-Level Sentiment Classification

Feifan Fan, Yansong Feng, Dongyan Zhao

2018 · 520 citations · doi:10.18653/v1/d18-1380

We propose a novel multi-grained attention network (MGAN) model for aspect level sentiment classification. Existing approaches mostly adopt coarse-grained attention mechanism, which may bring information loss if the aspect has multiple words or larger context. We propose a fine-grained attention mechanism, which can capture the word-level interaction between aspect and context. And then we leverage the fine-grained and coarse-grained attention mechanisms to compose the MGAN framework. Moreover, unlike previous works which train each aspect with its context separately, we design an aspect alignment loss to depict the aspect-level interactions among the aspects that have the same context. We evaluate the proposed approach on three datasets: laptop and restaurant are from SemEval 2014, and the last one is a twitter dataset. Experimental results show that the multi-grained attention network consistently outperforms the state-of-the-art methods on all three datasets. We also conduct experiments to evaluate the effectiveness of aspect alignment loss, which indicates the aspect-level interactions can bring extra useful information and further improve the performance.

A Candida auris Outbreak and Its Control in an Intensive Care Setting

David W. Eyre, Anna E. Sheppard, Hilary Madder, Ian Moir +4 more

2018 · New England Journal of Medicine · 504 citations · doi:10.1056/nejmoa1714373

BACKGROUND: Candida auris is an emerging and multidrug-resistant pathogen. Here we report the epidemiology of a hospital outbreak of C. auris colonization and infection. METHODS: After identification of a cluster of C. auris infections in the neurosciences intensive care unit (ICU) of the Oxford University Hospitals, United Kingdom, we instituted an intensive patient and environmental screening program and package of interventions. Multivariable logistic regression was used to identify predictors of C. auris colonization and infection. Isolates from patients and from the environment were analyzed by whole-genome sequencing. RESULTS: A total of 70 patients were identified as being colonized or infected with C. auris between February 2, 2015, and August 31, 2017; of these patients, 66 (94%) had been admitted to the neurosciences ICU before diagnosis. Invasive C. auris infections developed in 7 patients. When length of stay in the neurosciences ICU and patient vital signs and laboratory results were controlled for, the predictors of C. auris colonization or infection included the use of reusable skin-surface axillary temperature probes (multivariable odds ratio, 6.80; 95% confidence interval [CI], 2.96 to 15.63; P<0.001) and systemic fluconazole exposure (multivariable odds ratio, 10.34; 95% CI, 1.64 to 65.18; P=0.01). C. auris was rarely detected in the general environment. However, it was detected in isolates from reusable equipment, including multiple axillary skin-surface temperature probes. Despite a bundle of infection-control interventions, the incidence of new cases was reduced only after removal of the temperature probes. All outbreak sequences formed a single genetic cluster within the C. auris South African clade. The sequenced isolates from reusable equipment were genetically related to isolates from the patients. CONCLUSIONS: The transmission of C. auris in this hospital outbreak was found to be linked to reusable axillary temperature probes, indicating that this emerging pathogen can persist in the environment and be transmitted in health care settings. (Funded by the National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance at Oxford University and others.)

Effect of Covid-19 Vaccination on Transmission of Alpha and Delta Variants

David W. Eyre, Donald Taylor, M. B. Purver, David Chapman +4 more

2022 · New England Journal of Medicine · 503 citations · doi:10.1056/nejmoa2116597

BACKGROUND: Before the emergence of the B.1.617.2 (delta) variant of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), vaccination reduced transmission of SARS-CoV-2 from vaccinated persons who became infected, potentially by reducing viral loads. Although vaccination still lowers the risk of infection, similar viral loads in vaccinated and unvaccinated persons who are infected with the delta variant call into question the degree to which vaccination prevents transmission. METHODS: We used contact-testing data from England to perform a retrospective observational cohort study involving adult contacts of SARS-CoV-2-infected adult index patients. We used multivariable Poisson regression to investigate associations between transmission and the vaccination status of index patients and contacts and to determine how these associations varied with the B.1.1.7 (alpha) and delta variants and time since the second vaccination. RESULTS: Among 146,243 tested contacts of 108,498 index patients, 54,667 (37%) had positive SARS-CoV-2 polymerase-chain-reaction (PCR) tests. In index patients who became infected with the alpha variant, two vaccinations with either BNT162b2 or ChAdOx1 nCoV-19 (also known as AZD1222), as compared with no vaccination, were independently associated with reduced PCR positivity in contacts (adjusted rate ratio with BNT162b2, 0.32; 95% confidence interval [CI], 0.21 to 0.48; and with ChAdOx1 nCoV-19, 0.48; 95% CI, 0.30 to 0.78). Vaccine-associated reductions in transmission of the delta variant were smaller than those with the alpha variant, and reductions in transmission of the delta variant after two BNT162b2 vaccinations were greater (adjusted rate ratio for the comparison with no vaccination, 0.50; 95% CI, 0.39 to 0.65) than after two ChAdOx1 nCoV-19 vaccinations (adjusted rate ratio, 0.76; 95% CI, 0.70 to 0.82). Variation in cycle-threshold (Ct) values (indicative of viral load) in index patients explained 7 to 23% of vaccine-associated reductions in transmission of the two variants. The reductions in transmission of the delta variant declined over time after the second vaccination, reaching levels that were similar to those in unvaccinated persons by 12 weeks in index patients who had received ChAdOx1 nCoV-19 and attenuating substantially in those who had received BNT162b2. Protection in contacts also declined in the 3-month period after the second vaccination. CONCLUSIONS: Vaccination was associated with a smaller reduction in transmission of the delta variant than of the alpha variant, and the effects of vaccination decreased over time. PCR Ct values at diagnosis of the index patient only partially explained decreased transmission. (Funded by the U.K. Government Department of Health and Social Care and others.)

Universal Domain Adaptation

Kaichao You, Mingsheng Long, Zhangjie Cao, Jianmin Wang +1 more

2019 · 482 citations · doi:10.1109/cvpr.2019.00283

Domain adaptation aims to transfer knowledge in the presence of the domain gap. Existing domain adaptation methods rely on rich prior knowledge about the relationship between the label sets of source and target domains, which greatly limits their application in the wild. This paper introduces Universal Domain Adaptation (UDA) that requires no prior knowledge on the label sets. For a given source label set and a target label set, they may contain a common label set and hold a private label set respectively, bringing up an additional category gap. UDA requires a model to either (1) classify the target sample correctly if it is associated with a label in the common label set, or (2) mark it as "unknown" otherwise. More importantly, a UDA model should work stably against a wide spectrum of commonness (the proportion of the common label set over the complete label set) so that it can handle real-world problems with unknown target label sets. To solve the universal domain adaptation problem, we propose Universal Adaptation Network (UAN). It quantifies sample-level transferability to discover the common label set and the label sets private to each domain, thereby promoting the adaptation in the automatically discovered common label set and recognizing the "unknown" samples successfully. A thorough evaluation shows that UAN outperforms the state of the art closed set, partial and open set domain adaptation methods in the novel UDA setting.
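The commonness measure mentioned in the abstract above (the proportion of the common label set over the complete label set) can be illustrated with a minimal sketch. It assumes the complete label set is the union of the source and target label sets. The function name `commonness` and the example labels are hypothetical and are not taken from the paper.

```python
def commonness(source_labels, target_labels):
    """Share of labels common to both domains, relative to the union of all labels."""
    source, target = set(source_labels), set(target_labels)
    union = source | target
    if not union:
        raise ValueError("at least one label set must be non-empty")
    return len(source & target) / len(union)


# Hypothetical label sets: classes 0-9 in the source domain, 5-14 in the target.
source = range(10)
target = range(5, 15)
print(commonness(source, target))
```

Under this reading, identical label sets give a commonness of 1 (the closed-set case) and disjoint label sets give 0.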

Improving Sequential Recommendation with Knowledge-Enhanced Memory Networks

Jin Huang, Wayne Xin Zhao, Hongjian Dou, Ji-Rong Wen +1 more

2018 · 474 citations · doi:10.1145/3209978.3210017

With the revival of neural networks, many studies try to adapt powerful sequential neural models, i.e., Recurrent Neural Networks (RNN), to sequential recommendation. RNN-based networks encode historical interaction records into a hidden state vector. Although the state vector is able to encode sequential dependency, it still has limited representation power in capturing complicated user preference. It is difficult to capture fine-grained user preference from the interaction sequence. Furthermore, the latent vector representation is usually hard to understand and explain. To address these issues, in this paper, we propose a novel knowledge enhanced sequential recommender. Our model integrates the RNN-based networks with Key-Value Memory Network (KV-MN). We further incorporate knowledge base (KB) information to enhance the semantic representation of KV-MN. RNN-based models are good at capturing sequential user preference, while knowledge-enhanced KV-MNs are good at capturing attribute-level user preference. By using a hybrid of RNNs and KV-MNs, it is expected to be endowed with both benefits from these two components. The sequential preference representation together with the attribute-level preference representation are combined as the final representation of user preference. With the incorporation of KB information, our model is also highly interpretable. To our knowledge, it is the first time that sequential recommender is integrated with external memories by leveraging large-scale KB information.

Association between Mental Disorders and Subsequent Medical Conditions

Natalie C. Momen, Oleguer Plana‐Ripoll, Esben Agerbo, Michael E. Benros +4 more

2020 · New England Journal of Medicine · 472 citations · doi:10.1056/nejmoa1915784

BACKGROUND: Persons with mental disorders are at a higher risk than the general population for the subsequent development of certain medical conditions. METHODS: We used a population-based cohort from Danish national registries that included data on more than 5.9 million persons born in Denmark from 1900 through 2015 and followed them from 2000 through 2016, for a total of 83.9 million person-years. We assessed 10 broad types of mental disorders and 9 broad categories of medical conditions (which encompassed 31 specific conditions). We used Cox regression models to calculate overall hazard ratios and time-dependent hazard ratios for pairs of mental disorders and medical conditions, after adjustment for age, sex, calendar time, and previous mental disorders. Absolute risks were estimated with the use of competing-risks survival analyses. RESULTS: A total of 698,874 of 5,940,299 persons (11.8%) were identified as having a mental disorder. The median age of the total population was 32.1 years at entry into the cohort and 48.7 years at the time of the last follow-up. Persons with a mental disorder had a higher risk than those without such disorders with respect to 76 of 90 pairs of mental disorders and medical conditions. The median hazard ratio for an association between a mental disorder and a medical condition was 1.37. The lowest hazard ratio was 0.82 for organic mental disorders and the broad category of cancer (95% confidence interval [CI], 0.80 to 0.84), and the highest was 3.62 for eating disorders and urogenital conditions (95% CI, 3.11 to 4.22). Several specific pairs showed a reduced risk (e.g., schizophrenia and musculoskeletal conditions). Risks varied according to the time since the diagnosis of a mental disorder. The absolute risk of a medical condition within 15 years after a mental disorder was diagnosed varied from 0.6% for a urogenital condition among persons with a developmental disorder to 54.1% for a circulatory disorder among those with an organic mental disorder. CONCLUSIONS: Most mental disorders were associated with an increased risk of a subsequent medical condition; hazard ratios ranged from 0.82 to 3.62 and varied according to the time since the diagnosis of the mental disorder. (Funded by the Danish National Research Foundation and others; COMO-GMC ClinicalTrials.gov number, NCT03847753.)

Stacked Cross Refinement Network for Edge-Aware Salient Object Detection

Zhe Wu, Li Su, Qingming Huang

2019 · 463 citations · doi:10.1109/iccv.2019.00736

Salient object detection is a fundamental computer vision task. The majority of existing algorithms focus on aggregating multi-level features of pre-trained convolutional neural networks. Moreover, some researchers attempt to utilize edge information for auxiliary training. However, existing edge-aware models design unidirectional frameworks which only use edge features to improve the segmentation features. Motivated by the logical interrelations between binary segmentation and edge maps, we propose a novel Stacked Cross Refinement Network (SCRN) for salient object detection in this paper. Our framework aims to simultaneously refine multi-level features of salient object detection and edge detection by stacking Cross Refinement Unit (CRU). According to the logical interrelations, the CRU designs two direction-specific integration operations, and bidirectionally passes messages between the two tasks. Incorporating the refined edge-preserving features with the typical U-Net, our model detects salient objects accurately. Extensive experiments conducted on six benchmark datasets demonstrate that our method outperforms existing state-of-the-art algorithms in both accuracy and efficiency. Besides, the attribute-based performance on the SOC dataset show that the proposed model ranks first in the majority of challenging scenes. Code can be found at https://github.com/wuzhe71/SCAN.

Memory in Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity From Spatiotemporal Dynamics

Yunbo Wang, Jianjin Zhang, Hongyu Zhu, Mingsheng Long +2 more

2019 · 458 citations · doi:10.1109/cvpr.2019.00937

Natural spatiotemporal processes can be highly non-stationary in many ways, e.g. the low-level non-stationarity such as spatial correlations or temporal dependencies of local pixel values; and the high-level variations such as the accumulation, deformation or dissipation of radar echoes in precipitation forecasting. From Cramer's Decomposition, any non-stationary process can be decomposed into deterministic, time-variant polynomials, plus a zero-mean stochastic term. By applying differencing operations appropriately, we may turn time-variant polynomials into a constant, making the deterministic component predictable. However, most previous recurrent neural networks for spatiotemporal prediction do not use the differential signals effectively, and their relatively simple state transition functions prevent them from learning too complicated variations in spacetime. We propose the Memory In Memory (MIM) networks and corresponding recurrent blocks for this purpose. The MIM blocks exploit the differential signals between adjacent recurrent states to model the non-stationary and approximately stationary properties in spatiotemporal dynamics with two cascaded, self-renewed memory modules. By stacking multiple MIM blocks, we could potentially handle higher-order non-stationarity. The MIM networks achieve the state-of-the-art results on four spatiotemporal prediction tasks across both synthetic and real-world datasets. We believe that the general idea of this work can be potentially applied to other time-series forecasting tasks.
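The differencing argument in the abstract above can be checked numerically: differencing a polynomial trend of degree d a total of d times leaves a constant. The sketch below is plain Python and does not depend on the MIM architecture. The helper `difference` and the sample quadratic trend are illustrative assumptions, not part of the paper.

```python
def difference(series, order=1):
    """Apply first-order differencing (x[t] - x[t-1]) `order` times."""
    values = list(series)
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


# Hypothetical deterministic trend: a quadratic polynomial in t.
trend = [3 * t**2 + 2 * t + 1 for t in range(8)]

# Differencing a degree-2 polynomial twice leaves a constant sequence.
second_diff = difference(trend, order=2)
print(second_diff)
```

Each remaining entry equals twice the leading coefficient of the quadratic, which is why the deterministic component becomes trivially predictable after differencing.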

The expressive power of neural networks: a view from the width

Lu Zhou, Hongming Pu, Feicheng Wang, Zhiqiang Hu +1 more

2017 · Neural Information Processing Systems · 446 citations

The expressive power of neural networks is important for understanding deep learning. Most existing works consider this problem from the view of the depth of a network. In this paper, we study how width affects the expressiveness of neural networks. Classical results state that depth-bounded (e.g. depth-2) networks with suitable activation functions are universal approximators. We show a universal approximation theorem for width-bounded ReLU networks: width-(n + 4) ReLU networks, where n is the input dimension, are universal approximators. Moreover, except for a measure zero set, all functions cannot be approximated by width-n ReLU networks, which exhibits a phase transition. Several recent works demonstrate the benefits of depth by proving the depth-efficiency of neural networks. That is, there are classes of deep networks which cannot be realized by any shallow network whose size is no more than an exponential bound. Here we pose the dual question on the width-efficiency of ReLU networks: Are there wide networks that cannot be realized by narrow networks whose size is not substantially larger? We show that there exist classes of wide networks which cannot be realized by any narrow network whose depth is no more than a polynomial bound. On the other hand, we demonstrate by extensive experiments that narrow networks whose size exceed the polynomial bound by a constant factor can approximate wide and shallow network with high accuracy. Our results provide more comprehensive evidence that depth may be more effective than width for the expressiveness of ReLU networks.

Counterfactual VQA: A Cause-Effect Look at Language Bias

Yulei Niu, Kaihua Tang, Hanwang Zhang, Zhiwu Lu +2 more

2021 · 389 citations · doi:10.1109/cvpr46437.2021.01251

VQA models may tend to rely on language bias as a shortcut and thus fail to sufficiently learn the multi-modal knowledge from both vision and language. Recent debiasing methods proposed to exclude the language prior during inference. However, they fail to disentangle the "good" language context and "bad" language bias from the whole. In this paper, we investigate how to mitigate language bias in VQA. Motivated by causal effects, we proposed a novel counterfactual inference framework, which enables us to capture the language bias as the direct causal effect of questions on answers and reduce the language bias by subtracting the direct language effect from the total causal effect. Experiments demonstrate that our proposed counterfactual inference framework 1) is general to various VQA backbones and fusion strategies, 2) achieves competitive performance on the language-bias sensitive VQA-CP dataset while performs robustly on the balanced VQA v2 dataset without any augmented data. The code is available at https://github.com/yuleiniu/cfvqa.

Evaluating Object Hallucination in Large Vision-Language Models

Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang +2 more

2023 · 346 citations · doi:10.18653/v1/2023.emnlp-main.20

Inspired by the superior language abilities of large language models (LLM), large vision-language models (LVLM) have been recently proposed by integrating powerful LLMs for improving the performance on complex multimodal tasks. Despite the promising progress on LVLMs, we find that they suffer from object hallucinations, i.e., they tend to generate objects inconsistent with the target images in the descriptions. To investigate it, this work presents the first systematic study on object hallucination of LVLMs. We conduct the evaluation experiments on several representative LVLMs, and show that they mostly suffer from severe object hallucination issues. We further discuss that the visual instructions may influence the hallucination, and find that: objects that frequently appear in the visual instructions or co-occur with the image objects are obviously prone to be hallucinated by LVLMs. Besides, we further design a polling-based query method called POPE for better evaluation of object hallucination. Experiment results show that our POPE can evaluate object hallucination in a more stable and flexible way.

GLC_FCS30D: the first global 30 m land-cover dynamics monitoring product with a fine classification system for the period from 1985 to 2022 generated using dense-time-series Landsat imagery and the continuous change-detection method

Xiao Zhang, Tingting Zhao, Xu Hong, Wendi Liu +3 more

2024 · Earth System Science Data · 346 citations · doi:10.5194/essd-16-1353-2024

Land-cover change has been identified as an important cause or driving force of global climate change and is a significant research topic. Over the past few decades, global land-cover mapping has progressed; however, long-time-series global land-cover-change monitoring data are still sparse, especially those at 30 m resolution. In this study, we describe GLC_FCS30D, a novel global 30 m land-cover dynamics monitoring dataset containing 35 land-cover subcategories and covering the period 1985–2022 in 26 time steps (maps were updated every 5 years before 2000 and annually after 2000). GLC_FCS30D has been developed using continuous change detection and all available Landsat imagery based on the Google Earth Engine platform. Specifically, we first take advantage of the continuous change-detection model and the full time series of Landsat observations to capture the time points of changed pixels and identify the temporally stable areas. Then, we apply a spatiotemporal refinement method to derive the globally distributed and high-confidence training samples from these temporally stable areas. Next, local adaptive classification models are used to update the land-cover information for the changed pixels, and a temporal-consistency optimization algorithm is adopted to improve their temporal stability and suppress some false changes. Further, the GLC_FCS30D product is validated using 84 526 globally distributed validation samples from 2020. It achieves an overall accuracy of 80.88 % (±0.27 %) for the basic classification system (10 major land-cover types) and 73.04 % (±0.30 %) for the LCCS (Land Cover Classification System) level-1 validation system (17 LCCS land-cover types). Meanwhile, two third-party time-series datasets used for validation from the United States and European Union are also collected for analyzing accuracy variations, and the results show that GLC_FCS30D offers significant stability in terms of variation across the accuracy time series and achieves mean accuracies of 79.50 % (±0.50 %) and 81.91 % (±0.09 %) over the two regions. Lastly, we draw conclusions about the global land-cover-change information from the GLC_FCS30D dataset; namely, that forest and cropland variations have dominated global land-cover change over the past 37 years, the net loss of forests reached about 2.5 million km², and the net gain in cropland area is approximately 1.3 million km². Therefore, the novel dataset GLC_FCS30D is an accurate land-cover-dynamics time-series monitoring product that benefits from its diverse classification system, high spatial resolution, and long time span (1985–2022); thus, it will effectively support global climate change research and promote sustainable development analysis. The GLC_FCS30D dataset is available via https://doi.org/10.5281/zenodo.8239305 (Liu et al., 2023).
