Shandong Institute of Automation

Facility · Jinan, China

Research output, citation impact, and the most-cited recent papers from Shandong Institute of Automation (China). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works

17.4K

Citations

997.8K

h-index

330

i10-index

15.9K
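For readers unfamiliar with the two index metrics above, the sketch below computes them from a list of per-paper citation counts using the standard definitions. Whether NobleBlocks computes its figures in exactly this way is an assumption, and the sample counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that at least h works have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of works with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical citation counts for six works (not taken from the index).
sample = [25, 12, 10, 4, 3, 0]
print(h_index(sample))    # 4
print(i10_index(sample))  # 3
```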

Also known as

Shandong Institute of Automation · 山东省科学院自动化研究所 (Institute of Automation, Shandong Academy of Sciences)

Top-cited papers from Shandong Institute of Automation

Dual Attention Network for Scene Segmentation

Jun Fu, Jing Liu, Haijie Tian, Yong Li +3 more

2019 · 6.8K citations · doi:10.1109/cvpr.2019.00326

In this paper, we address the scene segmentation task by capturing rich contextual dependencies based on the self-attention mechanism. Unlike previous works that capture contexts by multi-scale feature fusion, we propose a Dual Attention Network (DANet) to adaptively integrate local features with their global dependencies. Specifically, we append two types of attention modules on top of a traditional dilated FCN, which model the semantic interdependencies in the spatial and channel dimensions respectively. The position attention module selectively aggregates the features at each position by a weighted sum of the features at all positions. Similar features would be related to each other regardless of their distances. Meanwhile, the channel attention module selectively emphasizes interdependent channel maps by integrating associated features among all channel maps. We sum the outputs of the two attention modules to further improve the feature representation, which contributes to more precise segmentation results. We achieve new state-of-the-art segmentation performance on three challenging scene segmentation datasets, i.e., Cityscapes, PASCAL Context and COCO Stuff. In particular, a Mean IoU score of 81.5% on the Cityscapes test set is achieved without using coarse data.
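The two attention modules described in this abstract can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' implementation: it assumes a feature map already flattened to shape (C, N), uses raw feature similarity in place of any learned projections, and omits residual connections and scaling parameters. All function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def position_attention(feat):
    """feat has shape (C, N): C channels over N = H*W flattened positions.

    Each output position is a weighted sum of the features at all positions,
    weighted by pairwise feature similarity (assumption: no learned layers).
    """
    weights = softmax(feat.T @ feat, axis=-1)  # (N, N); row i weights every position
    return feat @ weights.T                    # (C, N)


def channel_attention(feat):
    """Each output channel map is a weighted sum of all channel maps."""
    weights = softmax(feat @ feat.T, axis=-1)  # (C, C)
    return weights @ feat                      # (C, N)


def dual_attention(feat):
    """Sum the outputs of the two modules, as the abstract describes."""
    return position_attention(feat) + channel_attention(feat)


rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 6))   # toy input: 4 channels, 6 positions
fused = dual_attention(feat)
print(fused.shape)               # (4, 6)
```

Because the position weights depend only on feature similarity, two positions with similar features attend to each other regardless of how far apart they are in the image.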

Traffic Flow Prediction With Big Data: A Deep Learning Approach

Yisheng Lv, Yanjie Duan, Wenwen Kang, Zhengxi Li +1 more

2014 · IEEE Transactions on Intelligent Transportation Systems · 3.0K citations · doi:10.1109/tits.2014.2345663

Accurate and timely traffic flow information is important for the successful deployment of intelligent transportation systems. Over the last few years, traffic data have been exploding, and we have truly entered the era of big data for transportation. Existing traffic flow prediction methods mainly use shallow traffic prediction models and are still unsatisfying for many real-world applications. This situation inspires us to rethink the traffic flow prediction problem based on deep architecture models with big traffic data. In this paper, a novel deep-learning-based traffic flow prediction method is proposed, which considers the spatial and temporal correlations inherently. A stacked autoencoder model is used to learn generic traffic flow features, and it is trained in a greedy layerwise fashion. To the best of our knowledge, this is the first time that a deep architecture model is applied using autoencoders as building blocks to represent traffic flow features for prediction. Moreover, experiments demonstrate that the proposed method for traffic flow prediction has superior performance.

A Survey on Evaluation of Large Language Models

Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu +4 more

2024 · ACM Transactions on Intelligent Systems and Technology · 2.4K citations · doi:10.1145/3641289

Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at the society level for better understanding of their potential risks. Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. Firstly, we provide an overview from the perspective of evaluation tasks, encompassing general natural language processing tasks, reasoning, medical usage, ethics, education, natural and social sciences, agent applications, and other areas. Secondly, we answer the ‘where’ and ‘how’ questions by diving into the evaluation methods and benchmarks, which serve as crucial components in assessing the performance of LLMs. Then, we summarize the success and failure cases of LLMs in different tasks. Finally, we shed light on several future challenges that lie ahead in LLMs evaluation. Our aim is to offer invaluable insights to researchers in the realm of LLMs evaluation, thereby aiding the development of more proficient LLMs. Our key point is that evaluation should be treated as an essential discipline to better assist the development of LLMs. We consistently maintain the related open-source materials at: https://github.com/MLGroupJLU/LLM-eval-survey

Person re-identification by Local Maximal Occurrence representation and metric learning

Shengcai Liao, Yang Hu, Xiangyu Zhu, Stan Z. Li

2015 · 2.2K citations · doi:10.1109/cvpr.2015.7298832

Person re-identification is an important technique towards automatic search of a person's presence in a surveillance video. Two fundamental problems are critical for person re-identification: feature representation and metric learning. An effective feature representation should be robust to illumination and viewpoint changes, and a discriminant metric should be learned to match various person images. In this paper, we propose an effective feature representation called Local Maximal Occurrence (LOMO), and a subspace and metric learning method called Cross-view Quadratic Discriminant Analysis (XQDA). The LOMO feature analyzes the horizontal occurrence of local features, and maximizes the occurrence to make a stable representation against viewpoint changes. Besides, to handle illumination variations, we apply the Retinex transform and a scale invariant texture operator. To learn a discriminant metric, we propose to learn a discriminant low dimensional subspace by cross-view quadratic discriminant analysis, and simultaneously, a QDA metric is learned on the derived subspace. We also present a practical computation method for XQDA, as well as its regularization. Experiments on four challenging person re-identification databases, VIPeR, QMUL GRID, CUHK Campus, and CUHK03, show that the proposed method improves the state-of-the-art rank-1 identification rates by 2.2%, 4.88%, 28.91%, and 31.55% on the four databases, respectively.

Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition

Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

2019 · 2.0K citations · doi:10.1109/cvpr.2019.01230

In skeleton-based action recognition, graph convolutional networks (GCNs), which model the human body skeletons as spatiotemporal graphs, have achieved remarkable performance. However, in existing GCN-based methods, the topology of the graph is set manually, and it is fixed over all layers and input samples. This may not be optimal for the hierarchical GCN and diverse samples in action recognition tasks. In addition, the second-order information (the lengths and directions of bones) of the skeleton data, which is naturally more informative and discriminative for action recognition, is rarely investigated in existing methods. In this work, we propose a novel two-stream adaptive graph convolutional network (2s-AGCN) for skeleton-based action recognition. The topology of the graph in our model can be either uniformly or individually learned by the BP algorithm in an end-to-end manner. This data-driven method increases the flexibility of the model for graph construction and brings more generality to adapt to various data samples. Moreover, a two-stream framework is proposed to model both the first-order and the second-order information simultaneously, which shows notable improvement for the recognition accuracy. Extensive experiments on the two large-scale datasets, NTU-RGBD and Kinetics-Skeleton, demonstrate that the performance of our model exceeds the state-of-the-art with a significant margin.
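The "second-order information" mentioned in this abstract, the lengths and directions of bones, can be derived from joint coordinates by taking differences between connected joints. The sketch below illustrates that idea only; the 2D toy skeleton, the joint names, and the function name are hypothetical and do not come from the paper or its datasets.

```python
import math

# Hypothetical 2D skeleton: joint name -> (x, y). Not taken from any dataset.
joints = {"hip": (0.0, 0.0), "knee": (0.0, -3.0), "ankle": (4.0, -3.0)}
# Each bone is (source joint, target joint).
bones = [("hip", "knee"), ("knee", "ankle")]


def bone_features(joints, bones):
    """Return {bone: (vector, length, unit direction)} for each bone."""
    features = {}
    for source, target in bones:
        # First-order data are the joint positions; the bone vector is their difference.
        vector = tuple(t - s for s, t in zip(joints[source], joints[target]))
        length = math.hypot(*vector)
        direction = tuple(v / length for v in vector) if length else vector
        features[(source, target)] = (vector, length, direction)
    return features


features = bone_features(joints, bones)
print(features[("hip", "knee")][1])    # 3.0
print(features[("knee", "ankle")][2])  # (1.0, 0.0)
```

In a two-stream setup, the joint positions would feed one stream and bone vectors like these would feed the other.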

Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture

C. L. Philip Chen, Zhulin Liu

2017 · IEEE Transactions on Neural Networks and Learning Systems · 1.9K citations · doi:10.1109/tnnls.2017.2716952

A Broad Learning System (BLS) that aims to offer an alternative to learning in deep structures is proposed in this paper. Deep structure and learning suffer from a time-consuming training process because of a large number of connecting parameters in filters and layers. Moreover, a deep structure requires a complete retraining process if it is not sufficient to model the system. The BLS is established in the form of a flat network, where the original inputs are transferred and placed as "mapped features" in feature nodes and the structure is expanded in the wide sense in the "enhancement nodes." Incremental learning algorithms are developed for fast remodeling in broad expansion without a retraining process if the network needs to be expanded. Two incremental learning algorithms are given for both the increment of the feature nodes (or filters in deep structure) and the increment of the enhancement nodes. The designed model and algorithms are very versatile for selecting a model rapidly. In addition, another incremental learning algorithm is developed for the case where a system that has already been modeled encounters a new incoming input. Specifically, the system can be remodeled in an incremental way without retraining entirely from the beginning. Model reduction using singular value decomposition is applied to simplify the final structure, with satisfactory results. Compared with existing deep neural networks, experimental results on the Modified National Institute of Standards and Technology (MNIST) database and the NYU NORB object recognition benchmark dataset demonstrate the effectiveness of the proposed BLS.

Data-Driven Intelligent Transportation Systems: A Survey

Junping Zhang, Fei‐Yue Wang, Kunfeng Wang, Wei-Hua Lin +2 more

2011 · IEEE Transactions on Intelligent Transportation Systems · 1.8K citations · doi:10.1109/tits.2011.2158001

For the last two decades, intelligent transportation systems (ITS) have emerged as an efficient way of improving the performance of transportation systems, enhancing travel security, and providing more choices to travelers. A significant change in ITS in recent years is that much more data are collected from a variety of sources and can be processed into various forms for different stakeholders. The availability of a large amount of data can potentially lead to a revolution in ITS development, changing an ITS from a conventional technology-driven system into a more powerful multifunctional data-driven intelligent transportation system (D²ITS): a system that is vision, multisource, and learning algorithm driven to optimize its performance. Furthermore, D²ITS is trending to become a privacy-aware people-centric more intelligent system. In this paper, we provide a survey on the development of D²ITS, discussing the functionality of its key components and some deployment issues associated with D²ITS. Future research directions for the development of D²ITS are also presented.

GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild

Lianghua Huang, Xin Zhao, Kaiqi Huang

2019 · IEEE Transactions on Pattern Analysis and Machine Intelligence · 1.8K citations · doi:10.1109/tpami.2019.2957464

We introduce here a large tracking database that offers an unprecedentedly wide coverage of common moving objects in the wild, called GOT-10k. Specifically, GOT-10k is built upon the backbone of WordNet structure [1] and it populates the majority of over 560 classes of moving objects and 87 motion patterns, magnitudes wider than the most recent similar-scale counterparts [19], [20], [23], [26]. By releasing the large high-diversity database, we aim to provide a unified training and evaluation platform for the development of class-agnostic, generic purposed short-term trackers. The features of GOT-10k and the contributions of this article are summarized in the following. (1) GOT-10k offers over 10,000 video segments with more than 1.5 million manually labeled bounding boxes, enabling unified training and stable evaluation of deep trackers. (2) GOT-10k is by far the first video trajectory dataset that uses the semantic hierarchy of WordNet to guide class population, which ensures a comprehensive and relatively unbiased coverage of diverse moving objects. (3) For the first time, GOT-10k introduces the one-shot protocol for tracker evaluation, where the training and test classes are zero-overlapped. The protocol avoids biased evaluation results towards familiar objects and it promotes generalization in tracker development. (4) GOT-10k offers additional labels such as motion classes and object visible ratios, facilitating the development of motion-aware and occlusion-aware trackers. (5) We conduct extensive tracking experiments with 39 typical tracking algorithms and their variants on GOT-10k and analyze their results in this paper. (6) Finally, we develop a comprehensive platform for the tracking community that offers full-featured evaluation toolkits, an online evaluation server, and a responsive leaderboard. The annotations of GOT-10k's test data are kept private to avoid tuning parameters on it.

A Brief Overview of ChatGPT: The History, Status Quo and Potential Future Development

Tianyu Wu, Shizhu He, Jingping Liu, Siqi Sun +3 more

2023 · IEEE/CAA Journal of Automatica Sinica · 1.4K citations · doi:10.1109/jas.2023.123618

ChatGPT, an artificial intelligence generated content (AIGC) model developed by OpenAI, has attracted world-wide attention for its capability of dealing with challenging language understanding and generation tasks in the form of conversations. This paper briefly provides an overview of the history, status quo and potential future development of ChatGPT, helping to provide an entry point to think about ChatGPT. Specifically, from the limited open-accessed resources, we summarize the core techniques of ChatGPT, mainly including large-scale language models, in-context learning, reinforcement learning from human feedback and the key technical steps for developing ChatGPT. We further analyze the pros and cons of ChatGPT and we rethink the duality of ChatGPT in various fields. Although it has been widely acknowledged that ChatGPT brings plenty of opportunities for various fields, mankind should still treat and use ChatGPT properly to avoid potential threats, e.g., to academic integrity and safety. Finally, we discuss several open problems as the potential development of ChatGPT.

Efficient Image Dehazing with Boundary Constraint and Contextual Regularization

Gaofeng Meng, Ying Wang, Jiangyong Duan, Shiming Xiang +1 more

2013 · 1.2K citations · doi:10.1109/iccv.2013.82

Images captured in foggy weather conditions often suffer from bad visibility. In this paper, we propose an efficient regularization method to remove hazes from a single input image. Our method benefits much from an exploration on the inherent boundary constraint on the transmission function. This constraint, combined with a weighted L_1-norm based contextual regularization, is modeled into an optimization problem to estimate the unknown scene transmission. A quite efficient algorithm based on variable splitting is also presented to solve the problem. The proposed method requires only a few general assumptions and can restore a high-quality haze-free image with faithful colors and fine image details. Experimental results on a variety of haze images demonstrate the effectiveness and efficiency of the proposed method.

Disrupted small-world networks in schizophrenia

Yong Liu, Meng Liang, Yuan Zhou, Yong He +4 more

2008 · Brain · 1.1K citations · doi:10.1093/brain/awn018

The human brain has been described as a large, sparse, complex network characterized by efficient small-world properties, which assure that the brain generates and integrates information with high efficiency. Many previous neuroimaging studies have provided consistent evidence of 'dysfunctional connectivity' among the brain regions in schizophrenia; however, little is known about whether or not this dysfunctional connectivity causes disruption of the topological properties of brain functional networks. To this end, we investigated the topological properties of human brain functional networks derived from resting-state functional magnetic resonance imaging (fMRI). Data was obtained from 31 schizophrenia patients and 31 healthy subjects; then functional connectivity between 90 cortical and sub-cortical regions was estimated by partial correlation analysis and thresholded to construct a set of undirected graphs. Our findings demonstrated that the brain functional networks had efficient small-world properties in the healthy subjects; whereas these properties were disrupted in the patients with schizophrenia. Brain functional networks have efficient small-world properties which support efficient parallel information transfer at a relatively low cost. More importantly, in patients with schizophrenia the small-world topological properties are significantly altered in many brain regions in the prefrontal, parietal and temporal lobes. These findings are consistent with a hypothesis of dysfunctional integration of the brain in this illness. Specifically, we found that these altered topological measurements correlate with illness duration in schizophrenia. Detection and estimation of these alterations could prove helpful for understanding the pathophysiological mechanism as well as for evaluation of the severity of schizophrenia.
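The pipeline this abstract outlines, thresholding a connectivity matrix into an undirected graph and then measuring its topology, can be illustrated with the two quantities usually used to characterize small-world networks: the clustering coefficient and the characteristic path length. The sketch below is a toy illustration under stated assumptions: the 4-node matrix is invented, the absolute-value threshold is an assumption, and plain correlation values stand in for the partial correlations used in the study.

```python
from collections import deque
from itertools import combinations


def threshold_graph(corr, threshold):
    """Undirected adjacency sets from a symmetric connectivity matrix."""
    n = len(corr)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if abs(corr[i][j]) >= threshold:  # assumption: threshold on magnitude
            adj[i].add(j)
            adj[j].add(i)
    return adj


def clustering_coefficient(adj):
    """Mean fraction of each node's neighbor pairs that are themselves linked."""
    values = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            values.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        values.append(2.0 * links / (k * (k - 1)))
    return sum(values) / len(values)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


# Hypothetical 4-region connectivity matrix (not from the study).
corr = [
    [1.0, 0.8, 0.7, 0.1],
    [0.8, 1.0, 0.6, 0.2],
    [0.7, 0.6, 1.0, 0.9],
    [0.1, 0.2, 0.9, 1.0],
]
adj = threshold_graph(corr, threshold=0.5)
print(round(clustering_coefficient(adj), 3))      # 0.583
print(round(characteristic_path_length(adj), 3))  # 1.333
```

A network is typically called small-world when its clustering is much higher than that of a comparable random graph while its path length stays similarly short; the comparison against random graphs is omitted here.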

An SVD-based watermarking scheme for protecting rightful ownership

Ruizhen Liu, Tieniu Tan

2002 · IEEE Transactions on Multimedia · 990 citations · doi:10.1109/6046.985560

Digital watermarking has been proposed as a solution to the problem of copyright protection of multimedia documents in networked environments. There are two important issues that watermarking algorithms need to address. First, watermarking schemes are required to provide trustworthy evidence for protecting rightful ownership. Second, good watermarking schemes should satisfy the requirement of robustness and resist distortions due to common image manipulations (such as filtering, compression, etc.). In this paper, we propose a novel watermarking algorithm based on singular value decomposition (SVD). Analysis and experimental results show that the new watermarking method performs well in both security and robustness.
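The abstract does not spell out the embedding steps, so the following is only a sketch of one common way to hide a watermark in the singular values of an image, not a confirmed reproduction of the paper's algorithm. The function names, the scale factor `alpha`, and the choice of side information are assumptions; the only library call used is NumPy's `numpy.linalg.svd`.

```python
import numpy as np


def embed(image, watermark, alpha=0.1):
    """Embed `watermark` into the singular values of `image` (both n x n)."""
    u, s, vt = np.linalg.svd(image)
    # Perturb the diagonal singular-value matrix and decompose the result again.
    uw, sw, vwt = np.linalg.svd(np.diag(s) + alpha * watermark)
    marked = u @ np.diag(sw) @ vt
    key = (uw, vwt, s)  # side information kept by the owner (assumption)
    return marked, key


def extract(marked, key, alpha=0.1):
    """Recover the watermark from a marked (possibly distorted) image."""
    uw, vwt, s = key
    sw_star = np.linalg.svd(marked, compute_uv=False)
    return (uw @ np.diag(sw_star) @ vwt - np.diag(s)) / alpha


rng = np.random.default_rng(0)
image = rng.random((8, 8))      # toy stand-in for an image
watermark = rng.random((8, 8))  # toy stand-in for a watermark
marked, key = embed(image, watermark)
recovered = extract(marked, key)
print(np.allclose(recovered, watermark))  # True
```

Embedding in singular values tends to change the image only slightly for small `alpha`, which is one reason SVD-domain schemes are attractive for robustness; the trade-off between visibility and robustness depends on that scale factor.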

The Applications of Radiomics in Precision Diagnosis and Treatment of Oncology: Opportunities and Challenges

Zhenyu Liu, Shuo Wang, Di Dong, Jingwei Wei +4 more

2019 · Theranostics · 987 citations · doi:10.7150/thno.30309

Medical imaging can assess the tumor and its environment in their entirety, which makes it suitable for monitoring the temporal and spatial characteristics of the tumor. Progress in computational methods, especially in artificial intelligence for medical image processing and analysis, has converted these images into quantitative and minable data associated with clinical events in oncology management. This concept was first described as radiomics in 2012. Since then, computer scientists, radiologists, and oncologists have gravitated towards this new tool and exploited advanced methodologies to mine the information behind medical images. On the basis of a great quantity of radiographic images and novel computational technologies, researchers developed and validated radiomic models that may improve the accuracy of diagnoses and therapy response assessments. Here, we review the recent methodological developments in radiomics, including data acquisition, tumor segmentation, feature extraction, and modelling, as well as the rapidly developing deep learning technology. Moreover, we outline the main applications of radiomics in diagnosis, treatment planning and evaluations in the field of oncology with the aim of developing quantitative and personalized medicine. Finally, we discuss the challenges in the field of radiomics and the scope and clinical applicability of these methods.

Personal identification based on iris texture analysis

Li Ma, Tieniu Tan, Yunhong Wang, Dexin Zhang

2003 · IEEE Transactions on Pattern Analysis and Machine Intelligence · 956 citations · doi:10.1109/tpami.2003.1251145

With an increasing emphasis on security, automated personal identification based on biometrics has been receiving extensive attention over the past decade. Iris recognition, as an emerging biometric recognition approach, is becoming a very active topic in both research and practical applications. In general, a typical iris recognition system includes iris imaging, iris liveness detection, and recognition. This paper focuses on the last issue and describes a new scheme for iris recognition from an image sequence. We first assess the quality of each image in the input sequence and select a clear iris image from such a sequence for subsequent recognition. A bank of spatial filters, whose kernels are suitable for iris recognition, is then used to capture local characteristics of the iris so as to produce discriminating texture features. Experimental results show that the proposed method has an encouraging performance. In particular, a comparative study of existing methods for iris recognition is conducted on an iris image database including 2,255 sequences from 213 subjects. Conclusions based on such a comparison using a nonparametric statistical method (the bootstrap) provide useful information for further research.

Deep Metric Learning for Person Re-identification

Yi Dong, Zhen Lei, Shengcai Liao, Stan Z. Li

2014 · 952 citations · doi:10.1109/icpr.2014.16

Various hand-crafted features and metric learning methods prevail in the field of person re-identification. Compared to these methods, this paper proposes a more general way that can learn a similarity metric from image pixels directly. By using a "siamese" deep neural network, the proposed method can jointly learn the color feature, texture feature and metric in a unified framework. The network has a symmetric structure with two sub-networks which are connected by a cosine layer. Each sub-network includes two convolutional layers and a fully connected layer. To deal with the big variations of person images, binomial deviance is used to evaluate the cost between similarities and labels, which is shown to be robust to outliers. Experiments on VIPeR illustrate the superior performance of our method and a cross database experiment also shows its good generalization.

Skeleton-Based Action Recognition With Directed Graph Neural Networks

Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

2019 · 933 citations · doi:10.1109/cvpr.2019.00810

The skeleton data have been widely used for the action recognition tasks since they can robustly accommodate dynamic circumstances and complex backgrounds. In existing methods, both the joint and bone information in skeleton data have been proved to be of great help for action recognition tasks. However, how to incorporate these two types of data to best take advantage of the relationship between joints and bones remains a problem to be solved. In this work, we represent the skeleton data as a directed acyclic graph based on the kinematic dependency between the joints and bones in the natural human body. A novel directed graph neural network is designed specially to extract the information of joints, bones and their relations and make prediction based on the extracted features. In addition, to better fit the action recognition task, the topological structure of the graph is made adaptive based on the training process, which brings notable improvement. Moreover, the motion information of the skeleton sequence is exploited and combined with the spatial information to further enhance the performance in a two-stream framework. Our final model is tested on two large-scale datasets, NTU-RGBD and Skeleton-Kinetics, and exceeds state-of-the-art performance on both of them.

Parallel Control and Management for Intelligent Transportation Systems: Concepts, Architectures, and Applications

Fei‐Yue Wang

2010 · IEEE Transactions on Intelligent Transportation Systems · 857 citations · doi:10.1109/tits.2010.2060218

Parallel control and management have been proposed as a new mechanism for conducting operations of complex systems, especially those that involve complexity issues of both engineering and social dimensions, such as transportation systems. This paper presents an overview of the background, concepts, basic methods, major issues, and current applications of Parallel transportation Management Systems (PtMS). In essence, parallel control and management is a data-driven approach for modeling, analysis, and decision-making that considers both the engineering and social complexity in its processes. The developments and applications described here clearly indicate that PtMS is effective for use in networked complex traffic systems and is closely related to emerging technologies in cloud computing, social computing, and cyber-physical-social systems. A description of PtMS system architectures, processes, and components, including OTSt, Dyna CAS, aDAPTS, iTOP, and TransWorld is presented and discussed. Finally, the experiments and examples of real-world applications are illustrated and analyzed.

Removal of Artifacts from EEG Signals: A Review

Jiang Xiao, Gui‐Bin Bian, Zean Tian

2019 · Sensors · 806 citations · doi:10.3390/s19050987

Electroencephalogram (EEG) plays an important role in identifying brain activity and behavior. However, the recorded electrical activity is always contaminated with artifacts, which then affect the analysis of the EEG signal. Hence, it is essential to develop methods to effectively detect and extract the clean EEG data during encephalogram recordings. Several methods have been proposed to remove artifacts, but the research on artifact removal continues to be an open problem. This paper reviews current artifact removal methods for various contaminations. We first discuss the characteristics of EEG data and the types of different artifacts. Then, a general overview of the state-of-the-art methods and their detailed analysis are presented. Lastly, a comparative analysis is provided for choosing a suitable method for a particular application.

CASIA Image Tampering Detection Evaluation Database

Jing Dong, Wei Wang, Tieniu Tan

2013 · 757 citations · doi:10.1109/chinasip.2013.6625374

Image forensics has become a concern for the justice system as a growing number of cases of tampered images being misused in newspapers and as court evidence have been reported recently. With the goal of verifying image content authenticity, passive-blind image tampering detection is called for. More realistic open benchmark databases are also needed to assist the techniques. We recently collected a natural color image database with realistic tampering operations. The database is made publicly available for researchers to compare and evaluate their proposed tampering detection techniques. We call this database the CASIA Image Tampering Detection Evaluation Database. We describe the purpose, the design criterion, the organization and self-evaluation of this database in this paper.

Finite-Time Attitude Tracking Control of Spacecraft With Application to Attitude Synchronization

Haibo Du, Shihua Li, Chunjiang Qian

2011 · IEEE Transactions on Automatic Control · 751 citations · doi:10.1109/tac.2011.2159419

This note investigates the finite-time attitude control problems for a single spacecraft and multiple spacecraft. First of all, a finite-time controller is designed to solve finite-time attitude tracking problem for a single spacecraft. Rigorous proof shows that the desired attitude can be tracked in finite time in the absence of disturbances. In the presence of disturbances, the tracking errors can reach a region around the origin in finite time. Then, based on the neighbor rule, a distributed finite-time attitude control law is proposed for a group of spacecraft with a leader-follower architecture. Under the finite-time control law, the attitude synchronization can be achieved in finite time.
