NobleBlocks

Advanced Digital Sciences Center

Facility · Singapore, Singapore

Research output, citation impact, and the most-cited recent papers from Advanced Digital Sciences Center (Singapore). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works: 1.3K
Citations: 86.1K
h-index: 137
i10-index: 1.1K
Also known as: Advanced Digital Sciences Center

Top-cited papers from Advanced Digital Sciences Center

LIME: Low-Light Image Enhancement via Illumination Map Estimation
Xiaojie Guo, Yu Li, Haibin Ling
2016 · IEEE Transactions on Image Processing · 2.8K citations · doi:10.1109/tip.2016.2639450

When one captures images in low-light conditions, the images often suffer from low visibility. Besides degrading the visual aesthetics of images, this poor quality may also significantly degrade the performance of many computer vision and multimedia algorithms that are primarily designed for high-quality inputs. In this paper, we propose a simple yet effective low-light image enhancement (LIME) method. More concretely, the illumination of each pixel is first estimated individually by finding the maximum value in the R, G, and B channels. Furthermore, we refine the initial illumination map by imposing a structure prior on it, which yields the final illumination map. Given the well-constructed illumination map, the enhancement can be achieved accordingly. Experiments on a number of challenging low-light images are presented to reveal the efficacy of our LIME and show its superiority over several state-of-the-art methods in terms of enhancement quality and efficiency.
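The first step of the abstract above, estimating each pixel's illumination as the maximum of its R, G, and B values, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the structure-prior refinement is skipped entirely, and the `enhance` step (dividing each channel by the illumination, Retinex-style) is an assumption about what "achieved accordingly" means rather than something the abstract states.

```python
def initial_illumination(image):
    """Per-pixel maximum over the R, G, and B channels.

    `image` is a list of rows; each pixel is an (r, g, b) tuple in [0, 1].
    """
    return [[max(pixel) for pixel in row] for row in image]


def enhance(image, illumination, floor=1e-3):
    """Divide each channel by the illumination (assumed Retinex-style step).

    `floor` guards against division by zero and is a hypothetical parameter.
    """
    out = []
    for row, t_row in zip(image, illumination):
        out.append([
            tuple(min(1.0, c / max(t, floor)) for c in pixel)
            for pixel, t in zip(row, t_row)
        ])
    return out


dark = [[(0.10, 0.20, 0.05), (0.02, 0.01, 0.04)]]
t_map = initial_illumination(dark)   # unrefined illumination map
bright = enhance(dark, t_map)        # brightest channel of each pixel maps to 1.0
```

Because the unrefined map is used directly, every pixel's brightest channel is pushed to full intensity; the refinement described in the abstract presumably exists to avoid exactly that kind of over-enhancement.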

A Comprehensive Survey of Graph Embedding: Problems, Techniques, and Applications
Hongyun Cai, Vincent W. Zheng, Kevin Chen–Chuan Chang
2018 · IEEE Transactions on Knowledge and Data Engineering · 2.0K citations · doi:10.1109/tkde.2018.2807452

Graphs are an important data representation that appears in a wide variety of real-world scenarios. Effective graph analytics gives users a deeper understanding of what is behind the data, and thus can benefit many useful applications such as node classification, node recommendation, and link prediction. However, most graph analytics methods suffer from high computation and space costs. Graph embedding is an effective yet efficient way to solve the graph analytics problem. It converts the graph data into a low-dimensional space in which the graph structural information and graph properties are maximally preserved. In this survey, we conduct a comprehensive review of the literature in graph embedding. We first introduce the formal definition of graph embedding as well as the related concepts. After that, we propose two taxonomies of graph embedding which correspond to what challenges exist in different graph embedding problem settings and how the existing work addresses these challenges in their solutions. Finally, we summarize the applications that graph embedding enables and suggest four promising future research directions in terms of computation efficiency, problem settings, techniques, and application scenarios.

PCANet: A Simple Deep Learning Baseline for Image Classification?
Tsung-Han Chan, Kui Jia, Shenghua Gao, Jiwen Lu +2 more
2015 · IEEE Transactions on Image Processing · 1.3K citations · doi:10.1109/tip.2015.2475625

In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, the PCA is employed to learn multistage filter banks. This is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus called the PCA network (PCANet) and can be extremely easily and efficiently designed and learned. For comparison and to provide a better understanding, we also introduce and study two simple variations of PCANet: 1) RandNet and 2) LDANet. They share the same topology as PCANet, but their cascaded filters are either randomly selected or learned from linear discriminant analysis. We have extensively tested these basic networks on many benchmark visual data sets for different tasks, including Labeled Faces in the Wild (LFW) for face verification; the MultiPIE, Extended Yale B, AR, Facial Recognition Technology (FERET) data sets for face recognition; and MNIST for hand-written digit recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state-of-the-art features either prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)]. Even more surprisingly, the model sets new records for many classification tasks on the Extended Yale B, AR, and FERET data sets and on MNIST variations. Additional experiments on other public data sets also demonstrate the potential of PCANet to serve as a simple but highly competitive baseline for texture classification and object recognition.
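The last two components named in the abstract above, binary hashing and blockwise histograms, can be illustrated with a small sketch. It assumes the filter responses have already been computed (the PCA filter learning is not shown), that hashing means thresholding each response at zero and packing the bits into one integer per pixel, and that blocks do not overlap. All function names are hypothetical.

```python
def binary_hash(responses):
    """Pack the signs of L filter responses into one integer per pixel.

    `responses` is a list of L same-sized 2D maps (lists of rows).
    The result holds codes in the range [0, 2**L - 1].
    """
    rows, cols = len(responses[0]), len(responses[0][0])
    return [
        [
            sum(1 << bit for bit, resp in enumerate(responses) if resp[i][j] > 0)
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def block_histograms(codes, num_filters, block):
    """Histogram the integer codes inside each non-overlapping block."""
    bins = 1 << num_filters
    histograms = []
    for top in range(0, len(codes) - block + 1, block):
        for left in range(0, len(codes[0]) - block + 1, block):
            hist = [0] * bins
            for i in range(top, top + block):
                for j in range(left, left + block):
                    hist[codes[i][j]] += 1
            histograms.append(hist)
    return histograms


responses = [
    [[1.0, -1.0], [1.0, -1.0]],   # response map of filter 0
    [[1.0, 1.0], [-1.0, -1.0]],   # response map of filter 1
]
codes = binary_hash(responses)
features = block_histograms(codes, num_filters=2, block=2)
```

Concatenating the per-block histograms would give a fixed-length feature vector for a downstream classifier; the classifier itself is outside the scope of this sketch.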

Rain Streak Removal Using Layer Priors
Yu Li, Robby T. Tan, Xiaojie Guo, Jiangbo Lu +1 more
2016 · 975 citations · doi:10.1109/cvpr.2016.299

This paper addresses the problem of rain streak removal from a single image. Rain streaks impair visibility of an image and introduce undesirable interference that can severely affect the performance of computer vision algorithms. Rain streak removal can be formulated as a layer decomposition problem, with a rain streak layer superimposed on a background layer containing the true scene content. Existing decomposition methods that address this problem employ either dictionary learning methods or impose a low rank structure on the appearance of the rain streaks. While these methods can improve the overall visibility, they tend to leave too many rain streaks in the background image or over-smooth the background image. In this paper, we propose an effective method that uses simple patch-based priors for both the background and rain layers. These priors are based on Gaussian mixture models and can accommodate multiple orientations and scales of the rain streaks. This simple approach removes rain streaks better than the existing methods qualitatively and quantitatively. We overview our method and demonstrate its effectiveness over prior work on a number of examples.

Modeling User Activity Preference by Leveraging User Spatial Temporal Characteristics in LBSNs
Dingqi Yang, Daqing Zhang, Vincent W. Zheng, Zhiyong Yu
2014 · IEEE Transactions on Systems Man and Cybernetics Systems · 806 citations · doi:10.1109/tsmc.2014.2327053

With the recent surge of location based social networks (LBSNs), activity data of millions of users has become attainable. This data contains not only spatial and temporal stamps of user activity, but also its semantic information. LBSNs can help to understand mobile users' spatial temporal activity preference (STAP), which can enable a wide range of ubiquitous applications, such as personalized context-aware location recommendation and group-oriented advertisement. However, modeling such user-specific STAP needs to tackle high-dimensional data, i.e., user-location-time-activity quadruples, which is complicated and usually suffers from a data sparsity problem. In order to address this problem, we propose a STAP model. It first models the spatial and temporal activity preference separately, and then uses a principled way to combine them for preference inference. In order to characterize the impact of spatial features on user activity preference, we propose the notion of personal functional region and related parameters to model and infer user spatial activity preference. In order to model the user temporal activity preference with sparse user activity data in LBSNs, we propose to exploit the temporal activity similarity among different users and apply nonnegative tensor factorization to collaboratively infer temporal activity preference. Finally, we put forward a context-aware fusion framework to combine the spatial and temporal activity preference models for preference inference. We evaluate our proposed approach on three real-world datasets collected from New York and Tokyo, and show that our STAP model consistently outperforms the baseline approaches in various settings.

Discriminative Deep Metric Learning for Face Verification in the Wild
Junlin Hu, Jiwen Lu, Yap‐Peng Tan
2014 · 691 citations · doi:10.1109/cvpr.2014.242

This paper presents a new discriminative deep metric learning (DDML) method for face verification in the wild. Different from existing metric learning-based face verification methods which aim to learn a Mahalanobis distance metric to maximize the inter-class variations and minimize the intra-class variations, simultaneously, the proposed DDML trains a deep neural network which learns a set of hierarchical nonlinear transformations to project face pairs into the same feature subspace, under which the distance of each positive face pair is less than a smaller threshold and that of each negative pair is higher than a larger threshold, respectively, so that discriminative information can be exploited in the deep network. Our method achieves very competitive face verification performance on the widely used LFW and YouTube Faces (YTF) datasets.
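The two-threshold criterion in the abstract above (positive pairs closer than a smaller threshold, negative pairs farther than a larger one) can be written as a hinge-style penalty on a pair of feature vectors. The sketch below is a simplified stand-in, not the paper's exact objective: it takes the projected features as given instead of training a network, and the names `low` and `high` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pair_margin_loss(feat_a, feat_b, same_person, low, high):
    """Penalty that is zero when the pair satisfies its threshold.

    Positive pairs should have distance below `low`;
    negative pairs should have distance above `high` (with low < high).
    """
    dist = squared_distance(feat_a, feat_b)
    if same_person:
        return max(0.0, dist - low)
    return max(0.0, high - dist)


ok_positive = pair_margin_loss([0.0, 0.0], [1.0, 0.0], True, low=2.0, high=4.0)
bad_positive = pair_margin_loss([0.0, 0.0], [3.0, 0.0], True, low=2.0, high=4.0)
bad_negative = pair_margin_loss([0.0, 0.0], [1.0, 0.0], False, low=2.0, high=4.0)
```

The gap between `low` and `high` acts as a margin: pairs whose distance falls inside it are penalized whichever label they carry.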

A data-driven approach to cleaning large face datasets
Hongwei Ng, Stefan Winkler
2014 · 669 citations · doi:10.1109/icip.2014.7025068

Large face datasets are important for advancing face recognition research, but they are tedious to build, because a lot of work has to go into cleaning the huge amount of raw data. To facilitate this task, we describe an approach to building face datasets that starts with detecting faces in images returned from searches for public figures on the Internet, followed by discarding those not belonging to each queried person. We formulate the problem of identifying the faces to be removed as a quadratic programming problem, which exploits the observations that faces of the same person should look similar, have the same gender, and normally appear at most once per image. Our results show that this method can reliably clean a large dataset, leading to a considerable reduction in the work needed to build it. Finally, we are releasing the FaceScrub dataset that was created using this approach. It consists of 141,130 faces of 695 public figures and can be obtained from http://vintage.winklerbros.net/facescrub.html.

Robust visual tracking via multi-task sparse learning
Tianzhu Zhang, Bernard Ghanem, Si Liu, Narendra Ahuja
2012 · 660 citations · doi:10.1109/cvpr.2012.6247908

In this paper, we formulate object tracking in a particle filter framework as a multi-task sparse learning problem, which we denote as Multi-Task Tracking (MTT). Since we model particles as linear combinations of dictionary templates that are updated dynamically, learning the representation of each particle is considered a single task in MTT. By employing popular sparsity-inducing ℓ_{p,q} mixed norms (p ∈ {2, ∞} and q = 1), we regularize the representation problem to enforce joint sparsity and learn the particle representations together. As compared to previous methods that handle particles independently, our results demonstrate that mining the interdependencies between particles improves tracking performance and overall computational complexity. Interestingly, we show that the popular L_1 tracker [15] is a special case of our MTT formulation (denoted as the L_{11} tracker) when p = q = 1. The learning problem can be efficiently solved using an Accelerated Proximal Gradient (APG) method that yields a sequence of closed form updates. As such, MTT is computationally attractive. We test our proposed approach on challenging sequences involving heavy occlusion, drastic illumination changes, and large pose variations. Experimental results show that MTT methods consistently outperform state-of-the-art trackers.
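To make the mixed norm in the abstract above concrete, the following sketch computes it for a small coefficient matrix. It assumes the common row-wise convention: take the p-norm of each row, then the q-norm of the resulting vector. Which matrix axis corresponds to templates and which to particles is not stated in the abstract, so the layout here is an assumption, and `mixed_norm` is a hypothetical helper.

```python
import math


def vector_norm(values, order):
    """Plain vector norm for a finite order or for infinity."""
    if order == math.inf:
        return max(abs(v) for v in values)
    return sum(abs(v) ** order for v in values) ** (1.0 / order)


def mixed_norm(matrix, p, q=1):
    """Norm of order q over the per-row norms of order p."""
    return vector_norm([vector_norm(row, p) for row in matrix], q)


coeffs = [[3.0, 4.0],
          [0.0, 0.0]]   # second row is entirely zero: a jointly unused row

norm_21 = mixed_norm(coeffs, p=2)          # 5.0
norm_inf1 = mixed_norm(coeffs, p=math.inf) # 4.0
norm_11 = mixed_norm(coeffs, p=1)          # 7.0, the plain sum of absolute values
```

With q = 1 the penalty grows with the number of rows that contain any nonzero entry, which is why it favors solutions where whole rows are zero across all columns at once.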

GMS: Grid-Based Motion Statistics for Fast, Ultra-Robust Feature Correspondence
Jia-Wang Bian, Wen-Yan Lin, Yasuyuki Matsushita, Sai-Kit Yeung +2 more
2017 · 605 citations · doi:10.1109/cvpr.2017.302

Incorporating smoothness constraints into feature matching is known to enable ultra-robust matching. However, such formulations are both complex and slow, making them unsuitable for video applications. This paper proposes GMS (Grid-based Motion Statistics), a simple means of encapsulating motion smoothness as the statistical likelihood of a certain number of matches in a region. GMS enables translation of high match numbers into high match quality. This provides a real-time, ultra-robust correspondence system. Evaluations on videos with low texture, blur, and wide baselines show that GMS consistently outperforms other real-time matchers and can achieve parity with more sophisticated, much slower techniques.
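The idea of scoring a match by how many other matches share its region can be illustrated with a heavily simplified grid filter. The sketch below assigns each match to a pair of grid cells (one per image), counts the matches per cell pair, and keeps only matches from well-supported pairs. The grid size, the fixed `min_support` threshold, and the function names are all hypothetical, and the sketch ignores refinements such as neighboring cells or adaptive thresholds.

```python
from collections import Counter


def cell_of(point, size, grid):
    """Index of the grid cell containing `point` in an image of `size`."""
    x, y = point
    width, height = size
    col = min(int(x * grid / width), grid - 1)
    row = min(int(y * grid / height), grid - 1)
    return row * grid + col


def filter_by_cell_support(matches, size1, size2, grid=4, min_support=3):
    """Keep matches whose (cell in image 1, cell in image 2) pair is popular."""
    pairs = [(cell_of(p, size1, grid), cell_of(q, size2, grid)) for p, q in matches]
    support = Counter(pairs)
    return [m for m, pair in zip(matches, pairs) if support[pair] >= min_support]


matches = [
    ((10, 10), (12, 11)),
    ((15, 12), (17, 14)),
    ((20, 18), (21, 20)),
    ((12, 14), (90, 95)),   # lands far from its neighbors in image 2
]
kept = filter_by_cell_support(matches, (100, 100), (100, 100))
```

Counting is a single pass over the matches, which is what makes this style of filtering cheap compared with explicit smoothness optimization.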

Deep hashing for compact binary codes learning
Venice Erin Liong, Jiwen Lu, Gang Wang, Pierre Moulin +1 more
2015 · 589 citations · doi:10.1109/cvpr.2015.7298862

In this paper, we propose a new deep hashing (DH) approach to learn compact binary codes for large scale visual search. Unlike most existing binary code learning methods, which seek a single linear projection to map each sample into a binary vector, we develop a deep neural network to seek multiple hierarchical non-linear transformations to learn these binary codes, so that the nonlinear relationship of samples can be well exploited. Our model is learned under three constraints at the top layer of the deep network: 1) the loss between the original real-valued feature descriptor and the learned binary vector is minimized, 2) the binary codes distribute evenly on each bit, and 3) different bits are as independent as possible. To further improve the discriminative power of the learned binary codes, we extend DH into supervised DH (SDH) by including one discriminative term in the objective function of DH which simultaneously maximizes the inter-class variations and minimizes the intra-class variations of the learned binary codes. Experimental results show the superiority of the proposed approach over state-of-the-art methods.

ASCERTAIN: Emotion and Personality Recognition Using Commercial Sensors
Ramanathan Subramanian, Julia Wache, Mojtaba Khomami Abadi, Radu L. Vieriu +2 more
2016 · IEEE Transactions on Affective Computing · 532 citations · doi:10.1109/taffc.2016.2625250

We present ASCERTAIN, a multimodal databaASe for impliCit pERsonaliTy and Affect recognitIoN using commercial physiological sensors. To our knowledge, ASCERTAIN is the first database to connect personality traits and emotional states via physiological responses. ASCERTAIN contains big-five personality scales and emotional self-ratings of 58 users along with their Electroencephalogram (EEG), Electrocardiogram (ECG), Galvanic Skin Response (GSR) and facial activity data, recorded using off-the-shelf sensors while viewing affective movie clips. We first examine relationships between users' affective ratings and personality scales in the context of prior observations, and then study linear and non-linear physiological correlates of emotion and personality. Our analysis suggests that the emotion-personality relationship is better captured by non-linear rather than linear statistics. We finally attempt binary emotion and personality trait recognition using physiological features. Experimental results cumulatively confirm that personality differences are better revealed while comparing user responses to emotionally homogeneous videos, and above-chance recognition is achieved for both affective and personality dimensions.

Crowded Scene Analysis: A Survey
Teng Li, Huan Chang, Meng Wang, Bingbing Ni +2 more
2014 · IEEE Transactions on Circuits and Systems for Video Technology · 487 citations · doi:10.1109/tcsvt.2014.2358029

Automated scene analysis has been a topic of great interest in computer vision and cognitive science. Recently, with the growth of crowd phenomena in the real world, crowded scene analysis has attracted much attention. However, the visual occlusions and ambiguities in crowded scenes, as well as the complex behaviors and scene semantics, make the analysis a challenging task. In the past few years, an increasing number of works on the crowded scene analysis have been reported, which covered different aspects including crowd motion pattern learning, crowd behavior and activity analyses, and anomaly detection in crowds. This paper surveys the state-of-the-art techniques on this topic. We first provide the background knowledge and the available features related to crowded scenes. Then, existing models, popular algorithms, evaluation protocols, and system performance are provided corresponding to different aspects of the crowded scene analysis. We also outline the available datasets for performance evaluation. Finally, some research problems and promising future directions are presented with discussions.

Neighborhood Repulsed Metric Learning for Kinship Verification
Jiwen Lu, Xiuzhuang Zhou, Yap-Peng Tan, Yuanyuan Shang +1 more
2013 · IEEE Transactions on Pattern Analysis and Machine Intelligence · 430 citations · doi:10.1109/tpami.2013.134

Kinship verification from facial images is an interesting and challenging problem in computer vision, and there have been very few attempts to tackle this problem in the literature. In this paper, we propose a new neighborhood repulsed metric learning (NRML) method for kinship verification. Motivated by the fact that interclass samples (without a kinship relation) with higher similarity usually lie in a neighborhood and are more easily misclassified than those with lower similarity, we aim to learn a distance metric under which the intraclass samples (with a kinship relation) are pulled as close as possible and interclass samples lying in a neighborhood are simultaneously repulsed and pushed as far away as possible, such that more discriminative information can be exploited for verification. To make better use of multiple feature descriptors to extract complementary information, we further propose a multiview NRML (MNRML) method to seek a common distance metric to perform multiple feature fusion to improve the kinship verification performance. Experimental results are presented to demonstrate the efficacy of our proposed methods. Finally, we also test human ability in kinship verification from facial images, and our experimental results show that the performance of our methods is comparable to that of human observers.

Learning Compact Binary Face Descriptor for Face Recognition
Jiwen Lu, Venice Erin Liong, Xiuzhuang Zhou, Jie Zhou
2015 · IEEE Transactions on Pattern Analysis and Machine Intelligence · 405 citations · doi:10.1109/tpami.2015.2408359

Binary feature descriptors such as local binary patterns (LBP) and its variations have been widely used in many face recognition systems due to their excellent robustness and strong discriminative power. However, most existing binary face descriptors are hand-crafted, which require strong prior knowledge to engineer them by hand. In this paper, we propose a compact binary face descriptor (CBFD) feature learning method for face representation and recognition. Given each face image, we first extract pixel difference vectors (PDVs) in local patches by computing the difference between each pixel and its neighboring pixels. Then, we learn a feature mapping to project these pixel difference vectors into low-dimensional binary vectors in an unsupervised manner, where 1) the variance of all binary codes in the training set is maximized, 2) the loss between the original real-valued codes and the learned binary codes is minimized, and 3) binary codes evenly distribute at each learned bin, so that the redundancy information in PDVs is removed and compact binary codes are obtained. Lastly, we cluster and pool these binary codes into a histogram feature as the final representation for each face image. Moreover, we propose a coupled CBFD (C-CBFD) method by reducing the modality gap of heterogeneous faces at the feature level to make our method applicable to heterogeneous face recognition. Extensive experimental results on five widely used face datasets show that our methods outperform state-of-the-art face descriptors.
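The first step in the abstract above, extracting a pixel difference vector (PDV) for each pixel, is simple enough to sketch directly. The example below assumes a 3x3 neighborhood, a clockwise neighbor ordering starting at the top-left, and the sign convention "neighbor minus center"; none of these choices is specified in the abstract, and the function name is hypothetical. The learned binary projection and the clustering step are not shown.

```python
# Clockwise 8-neighborhood starting at the top-left (assumed ordering).
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def pixel_difference_vector(image, row, col):
    """Differences between the 8 neighbors and the pixel at (row, col).

    `image` is a list of rows of grayscale values; (row, col) must not
    lie on the image border.
    """
    center = image[row][col]
    return [image[row + dr][col + dc] - center for dr, dc in NEIGHBOR_OFFSETS]


patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
pdv = pixel_difference_vector(patch, 1, 1)
```

Thresholding each entry of such a vector at zero would give a hand-crafted LBP-style code; the method in the abstract instead learns the mapping from these real-valued vectors to binary codes.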

Learning Community Embedding with Community Detection and Node Embedding on Graphs
Sandro Cavallari, Vincent W. Zheng, Hongyun Cai, Kevin Chen–Chuan Chang +1 more
2017 · 373 citations · doi:10.1145/3132847.3132925

In this paper, we study an important yet largely under-explored setting of graph embedding, i.e., embedding communities instead of individual nodes. We find that community embedding is not only useful for community-level applications such as graph visualization, but also beneficial to both community detection and node classification. To learn such embedding, our insight hinges upon a closed loop among community embedding, community detection and node embedding. On the one hand, node embedding can help improve community detection, which outputs good communities for fitting better community embedding. On the other hand, community embedding can be used to optimize the node embedding by introducing a community-aware high-order proximity. Guided by this insight, we propose a novel community embedding framework that jointly solves the three tasks. We evaluate such a framework on multiple real-world datasets, and show that it improves graph visualization and outperforms state-of-the-art baselines in various application tasks, e.g., community detection and node classification.

COVERAGE — A novel database for copy-move forgery detection
Bihan Wen, Ye Zhu, Ramanathan Subramanian, Tian-Tsong Ng +2 more
2016 · 369 citations · doi:10.1109/icip.2016.7532339

We present COVERAGE - a novel database containing copy-move forged images and their originals with similar but genuine objects. COVERAGE is designed to highlight and address tamper detection ambiguity of popular methods, caused by self-similarity within natural images. In COVERAGE, forged-original pairs are annotated with (i) the duplicated and forged region masks, and (ii) the tampering factor/similarity metric. For benchmarking, forgery quality is evaluated using (i) computer vision-based methods, and (ii) human detection performance. We also propose a novel sparsity-based metric for efficiently estimating forgery quality. Experimental results show that (a) popular forgery detection methods perform poorly over COVERAGE, and (b) the proposed sparsity based metric best correlates with human detection performance. We release the COVERAGE database to the research community.

Fast Global Image Smoothing Based on Weighted Least Squares
Dongbo Min, Sung‐Hwan Choi, Jiangbo Lu, Bumsub Ham +2 more
2014 · IEEE Transactions on Image Processing · 367 citations · doi:10.1109/tip.2014.2366600

This paper presents an efficient technique for performing spatially inhomogeneous edge-preserving image smoothing, called the fast global smoother. Focusing on sparse Laplacian matrices consisting of a data term and a prior term (typically defined using four or eight neighbors for a 2D image), our approach efficiently solves such global objective functions. In particular, we approximate the solution of the memory- and computation-intensive large linear system, defined over a d-dimensional spatial domain, by solving a sequence of 1D subsystems. Our separable implementation enables applying a linear-time tridiagonal matrix algorithm to solve d three-point Laplacian matrices iteratively. Our approach combines the best of two paradigms, i.e., efficient edge-preserving filters and optimization-based smoothing. Our method has a comparable runtime to the fast edge-preserving filters, but its global optimization formulation overcomes many limitations of the local filtering approaches. Our method also achieves results as high in quality as the state-of-the-art optimization-based techniques, but runs ∼10-30 times faster. Besides, considering the flexibility in defining an objective function, we further propose generalized fast algorithms that perform L_γ norm smoothing (0 < γ < 2) and support an aggregated (robust) data term for handling imprecise data constraints. We demonstrate the effectiveness and efficiency of our techniques in a range of image processing and computer graphics applications.
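The 1D building block mentioned in the abstract above, a three-point (tridiagonal) system solved in linear time, can be sketched as follows. The example smooths a single scanline by solving a weighted least squares system with the standard tridiagonal (Thomas) algorithm. The exponential edge weight, the parameter names `lam` and `sigma`, and the function names are assumptions made for illustration; the multi-pass, multi-dimensional scheme from the abstract is not reproduced.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored (must be 0)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):               # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):      # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def smooth_scanline(signal, guide, lam=5.0, sigma=1.0):
    """Edge-preserving 1D smoothing: weights shrink across guide edges."""
    n = len(signal)
    w = [math.exp(-abs(guide[i] - guide[i + 1]) / sigma) for i in range(n - 1)]
    lower = [0.0] + [-lam * w[i - 1] for i in range(1, n)]
    upper = [-lam * w[i] for i in range(n - 1)] + [0.0]
    diag = [1.0 - lower[i] - upper[i] for i in range(n)]
    return solve_tridiagonal(lower, diag, upper, signal)


flat = smooth_scanline([2.0, 2.0, 2.0, 2.0], [2.0, 2.0, 2.0, 2.0])
step_in = [0.0, 0.0, 10.0, 10.0]
step_out = smooth_scanline(step_in, step_in)
```

Each row of the system sums to one, so constant signals pass through unchanged, and the near-zero weight across the step keeps the two sides from bleeding into each other.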

DECAF: MEG-Based Multimodal Database for Decoding Affective Physiological Responses
Mojtaba Khomami Abadi, Ramanathan Subramanian, Seyed Mostafa Kia, Paolo Avesani +2 more
2015 · IEEE Transactions on Affective Computing · 362 citations · doi:10.1109/taffc.2015.2392932

In this work, we present DECAF, a multimodal data set for decoding user physiological responses to affective multimedia content. Different from data sets such as DEAP [15] and MAHNOB-HCI [31], DECAF contains (1) brain signals acquired using the Magnetoencephalogram (MEG) sensor, which requires little physical contact with the user's scalp and consequently facilitates naturalistic affective response, and (2) explicit and implicit emotional responses of 30 participants to 40 one-minute music video segments used in [15] and 36 movie clips, thereby enabling comparisons between the EEG versus MEG modalities as well as movie versus music stimuli for affect recognition. In addition to MEG data, DECAF comprises synchronously recorded near-infra-red (NIR) facial videos, horizontal Electrooculogram (hEOG), Electrocardiogram (ECG), and trapezius-Electromyogram (tEMG) peripheral physiological responses. To demonstrate DECAF's utility, we present (i) a detailed analysis of the correlations between participants' self-assessments and their physiological responses and (ii) single-trial classification results for valence, arousal and dominance, with performance evaluation against existing data sets. DECAF also contains time-continuous emotion annotations for movie clips from seven users, which we use to demonstrate dynamic emotion prediction.

Region-Based Saliency Detection and Its Application in Object Recognition
Zhixiang Ren, Shenghua Gao, Liang-Tien Chia, Ivor W. Tsang
2013 · IEEE Transactions on Circuits and Systems for Video Technology · 356 citations · doi:10.1109/tcsvt.2013.2280096

The objective of this paper is twofold. First, we introduce an effective region-based solution for saliency detection. Then, we apply the achieved saliency map to better encode the image features for solving object recognition task. To find the perceptually and semantically meaningful salient regions, we extract superpixels based on an adaptive mean shift algorithm as the basic elements for saliency detection. The saliency of each superpixel is measured by using its spatial compactness, which is calculated according to the results of Gaussian mixture model (GMM) clustering. To propagate saliency between similar clusters, we adopt a modified PageRank algorithm to refine the saliency map. Our method not only improves saliency detection through large salient region detection and noise tolerance in messy background, but also generates saliency maps with a well-defined object shape. Experimental results demonstrate the effectiveness of our method. Since the objects usually correspond to salient regions, and these regions usually play more important roles for object recognition than background, we apply our achieved saliency map for object recognition by incorporating a saliency map into sparse coding-based spatial pyramid matching (ScSPM) image representation. To learn a more discriminative codebook and better encode the features corresponding to the patches of the objects, we propose a weighted sparse coding for feature coding. Moreover, we also propose a saliency weighted max pooling to further emphasize the importance of those salient regions in feature pooling module. Experimental results on several datasets illustrate that our weighted ScSPM framework greatly outperforms ScSPM framework, and achieves excellent performance for object recognition.

Functional mechanism
Jun Zhang, Zhenjie Zhang, Xiaokui Xiao, Yin Yang +1 more
2012 · Proceedings of the VLDB Endowment · 349 citations · doi:10.14778/2350229.2350253

ε-differential privacy is the state-of-the-art model for releasing sensitive information while protecting privacy. Numerous methods have been proposed to enforce ε-differential privacy in various analytical tasks, e.g., regression analysis. Existing solutions for regression analysis, however, are either limited to non-standard types of regression or unable to produce accurate regression results. Motivated by this, we propose the Functional Mechanism, a differentially private method designed for a large class of optimization-based analyses. The main idea is to enforce ε-differential privacy by perturbing the objective function of the optimization problem, rather than its results. As case studies, we apply the functional mechanism to address the two most widely used regression models, namely, linear regression and logistic regression. Both theoretical analysis and thorough experimental evaluations show that the functional mechanism is highly effective and efficient, and it significantly outperforms existing solutions.
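The core idea in the abstract above, adding noise to the objective function instead of to the fitted result, can be illustrated for one-dimensional linear regression without an intercept. The squared-error objective is a quadratic in the slope w, so the sketch perturbs its two non-constant coefficients and then minimizes the noisy quadratic. Several things here are assumptions rather than statements from the abstract: the use of Laplace noise, the default `sensitivity` value of 8.0 (which presumes inputs and targets scaled into [-1, 1]), and the choice to return `None` when the noisy quadratic has no minimum. Treat it as a toy, not a vetted privacy implementation.

```python
import random


def laplace(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_slope(xs, ys, epsilon, sensitivity=8.0, rng=None):
    """Fit y ~ w * x by minimizing a noise-perturbed objective.

    sum((y - w*x)**2) = a*w**2 + b*w + const, with
    a = sum(x*x) and b = -2*sum(x*y).
    """
    rng = rng or random.Random()
    a = sum(x * x for x in xs)
    b = -2.0 * sum(x * y for x, y in zip(xs, ys))
    scale = sensitivity / epsilon          # assumed noise calibration
    a_noisy = a + laplace(scale, rng)
    b_noisy = b + laplace(scale, rng)
    if a_noisy <= 0:
        return None                        # noisy objective is unbounded below
    return -b_noisy / (2.0 * a_noisy)


xs = [0.1, 0.2, 0.4]
ys = [0.2, 0.4, 0.8]                       # exactly y = 2x
nearly_exact = private_slope(xs, ys, epsilon=1e9, rng=random.Random(0))
```

With a very large epsilon the noise is negligible and the result approaches the ordinary least squares slope; with a small epsilon the noise can dominate, including flipping the sign of the quadratic coefficient, which is the case the `None` return flags.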