Laboratoire Jean Kuntzmann

Facility · Grenoble, Auvergne-Rhône-Alpes, France

Research output, citation impact, and the most-cited recent papers from Laboratoire Jean Kuntzmann (France), aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works

3.7K

Citations

154.6K

h-index

164

i10-index

1.9K

Also known as

Laboratoire Jean Kuntzmann, UMR 5224, UMR5224

Top-cited papers from Laboratoire Jean Kuntzmann

Learning realistic human actions from movies

Ivan Laptev, Marcin Marszałek, Cordelia Schmid, Benjamin Rozenfeld

2008 · 3.5K citations · doi:10.1109/cvpr.2008.4587756

The aim of this paper is to address recognition of natural human actions in diverse and realistic video settings. This challenging but important subject has mostly been ignored in the past due to several problems one of which is the lack of realistic and annotated video datasets. Our first contribution is to address this limitation and to investigate the use of movie scripts for automatic annotation of human actions in videos. We evaluate alternative methods for action retrieval from scripts and show benefits of a text-based classifier. Using the retrieved action samples for visual learning, we next turn to the problem of action classification in video. We present a new method for video classification that builds upon and extends several recent ideas including local space-time features, space-time pyramids and multi-channel non-linear SVMs. The method is shown to improve state-of-the-art results on the standard KTH action dataset by achieving 91.8% accuracy. Given the inherent problem of noisy labels in automatic annotation, we particularly investigate and show high tolerance of our method to annotation errors in the training set. We finally apply the method to learning and classifying challenging action classes in movies and show promising results.

Enhanced Local Texture Feature Sets for Face Recognition Under Difficult Lighting Conditions

Xiaoyang Tan, Bill Triggs

2010 · IEEE Transactions on Image Processing · 2.8K citations · doi:10.1109/tip.2010.2042645

Making recognition more reliable under uncontrolled lighting conditions is one of the most important challenges for practical face recognition systems. We tackle this by combining the strengths of robust illumination normalization, local texture-based face representations, distance transform based matching, kernel-based feature extraction and multiple feature fusion. Specifically, we make three main contributions: 1) we present a simple and efficient preprocessing chain that eliminates most of the effects of changing illumination while still preserving the essential appearance details that are needed for recognition; 2) we introduce local ternary patterns (LTP), a generalization of the local binary pattern (LBP) local texture descriptor that is more discriminant and less sensitive to noise in uniform regions, and we show that replacing comparisons based on local spatial histograms with a distance transform based similarity metric further improves the performance of LBP/LTP based face recognition; and 3) we further improve robustness by adding Kernel principal component analysis (PCA) feature extraction and incorporating rich local appearance cues from two complementary sources--Gabor wavelets and LBP--showing that the combination is considerably more accurate than either feature set alone. The resulting method provides state-of-the-art performance on three data sets that are widely used for testing recognition under difficult illumination conditions: Extended Yale-B, CAS-PEAL-R1, and Face Recognition Grand Challenge version 2 experiment 4 (FRGC-204). For example, on the challenging FRGC-204 data set it halves the error rate relative to previously published methods, achieving a face verification rate of 88.1% at 0.1% false accept rate. Further experiments show that our preprocessing method outperforms several existing preprocessors for a range of feature sets, data sets and lighting conditions.
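The local ternary pattern idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual three-level coding: a neighbour is coded +1 if it is at least `t` above the centre pixel, -1 if it is at least `t` below, and 0 otherwise, with the ternary code then split into two binary maps. The function name `ltp_codes`, the default threshold `t=5`, and the neighbour ordering are illustrative assumptions, not the authors' implementation.

```python
def ltp_codes(img, t=5):
    """Return (upper, lower) 8-bit code maps for the interior pixels of img.

    img is a 2D list of integer grey levels. The threshold t and the
    clockwise bit order are assumptions made for this sketch.
    """
    # 8 neighbours, clockwise starting from the top-left pixel
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = len(img), len(img[0])
    upper = [[0] * (w - 2) for _ in range(h - 2)]
    lower = [[0] * (w - 2) for _ in range(h - 2)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = img[y][x]
            up = lo = 0
            for bit, (dy, dx) in enumerate(offsets):
                p = img[y + dy][x + dx]
                if p >= c + t:        # ternary value +1
                    up |= 1 << bit
                elif p <= c - t:      # ternary value -1
                    lo |= 1 << bit
                # otherwise |p - c| < t: ternary value 0, no bit set
            upper[y - 1][x - 1] = up
            lower[y - 1][x - 1] = lo
    return upper, lower


# A nearly uniform patch: small noise stays inside the tolerance band,
# so both code maps are zero.
flat = [[10, 12, 10], [9, 10, 11], [10, 10, 8]]
print(ltp_codes(flat))
```

With `t=0` the upper map reduces to an ordinary LBP code, which is one way to see why the tolerance band makes the descriptor less sensitive to noise in uniform regions.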

Action recognition by dense trajectories

Heng Wang, Alexander Kläser, Cordelia Schmid, Cheng‐Lin Liu

2011 · 2.2K citations · doi:10.1109/cvpr.2011.5995407

Feature trajectories have shown to be efficient for representing videos. Typically, they are extracted using the KLT tracker or matching SIFT descriptors between frames. However, the quality as well as quantity of these trajectories is often not sufficient. Inspired by the recent success of dense sampling in image classification, we propose an approach to describe videos by dense trajectories. We sample dense points from each frame and track them based on displacement information from a dense optical flow field. Given a state-of-the-art optical flow algorithm, our trajectories are robust to fast irregular motions as well as shot boundaries. Additionally, dense trajectories cover the motion information in videos well. We, also, investigate how to design descriptors to encode the trajectory information. We introduce a novel descriptor based on motion boundary histograms, which is robust to camera motion. This descriptor consistently outperforms other state-of-the-art descriptors, in particular in uncontrolled realistic videos. We evaluate our video description in the context of action classification with a bag-of-features approach. Experimental results show a significant improvement over the state of the art on four datasets of varying difficulty, i.e. KTH, YouTube, Hollywood2 and UCF sports.

Dynamic Triggering Mechanisms for Event-Triggered Control

Antoine Girard

2014 · IEEE Transactions on Automatic Control · 1.9K citations · doi:10.1109/tac.2014.2366855

In this technical note, we present a new class of event triggering mechanisms for event-triggered control systems. This class is characterized by the introduction of an internal dynamic variable, which motivates the proposed name of dynamic event triggering mechanism. The stability of the resulting closed-loop system is proved and the influence of design parameters on the decay rate of the Lyapunov function is discussed. For linear systems, we establish a lower bound on the inter-execution time as a function of the parameters. The influence of these parameters on a quadratic integral performance index is also studied. Some simulation results are provided for illustration of the theoretical claims.

Evaluation of CMIP6 DECK Experiments With CNRM‐CM6‐1

Aurore Voldoire, David Saint‐Martin, Stephane Sénési, Bertrand Decharme +4 more

2019 · Journal of Advances in Modeling Earth Systems · 1.2K citations · doi:10.1029/2019ms001683

This paper describes the main characteristics of CNRM‐CM6‐1, the fully coupled atmosphere‐ocean general circulation model of sixth generation jointly developed by Centre National de Recherches Météorologiques (CNRM) and Cerfacs for the sixth phase of the Coupled Model Intercomparison Project 6 (CMIP6). The paper provides a description of each component of CNRM‐CM6‐1, including the coupling method and the new online output software. We emphasize where model's components have been updated with respect to the former model version, CNRM‐CM5.1. In particular, we highlight major improvements in the representation of atmospheric and land processes. A particular attention has also been devoted to mass and energy conservation in the simulated climate system to limit long‐term drifts. The climate simulated by CNRM‐CM6‐1 is then evaluated using CMIP6 historical and Diagnostic, Evaluation and Characterization of Klima (DECK) experiments in comparison with CMIP5 CNRM‐CM5.1 equivalent experiments. Overall, the mean surface biases are of similar magnitude but with different spatial patterns. Deep ocean biases are generally reduced, whereas sea ice is too thin in the Arctic. Although the simulated climate variability remains roughly consistent with CNRM‐CM5.1, its sensitivity to rising CO2 has increased: the equilibrium climate sensitivity is 4.9 K, which is now close to the upper bound of the range estimated from CMIP5 models.

DeepFlow: Large Displacement Optical Flow with Deep Matching

Philippe Weinzaepfel, Jérôme Revaud, Zaïd Harchaoui, Cordelia Schmid

2013 · 995 citations · doi:10.1109/iccv.2013.175

Optical flow computation is a key component in many computer vision systems designed for tasks such as action detection or activity recognition. However, despite several major advances over the last decade, handling large displacement in optical flow remains an open problem. Inspired by the large displacement optical flow of Brox and Malik, our approach, termed Deep Flow, blends a matching algorithm with a variational approach for optical flow. We propose a descriptor matching algorithm, tailored to the optical flow problem, that allows to boost performance on fast motions. The matching algorithm builds upon a multi-stage architecture with 6 layers, interleaving convolutions and max-pooling, a construction akin to deep convolutional nets. Using dense sampling, it allows to efficiently retrieve quasi-dense correspondences, and enjoys a built-in smoothing effect on descriptors matches, a valuable asset for integration into an energy minimization framework for optical flow estimation. Deep Flow efficiently handles large displacements occurring in realistic videos, and shows competitive performance on optical flow benchmarks. Furthermore, it sets a new state-of-the-art on the MPI-Sintel dataset.

Learning from Synthetic Humans

Gül Varol, Javier Romero, Xavier Martín, Naureen Mahmood +3 more

2017 · 953 citations · doi:10.1109/cvpr.2017.492

Estimating human pose, shape, and motion from images and videos are fundamental challenges with many applications. Recent advances in 2D human pose estimation use large amounts of manually-labeled training data for learning convolutional neural networks (CNNs). Such data is time consuming to acquire and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion is impractical. In this work we present SURREAL (Synthetic hUmans foR REAL tasks): a new large-scale dataset with synthetically-generated but realistic images of people rendered from 3D sequences of human motion capture data. We generate more than 6 million frames together with ground truth pose, depth maps, and segmentation masks. We show that CNNs trained on our synthetic dataset allow for accurate human depth estimation and human part segmentation in real RGB images. Our results and the new dataset open up new possibilities for advancing person analysis using cheap and large-scale synthetic data.

Long-Term Temporal Convolutions for Action Recognition

Gül Varol, Ivan Laptev, Cordelia Schmid

2017 · IEEE Transactions on Pattern Analysis and Machine Intelligence · 952 citations · doi:10.1109/tpami.2017.2712608

Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames failing to model actions at their full temporal extent. In this work we learn video representations using neural networks with long-term temporal convolutions (LTC). We demonstrate that LTC-CNN models with increased temporal extents improve the accuracy of action recognition. We also study the impact of different low-level representations, such as raw values of video pixels and optical flow vector fields and demonstrate the importance of high-quality optical flow estimation for learning accurate action models. We report state-of-the-art results on two challenging benchmarks for human action recognition UCF101 (92.7%) and HMDB51 (67.2%).

Is that you? Metric learning approaches for face identification

Matthieu Guillaumin, Jakob Verbeek, Cordelia Schmid

2009 · 761 citations · doi:10.1109/iccv.2009.5459197

Face identification is the problem of determining whether two face images depict the same person or not. This is difficult due to variations in scale, pose, lighting, background, expression, hairstyle, and glasses. In this paper we present two methods for learning robust distance measures: (a) a logistic discriminant approach which learns the metric from a set of labelled image pairs (LDML) and (b) a nearest neighbour approach which computes the probability for two images to belong to the same class (MkNN). We evaluate our approaches on the Labeled Faces in the Wild data set, a large and very challenging data set of faces from Yahoo! News. The evaluation protocol for this data set defines a restricted setting, where a fixed set of positive and negative image pairs is given, as well as an unrestricted one, where faces are labelled by their identity. We are the first to present results for the unrestricted setting, and show that our methods benefit from this richer training data, much more so than the current state-of-the-art method. Our results of 79.3% and 87.5% correct for the restricted and unrestricted setting respectively, significantly improve over the current state-of-the-art result of 78.5%. Confidence scores obtained for face identification can be used for many applications e.g. clustering or recognition from a single training example. We show that our learned metrics also improve performance for these tasks.

EpicFlow: Edge-preserving interpolation of correspondences for optical flow

Jérôme Revaud, Philippe Weinzaepfel, Zaïd Harchaoui, Cordelia Schmid

2015 · 757 citations · doi:10.1109/cvpr.2015.7298720

We propose a novel approach for optical flow estimation, targeted at large displacements with significant occlusions. It consists of two steps: i) dense matching by edge-preserving interpolation from a sparse set of matches; ii) variational energy minimization initialized with the dense matches. The sparse-to-dense interpolation relies on an appropriate choice of the distance, namely an edge-aware geodesic distance. This distance is tailored to handle occlusions and motion boundaries - two common and difficult issues for optical flow computation. We also propose an approximation scheme for the geodesic distance to allow fast computation without loss of performance. Subsequent to the dense interpolation step, standard one-level variational energy minimization is carried out on the dense matches to obtain the final flow estimation. The proposed approach, called Edge-Preserving Interpolation of Correspondences (EpicFlow) is fast and robust to large displacements. It significantly outperforms the state of the art on MPI-Sintel and performs on par on Kitti and Middlebury.

LDpred2: better, faster, stronger

Florian Privé, Julyan Arbel, Bjarni J. Vilhjálmsson

2020 · Bioinformatics · 755 citations · doi:10.1093/bioinformatics/btaa1029

MOTIVATION: Polygenic scores have become a central tool in human genetics research. LDpred is a popular method for deriving polygenic scores based on summary statistics and a matrix of correlation between genetic variants. However, LDpred has limitations that may reduce its predictive performance. RESULTS: Here, we present LDpred2, a new version of LDpred that addresses these issues. We also provide two new options in LDpred2: a 'sparse' option that can learn effects that are exactly 0, and an 'auto' option that directly learns the two LDpred parameters from data. We benchmark predictive performance of LDpred2 against the previous version on simulated and real data, demonstrating substantial improvements in robustness and predictive accuracy compared to LDpred1. We then show that LDpred2 also outperforms other polygenic score methods recently developed, with a mean AUC over the 8 real traits analyzed here of 65.1%, compared to 63.8% for lassosum, 62.9% for PRS-CS and 61.5% for SBayesR. Note that LDpred2 provides more accurate polygenic scores when run genome-wide, instead of per chromosome. AVAILABILITY AND IMPLEMENTATION: LDpred2 is implemented in R package bigsnpr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

SpectralGPT: Spectral Remote Sensing Foundation Model

Danfeng Hong, Bing Zhang, Xuyang Li, Yuxuan Li +4 more

2024 · IEEE Transactions on Pattern Analysis and Machine Intelligence · 701 citations · doi:10.1109/tpami.2024.3362475

The foundation model has recently garnered significant attention due to its potential to revolutionize the field of visual representation learning in a self-supervised manner. While most foundation models are tailored to effectively process RGB images for various visual tasks, there is a noticeable gap in research focused on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we created for the first time a universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT). Compared to existing foundation models, SpectralGPT 1) accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS Big Data; 2) leverages 3D token generation for spatial-spectral coupling; 3) captures spectrally sequential patterns via multi-target reconstruction; and 4) trains on one million spectral RS images, yielding models with over 600 million parameters. Our evaluation highlights significant performance improvements with pretrained SpectralGPT models, signifying substantial potential in advancing spectral RS Big Data applications within the field of geoscience across four downstream tasks: single/multi-label scene classification, semantic segmentation, and change detection.

TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation

Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, Cordelia Schmid

2009 · 699 citations · doi:10.1109/iccv.2009.5459266

Image auto-annotation is an important open problem in computer vision. For this task we propose TagProp, a discriminatively trained nearest neighbor model. Tags of test images are predicted using a weighted nearest-neighbor model to exploit labeled training images. Neighbor weights are based on neighbor rank or distance. TagProp allows the integration of metric learning by directly maximizing the log-likelihood of the tag predictions in the training set. In this manner, we can optimally combine a collection of image similarity metrics that cover different aspects of image content, such as local shape descriptors, or global color histograms. We also introduce a word specific sigmoidal modulation of the weighted neighbor tag predictions to boost the recall of rare words. We investigate the performance of different variants of our model and compare to existing work. We present experimental results for three challenging data sets. On all three, TagProp makes a marked improvement as compared to the current state-of-the-art.

Time-Frequency Reassignment and Synchrosqueezing: An Overview

François Auger, Patrick Flandrin, Yu-Ting Lin, Stephen McLaughlin +3 more

2013 · IEEE Signal Processing Magazine · 614 citations · doi:10.1109/msp.2013.2265316

This article provides a general overview of time-frequency (T-F) reassignment and synchrosqueezing techniques applied to multicomponent signals, covering the theoretical background and applications. We explain how synchrosqueezing can be viewed as a special case of reassignment enabling mode reconstruction and place emphasis on the interest of using such T-F distributions throughout with illustrative examples.

The Future of Sensitivity Analysis: An essential discipline for systems modeling and policy support

Saman Razavi, Anthony J. Jakeman, Andrea Saltelli, Clémentine Prieur +4 more

2020 · Environmental Modelling & Software · 585 citations · doi:10.1016/j.envsoft.2020.104954

Sensitivity analysis (SA) is en route to becoming an integral part of mathematical modeling. The tremendous potential benefits of SA are, however, yet to be fully realized, both for advancing mechanistic and data-driven modeling of human and natural systems, and in support of decision making. In this perspective paper, a multidisciplinary group of researchers and practitioners revisit the current status of SA, and outline research challenges in regard to both theoretical frameworks and their applications to solve real-world problems. Six areas are discussed that warrant further attention, including (1) structuring and standardizing SA as a discipline, (2) realizing the untapped potential of SA for systems modeling, (3) addressing the computational burden of SA, (4) progressing SA in the context of machine learning, (5) clarifying the relationship and role of SA to uncertainty quantification, and (6) evolving the use of SA in support of decision making. An outlook for the future of SA is provided that underlines how SA must underpin a wide variety of activities to better serve science and society.

The Antarctic ice core chronology (AICC2012): an optimized multi-parameter and multi-site dating approach for the last 120 thousand years

Daniel Vereş, Lucie Bazin, Amaëlle Landais, Habib Toye +4 more

2013 · Climate of the Past · 570 citations · doi:10.5194/cp-9-1733-2013

Abstract. The deep polar ice cores provide reference records commonly employed in global correlation of past climate events. However, temporal divergences reaching up to several thousand years (ka) exist between ice cores over the last climatic cycle. In this context, we are hereby introducing the Antarctic Ice Core Chronology 2012 (AICC2012), a new and coherent timescale developed for four Antarctic ice cores, namely Vostok, EPICA Dome C (EDC), EPICA Dronning Maud Land (EDML) and Talos Dome (TALDICE), alongside the Greenlandic NGRIP record. The AICC2012 timescale has been constructed using the Bayesian tool Datice (Lemieux-Dudon et al., 2010) that combines glaciological inputs and data constraints, including a wide range of relative and absolute gas and ice stratigraphic markers. We focus here on the last 120 ka, whereas the companion paper by Bazin et al. (2013) focuses on the interval 120–800 ka. Compared to previous timescales, AICC2012 presents an improved timing for the last glacial inception, respecting the glaciological constraints of all analyzed records. Moreover, with the addition of numerous new stratigraphic markers and improved calculation of the lock-in depth (LID) based on δ15N data employed as the Datice background scenario, the AICC2012 presents a slightly improved timing for the bipolar sequence of events over Marine Isotope Stage 3 associated with the seesaw mechanism, with maximum differences of about 600 yr with respect to the previous Datice-derived chronology of Lemieux-Dudon et al. (2010), hereafter denoted LD2010. Our improved scenario confirms the regional differences for the millennial scale variability over the last glacial period: while the EDC isotopic record (events of triangular shape) displays peaks roughly at the same time as the NGRIP abrupt isotopic increases, the EDML isotopic record (events characterized by broader peaks or even extended periods of high isotope values) reached the isotopic maximum several centuries before. 
It is expected that the future contribution of both other long ice core records and other types of chronological constraints to the Datice tool will lead to further refinements in the ice core chronologies beyond the AICC2012 chronology. For the time being however, we recommend that AICC2012 be used as the preferred chronology for the Vostok, EDC, EDML and TALDICE ice core records, both over the last glacial cycle (this study), and beyond (following Bazin et al., 2013). The ages for NGRIP in AICC2012 are virtually identical to those of GICC05 for the last 60.2 ka, whereas the ages beyond are independent of those in GICC05modelext (as in the construction of AICC2012, the GICC05modelext was included only via the background scenarios and not as age markers). As such, where issues of phasing between Antarctic records included in AICC2012 and NGRIP are involved, the NGRIP ages in AICC2012 should therefore be taken to avoid introducing false offsets. However for issues involving only Greenland ice cores, there is not yet a strong basis to recommend superseding GICC05modelext as the recommended age scale for Greenland ice cores.

An optimized multi-proxy, multi-site Antarctic ice and gas orbital chronology (AICC2012): 120–800 ka

Lucie Bazin, Amaëlle Landais, B. Lemieux-Dudon, Habib Toye +4 more

2013 · Climate of the Past · 513 citations · doi:10.5194/cp-9-1715-2013

Abstract. An accurate and coherent chronological framework is essential for the interpretation of climatic and environmental records obtained from deep polar ice cores. Until now, one common ice core age scale had been developed based on an inverse dating method (Datice), combining glaciological modelling with absolute and stratigraphic markers between 4 ice cores covering the last 50 ka (thousands of years before present) (Lemieux-Dudon et al., 2010). In this paper, together with the companion paper of Veres et al. (2013), we present an extension of this work back to 800 ka for the NGRIP, TALDICE, EDML, Vostok and EDC ice cores using an improved version of the Datice tool. The AICC2012 (Antarctic Ice Core Chronology 2012) chronology includes numerous new gas and ice stratigraphic links as well as improved evaluation of background and associated variance scenarios. This paper concentrates on the long timescales between 120–800 ka. In this framework, new measurements of δ18Oatm over Marine Isotope Stage (MIS) 11–12 on EDC and a complete δ18Oatm record of the TALDICE ice cores permit us to derive additional orbital gas age constraints. The coherency of the different orbitally deduced ages (from δ18Oatm, δO2/N2 and air content) has been verified before implementation in AICC2012. The new chronology is now independent of other archives and shows only small differences, most of the time within the original uncertainty range calculated by Datice, when compared with the previous ice core reference age scale EDC3, the Dome F chronology, or using a comparison between speleothems and methane. For instance, the largest deviation between AICC2012 and EDC3 (5.4 ka) is obtained around MIS 12. Despite significant modifications of the chronological constraints around MIS 5, now independent of speleothem records in AICC2012, the date of Termination II is very close to the EDC3 one.

Face recognition based on image sets

Hakan Çevıkalp, Bill Triggs

2010 · 505 citations · doi:10.1109/cvpr.2010.5539965

We introduce a novel method for face recognition from image sets. In our setting each test and training example is a set of images of an individual's face, not just a single image, so recognition decisions need to be based on comparisons of image sets. Methods for this have two main aspects: the models used to represent the individual image sets; and the similarity metric used to compare the models. Here, we represent images as points in a linear or affine feature space and characterize each image set by a convex geometric region (the affine or convex hull) spanned by its feature points. Set dissimilarity is measured by geometric distances (distances of closest approach) between convex models. To reduce the influence of outliers we use robust methods to discard input points that are far from the fitted model. The kernel trick allows the approach to be extended to implicit feature mappings, thus handling complex and nonlinear manifolds of face images. Experiments on two public face datasets show that our proposed methods outperform a number of existing state-of-the-art ones.

High-Order Synchrosqueezing Transform for Multicomponent Signals Analysis—With an Application to Gravitational-Wave Signal

Duong-Hung Pham, Sylvain Meignen

2017 · IEEE Transactions on Signal Processing · 503 citations · doi:10.1109/tsp.2017.2686355

This paper puts forward a generalization of the short-time Fourier-based synchrosqueezing transform using a new local estimate of instantaneous frequency. Such a technique enables not only to achieve a highly concentrated time-frequency representation for a wide variety of amplitude- and frequency-modulated multicomponent signals but also to reconstruct their modes with a high accuracy. Numerical investigation on synthetic and gravitational-wave signals shows the efficiency of this new approach.

Action Recognition from Arbitrary Views using 3D Exemplars

Daniel Weinland, Edmond Boyer, Rémi Ronfard

2007 · 449 citations · doi:10.1109/iccv.2007.4408849

In this paper, we address the problem of learning compact, view-independent, realistic 3D models of human actions recorded with multiple cameras, for the purpose of recognizing those same actions from a single or few cameras, without prior knowledge about the relative orientations between the cameras and the subjects. To this aim, we propose a new framework where we model actions using three dimensional occupancy grids, built from multiple viewpoints, in an exemplar-based HMM. The novelty is, that a 3D reconstruction is not required during the recognition phase, instead learned 3D exemplars are used to produce 2D image information that is compared to the observations. Parameters that describe image projections are added as latent variables in the recognition process. In addition, the temporal Markov dependency applied to view parameters allows them to evolve during recognition as with a smoothly moving camera. The effectiveness of the framework is demonstrated with experiments on real datasets and with challenging recognition scenarios.
