Laboratoire d'Informatique, du Traitement de l'Information et des Systèmes
Facility: Saint-Étienne-du-Rouvray, France
Research output, citation impact, and the most-cited recent papers from Laboratoire d'Informatique, du Traitement de l'Information et des Systèmes (France). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Laboratoire d'Informatique, du Traitement de l'Information et des Systèmes
The problem of testing statistical hypotheses is an old one. Its origin is usually connected with the name of Thomas Bayes, who gave the well-known theorem on the probabilities a posteriori of the possible “causes” of a given event. Since then it has been discussed by many writers of whom we shall here mention two only, Bertrand and Borel, whose differing views serve well to illustrate the point from which we shall approach the subject. Bertrand put into statistical form a variety of hypotheses, as for example the hypothesis that a given group of stars with relatively small angular distances between them as seen from the earth, form a “system” or group in space. His method of attack, which is that in common use, consisted essentially in calculating the probability, P, that a certain character, x, of the observed facts would arise if the hypothesis tested were true. If P were very small, this would generally be considered as an indication that the hypothesis, H, was probably false, and vice versa. Bertrand expressed the pessimistic view that no test of this kind could give reliable results. Borel, however, in a later discussion, considered that the method described could be applied with success provided that the character, x, of the observed facts were properly chosen—were, in fact, a character which he terms “en quelque sorte remarquable.”
OBJECTIVE: Most current electroencephalography (EEG)-based brain-computer interfaces (BCIs) are based on machine learning algorithms. There is a large diversity of classifier types that are used in this field, as described in our 2007 review paper. Now, approximately ten years after this review publication, many new algorithms have been developed and tested to classify EEG signals in BCIs. The time is therefore ripe for an updated review of EEG classification algorithms for BCIs. APPROACH: We surveyed the BCI and machine learning literature from 2007 to 2017 to identify the new classification approaches that have been investigated to design BCIs. We synthesize these studies in order to present such algorithms, to report how they were used for BCIs, what were the outcomes, and to identify their pros and cons. MAIN RESULTS: We found that the recently designed classification algorithms for EEG-based BCIs can be divided into four main categories: adaptive classifiers, matrix and tensor classifiers, transfer learning and deep learning, plus a few other miscellaneous classifiers. Among these, adaptive classifiers were demonstrated to be generally superior to static ones, even with unsupervised adaptation. Transfer learning can also prove useful although the benefits of transfer learning remain unpredictable. Riemannian geometry-based methods have reached state-of-the-art performances on multiple BCI problems and deserve to be explored more thoroughly, along with tensor-based methods. Shrinkage linear discriminant analysis and random forests also appear particularly useful for small training samples settings. On the other hand, deep learning methods have not yet shown convincing improvement over state-of-the-art BCI methods. SIGNIFICANCE: This paper provides a comprehensive overview of the modern classification algorithms used in EEG-based BCIs, presents the principles of these methods and guidelines on when and how to use them. It also identifies a number of challenges to further advance EEG classification in BCI.
Today, medical image analysis papers require solid experiments to prove the usefulness of proposed methods. However, experiments are often performed on data selected by the researchers, which may come from different institutions, scanners, and populations. Different evaluation measures may be used, making it difficult to compare the methods. In this paper, we introduce a dataset of 7909 breast cancer histopathology images acquired from 82 patients, which is now publicly available from http://web.inf.ufpr.br/vri/breast-cancer-database. The dataset includes both benign and malignant images. The task associated with this dataset is the automated classification of these images into two classes, which would be a valuable computer-aided diagnosis tool for the clinician. In order to assess the difficulty of this task, we show some preliminary results obtained with state-of-the-art image classification systems. The accuracy ranges from 80% to 85%, showing that room for improvement remains. By providing this dataset and a standardized evaluation protocol to the scientific community, we hope to gather researchers in both the medical and the machine learning fields to advance toward this clinical application.
BACKGROUND: Disorders affecting the nervous system are diverse and include neurodevelopmental disorders, late-life neurodegeneration, and newly emergent conditions, such as cognitive impairment following COVID-19. Previous publications from the Global Burden of Disease, Injuries, and Risk Factor Study estimated the burden of 15 neurological conditions in 2015 and 2016, but these analyses did not include neurodevelopmental disorders, as defined by the International Classification of Diseases (ICD)-11, or a subset of cases of congenital, neonatal, and infectious conditions that cause neurological damage. Here, we estimate nervous system health loss caused by 37 unique conditions and their associated risk factors globally, regionally, and nationally from 1990 to 2021. METHODS: We estimated mortality, prevalence, years lived with disability (YLDs), years of life lost (YLLs), and disability-adjusted life-years (DALYs), with corresponding 95% uncertainty intervals (UIs), by age and sex in 204 countries and territories, from 1990 to 2021. We included morbidity and deaths due to neurological conditions, for which health loss is directly due to damage to the CNS or peripheral nervous system. We also isolated neurological health loss from conditions for which nervous system morbidity is a consequence, but not the primary feature, including a subset of congenital conditions (ie, chromosomal anomalies and congenital birth defects), neonatal conditions (ie, jaundice, preterm birth, and sepsis), infectious diseases (ie, COVID-19, cystic echinococcosis, malaria, syphilis, and Zika virus disease), and diabetic neuropathy. By conducting a sequela-level analysis of the health outcomes for these conditions, only cases where nervous system damage occurred were included, and YLDs were recalculated to isolate the non-fatal burden directly attributable to nervous system health loss. A comorbidity correction was used to calculate total prevalence of all conditions that affect the nervous system combined. FINDINGS: Globally, the 37 conditions affecting the nervous system were collectively ranked as the leading group cause of DALYs in 2021 (443 million, 95% UI 378-521), affecting 3·40 billion (3·20-3·62) individuals (43·1%, 40·5-45·9 of the global population); global DALY counts attributed to these conditions increased by 18·2% (8·7-26·7) between 1990 and 2021. Age-standardised rates of deaths per 100 000 people attributed to these conditions decreased from 1990 to 2021 by 33·6% (27·6-38·8), and age-standardised rates of DALYs attributed to these conditions decreased by 27·0% (21·5-32·4). Age-standardised prevalence was almost stable, with a change of 1·5% (0·7-2·4). The ten conditions with the highest age-standardised DALYs in 2021 were stroke, neonatal encephalopathy, migraine, Alzheimer's disease and other dementias, diabetic neuropathy, meningitis, epilepsy, neurological complications due to preterm birth, autism spectrum disorder, and nervous system cancer. INTERPRETATION: As the leading cause of overall disease burden in the world, with increasing global DALY counts, effective prevention, treatment, and rehabilitation strategies for disorders affecting the nervous system are needed. FUNDING: Bill & Melinda Gates Foundation.
The performance of most conventional classification systems relies on appropriate data representation, and much of the effort is dedicated to feature engineering, a difficult and time-consuming process that uses prior expert domain knowledge of the data to create useful features. On the other hand, deep learning can extract and organize the discriminative information from the data, without requiring the design of feature extractors by a domain expert. Convolutional Neural Networks (CNNs) are a particular type of deep, feedforward network that has gained attention from the research community and industry, achieving empirical successes in tasks such as speech recognition, signal processing, object recognition, natural language processing and transfer learning. In this paper, we conduct some preliminary experiments using the deep learning approach to classify breast cancer histopathological images from BreaKHis, a publicly available dataset at http://web.inf.ufpr.br/vri/breast-cancer-database. We propose a method based on the extraction of image patches for training the CNN and the combination of these patches for final classification. This method aims to allow the high-resolution histopathological images from BreaKHis to be used as input to an existing CNN, avoiding adaptations of the model that can lead to a more complex and computationally costly architecture. The CNN performance is better than previously reported results obtained by other machine learning models trained with hand-crafted textural descriptors. Finally, we also investigate the combination of different CNNs using simple fusion rules, achieving some improvement in recognition rates.
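The patch-combination step described in this abstract can be illustrated with a minimal sketch. It assumes that each patch yields a vector of class probabilities and that these are fused with the sum, product, or max rule; the abstract does not state which rules the authors used, and the function name `fuse_patches` and the sample probabilities are hypothetical.

```python
from math import prod


def fuse_patches(patch_probs, rule="sum"):
    """Combine per-patch class probabilities into one image-level decision.

    patch_probs: list of probability vectors, one per patch (one entry per class).
    rule: "sum", "product", or "max" (classic classifier-combination rules).
    Returns (predicted_class_index, fused_scores).
    """
    if not patch_probs:
        raise ValueError("need at least one patch")
    per_class = list(zip(*patch_probs))  # regroup the scores class by class
    if rule == "sum":
        scores = [sum(c) for c in per_class]
    elif rule == "product":
        scores = [prod(c) for c in per_class]
    elif rule == "max":
        scores = [max(c) for c in per_class]
    else:
        raise ValueError(f"unknown rule: {rule}")
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical CNN outputs for three patches of one image:
# each vector is [P(benign), P(malignant)].
patches = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]]
label_sum, _ = fuse_patches(patches, "sum")
label_product, _ = fuse_patches(patches, "product")
label_max, _ = fuse_patches(patches, "max")
```

On this toy input the sum and product rules pick class 1 (malignant), while the max rule follows the single most confident patch and picks class 0, which shows why the choice of rule can matter.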
Domain adaptation is one of the most challenging tasks of modern data analytics. If the adaptation is done correctly, models built on a specific data representation become more robust when confronted with data depicting the same classes, but described by another observation system. Among the many strategies proposed, finding domain-invariant representations has shown excellent properties, in particular since it allows training a single classifier that is effective in all domains. In this paper, we propose a regularized unsupervised optimal transportation model to perform the alignment of the representations in the source and target domains. We learn a transportation plan matching both PDFs, which constrains labeled samples of the same class in the source domain to remain close during transport. This way, we exploit at the same time the labeled samples in the source and the distributions observed in both domains. Experiments on toy and challenging real visual adaptation examples show the interest of the method, which consistently outperforms state-of-the-art approaches. In addition, numerical experiments show that our approach leads to better performance on domain-invariant deep learning features and can be easily adapted to the semi-supervised case where few labeled samples are available in the target domain.
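The idea of learning a transportation plan between source and target distributions can be sketched as follows. This is a simplified illustration, not the authors' model: it uses only entropic regularization (Sinkhorn iterations) and omits the class-based regularizer that keeps same-class source samples together. The function names, the toy data, and the values of `reg` and `n_iter` are assumptions.

```python
from math import exp


def sinkhorn(a, b, cost, reg=0.5, n_iter=500):
    """Entropy-regularized optimal transport plan between histograms a and b."""
    n, m = len(a), len(b)
    kernel = [[exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns so the plan's marginals match a and b.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def barycentric_map(plan, targets):
    """Move each source sample to the plan-weighted average of the target samples."""
    dim = len(targets[0])
    mapped = []
    for row in plan:
        mass = sum(row)
        mapped.append(
            [sum(row[j] * targets[j][d] for j in range(len(targets))) / mass
             for d in range(dim)]
        )
    return mapped


# Toy 1-D example: the target domain is the source domain shifted by +2.
sources = [[0.0], [1.0]]
targets = [[2.0], [3.0]]
a = [0.5, 0.5]  # uniform weights on the source samples
b = [0.5, 0.5]  # uniform weights on the target samples
cost = [[(s[0] - t[0]) ** 2 for t in targets] for s in sources]

plan = sinkhorn(a, b, cost)
mapped = barycentric_map(plan, targets)
```

After transport, the mapped source samples lie inside the target range and keep their original order, so a classifier trained on them would operate in the target representation.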
Neuroimagery findings have shown similar cerebral networks associated with imagination and execution of a movement. On the other hand, neuropsychological studies of parietal-lesioned patients suggest that these networks may be at least partly distinct. In the present study, normal subjects were asked to either imagine or execute auditory-cued hand movements. Compared with rest, imagination and execution showed overlapping networks, including bilateral premotor and parietal areas, basal ganglia and cerebellum. However, direct comparison between the two experimental conditions showed that specific cortico-subcortical areas were more engaged in mental simulation, including bilateral premotor, prefrontal, supplementary motor and left posterior parietal areas, and the caudate nuclei. These results suggest that a specific neuronal substrate is involved in the processing of hand motor representations.
Multi-modality is widely used in medical imaging because it can provide multiple sources of information about a target (tumor, organ or tissue). Segmentation using multi-modality consists of fusing this information to improve the segmentation. Recently, deep learning-based approaches have achieved state-of-the-art performance in image classification, segmentation, object detection and tracking tasks. Due to their self-learning and generalization ability over large amounts of data, deep learning approaches have also recently gained great interest in multi-modal medical image segmentation. In this paper, we give an overview of deep learning-based approaches for the multi-modal medical image segmentation task. Firstly, we introduce the general principles of deep learning and multi-modal medical image segmentation. Secondly, we present different deep learning network architectures, then analyze their fusion strategies and compare their results. Early fusion is commonly used, since it is simple and shifts the focus to the subsequent segmentation network architecture. Late fusion, however, pays more attention to the fusion strategy in order to learn the complex relationships between different modalities. In general, compared to early fusion, late fusion can give more accurate results if the fusion method is effective enough. We also discuss some common problems in medical image segmentation. Finally, we summarize and provide some perspectives on future research.
The book is intended for lectures on string processes and pattern matching in Master's courses of computer science and software engineering curricula. The details of algorithms are given with correctness proofs and complexity analysis, which make them ready to implement. Algorithms are described in a C-like language. The book is also a reference for students in computational linguistics or computational biology. It presents examples of questions related to the automatic processing of natural language, to the analysis of molecular sequences, and to the management of textual databases.
The brain-computer interface (BCI) P300 speller aims to help patients who are unable to activate their muscles to spell words by means of their brain signal activity. This BCI paradigm raises the problem of classifying electroencephalogram signals related to responses to visual stimuli. This paper addresses the problem of signal response variability within a single subject in such a brain-computer interface. We propose a method that copes with this variability through an ensemble-of-classifiers approach. Each classifier is composed of a linear support vector machine trained on a small part of the available data and for which a channel selection procedure has been performed. The performance of our algorithm has been evaluated on dataset II of the BCI Competition III, where it yielded the best performance of the competition.
Immunotherapy is becoming a standard of care for many cancers. Immune-checkpoint inhibitors (ICI) can generate immune-related adverse events. Interstitial lung disease (ILD) has been identified as a rare but potentially severe event. Between December 2015 and April 2016, we conducted a retrospective study in centres experienced in ICI use. We report the main features of ICI–ILD with a focus on clinical presentation, radiological patterns and therapeutic strategies. We identified 64 (3.5%) out of 1826 cancer patients with ICI–ILD. Patients mainly received programmed cell death-1 inhibitors. ILD usually occurred in males, and former or current smokers, with a median age of 59 years. We observed 65.6% grade 2/3 severity, 9.4% grade 4 severity and 9.4% fatal ILD. The median (range) time from initiation of immunotherapy to ILD was 2.3 (0.2−27.4) months. Onset tended to occur earlier in lung cancer versus melanoma: median 2.1 and 5.2 months, respectively (p=0.02). Ground-glass opacities (81.3%) were the predominant lesions, followed by consolidations (53.1%). Organising pneumonia (23.4%) and hypersensitivity pneumonitis (15.6%) were the most common patterns. Overall survival at 6 months was 58.1% (95% CI 37.7–73.8%). ICI–ILD often occurs early and displays suggestive radiological features. As there is no clearly identified risk factor, oncologists need to diagnose and adequately treat this adverse event.
Motion capture setups are used in numerous fields. Studies based on motion capture data can be found in biomechanical, sport or animal science. Clinical science studies include gait analysis as well as balance, posture and motor control. Robotic applications encompass object tracking. Everyday applications include entertainment and augmented reality. Still, few studies investigate the positioning performance of motion capture setups. In this paper, we study the positioning performance of one player in marker-based optoelectronic motion capture: the Vicon system. Our protocol includes evaluations of static and dynamic performance. Mean error as well as positioning variability are studied with calibrated ground truth setups that are not based on other motion capture modalities. We introduce a new setup that enables directly estimating the absolute positioning accuracy for dynamic experiments, contrary to state-of-the-art works that rely on inter-marker distances. The system performs well on static experiments, with a mean absolute error of 0.15 mm and a variability lower than 0.025 mm. Our dynamic experiments were carried out at speeds found in real applications. Our work suggests that the system error is less than 2 mm. We also found that marker size and Vicon sampling rate must be carefully chosen with respect to the speed encountered in the application in order to reach optimal positioning performance, which can be as low as 0.3 mm in our dynamic study.
Although promising for numerous applications, current brain-computer interfaces (BCIs) still suffer from a number of limitations. In particular, they are sensitive to noise, outliers and the non-stationarity of electroencephalographic (EEG) signals, they require long calibration times, and they are not reliable. Thus, new approaches and tools, notably at the EEG signal processing and classification level, are necessary to address these limitations. Riemannian approaches, spearheaded by the use of covariance matrices, are one such very promising tool, slowly being adopted by a growing number of researchers. This article, after a quick introduction to Riemannian geometry and a presentation of the BCI-relevant manifolds, reviews how these approaches have been used for EEG-based BCI, in particular for feature representation and learning, classifier design and calibration time reduction. Finally, relevant challenges and promising research directions for EEG signal classification in BCIs are identified, such as feature tracking on manifolds or multi-task learning.
Probabilistic inversion within a multiple‐point statistics framework is often computationally prohibitive for high‐dimensional problems. To partly address this, we introduce and evaluate a new training‐image based inversion approach for complex geologic media. Our approach relies on a deep neural network of the generative adversarial network (GAN) type. After training using a training image (TI), our proposed spatial GAN (SGAN) can quickly generate 2‐D and 3‐D unconditional realizations. A key characteristic of our SGAN is that it defines a (very) low‐dimensional parameterization, thereby allowing for efficient probabilistic inversion using state‐of‐the‐art Markov chain Monte Carlo (MCMC) methods. In addition, available direct conditioning data can be incorporated within the inversion. Several 2‐D and 3‐D categorical TIs are first used to analyze the performance of our SGAN for unconditional geostatistical simulation. Training our deep network can take several hours. After training, realizations containing a few million pixels/voxels can be produced in a matter of seconds. This makes it especially useful for simulating many thousands of realizations (e.g., for MCMC inversion) as the relative cost of the training per realization diminishes with the considered number of realizations. Synthetic inversion case studies involving 2‐D steady state flow and 3‐D transient hydraulic tomography with and without direct conditioning data are used to illustrate the effectiveness of our proposed SGAN‐based inversion. For the 2‐D case, the inversion rapidly explores the posterior model distribution. For the 3‐D case, the inversion recovers model realizations that fit the data close to the target level and visually resemble the true model well.
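Why a low-dimensional parameterization helps MCMC can be sketched with a toy example: the sampler explores a short latent vector, and a generator turns each latent vector into a full model before the forward simulation. Everything below is a hypothetical stand-in. `generator` is a fixed analytic function rather than a trained SGAN, `forward` is not a flow solver, and the sampler is plain random-walk Metropolis rather than the state-of-the-art MCMC methods the abstract refers to.

```python
import math
import random


def generator(z):
    """Hypothetical stand-in for a trained generator: latent vector -> model field."""
    return [math.tanh(z[0] + 0.5 * k * z[1]) for k in range(8)]


def forward(model):
    """Hypothetical forward solver: two averaged 'measurements' of the field."""
    half = len(model) // 2
    return [sum(model[:half]) / half, sum(model[half:]) / half]


def log_posterior(z, data, sigma=0.05):
    """Standard-normal prior on the latent vector plus a Gaussian data misfit."""
    log_prior = -0.5 * sum(v * v for v in z)
    simulated = forward(generator(z))
    log_like = -0.5 * sum(((s - d) / sigma) ** 2 for s, d in zip(simulated, data))
    return log_prior + log_like


def metropolis(data, n_steps=2000, step=0.1, seed=0):
    """Random-walk Metropolis sampling in the 2-D latent space."""
    rng = random.Random(seed)
    z = [0.0, 0.0]
    lp = log_posterior(z, data)
    chain, accepted = [list(z)], 0
    for _ in range(n_steps):
        proposal = [v + rng.gauss(0.0, step) for v in z]
        lp_new = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            z, lp = proposal, lp_new
            accepted += 1
        chain.append(list(z))
    return chain, accepted / n_steps


# Synthetic data produced from a known latent vector (assumed for illustration).
true_z = [0.3, 0.2]
data = forward(generator(true_z))
chain, acceptance_rate = metropolis(data)
```

The chain only ever moves in two dimensions, however large the generated field is, which is the property the abstract credits for making the inversion efficient.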
BACKGROUND: In searches for clinical trials and systematic reviews, it is said that Google Scholar (GS) should never be used in isolation, but in addition to PubMed, Cochrane, and other trusted sources of information. We therefore performed a study to assess the coverage of GS specifically for the studies included in systematic reviews and evaluate if GS was sensitive enough to be used alone for systematic reviews. METHODS: All the original studies included in 29 systematic reviews published in the Cochrane Database Syst Rev or in the JAMA in 2009 were gathered in a gold standard database. GS was searched for all these studies one by one to assess the percentage of studies which could have been identified by searching only GS. RESULTS: All the 738 original studies included in the gold standard database were retrieved in GS (100%). CONCLUSION: The coverage of GS for the studies included in the systematic reviews is 100%. If the authors of the 29 systematic reviews had used only GS, no reference would have been missed. With some improvement in the research options, to increase its precision, GS could become the leading bibliographic database in medicine and could be used alone for systematic reviews.
Breast cancer (BC) is a deadly disease, killing millions of people every year. Developing automated malignant BC detection systems applied to patients' imagery can help deal with this problem more efficiently, making diagnosis more scalable and less prone to errors. No less importantly, this kind of research can be extended to other types of cancer, making an even greater impact in helping to save lives. Recent results on BC recognition show that Convolutional Neural Networks (CNNs) can achieve higher recognition rates than hand-crafted feature descriptors, but the price to pay is an increase in the complexity of developing the system, requiring longer training time and specific expertise to fine-tune the architecture of the CNN. DeCAF (or deep) features are an in-between solution: a previously trained CNN is reused only as a feature extractor, and its output feature vectors are then used as input for a classifier trained only for the new classification task. In light of this, we present an evaluation of DeCAF features for BC recognition, in order to better understand how they compare to the other approaches. The experimental evaluation shows that these features can be a viable alternative for the fast development of high-accuracy BC recognition systems, generally achieving better results than traditional hand-crafted textural descriptors and outperforming task-specific CNNs in some cases.
This paper considers the problem of recovering a sparse signal representation according to a signal dictionary. This problem could be formalized as a penalized least-squares problem in which sparsity is usually induced by an ℓ1-norm penalty on the coefficients. Such an approach known as the Lasso or Basis Pursuit Denoising has been shown to perform reasonably well in some situations. However, it was also proved that nonconvex penalties like the pseudo ℓq-norm with q < 1 or smoothly clipped absolute deviation (SCAD) penalty are able to recover sparsity in a more efficient way than the Lasso. Several algorithms have been proposed for solving the resulting nonconvex least-squares problem. This paper proposes a generic algorithm to address such a sparsity recovery problem for some class of nonconvex penalties. Our main contribution is that the proposed methodology is based on an iterative algorithm which solves at each iteration a convex weighted Lasso problem. It relies on the family of nonconvex penalties which can be decomposed as a difference of convex functions (DC). This allows us to apply DC programming which is a generic and principled way for solving nonsmooth and nonconvex optimization problem. We also show that several algorithms in the literature dealing with nonconvex penalties are particular instances of our algorithm. Experimental results demonstrate the effectiveness of the proposed generic framework compared to existing algorithms, including iterative reweighted least-squares methods.
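The scheme of solving a convex weighted Lasso problem at each iteration can be sketched as below. This is an illustration under stated assumptions, not the paper's implementation: it assumes the log penalty g(t) = λ log(t + ε), whose derivative λ / (t + ε) gives the weights; it starts from a plain Lasso solve; and it uses coordinate descent as the inner solver. The function names and the values of `lam`, `eps`, `n_outer` and `n_sweeps` are all hypothetical choices.

```python
def soft_threshold(z, t):
    """Proximal operator of t * |x|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(A, y, weights, n_sweeps=100):
    """Coordinate descent for min_x 0.5 * ||y - A x||^2 + sum_j weights[j] * |x_j|."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    residual = list(y)  # equals y - A x for the initial x = 0
    col_sq = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]
    for _ in range(n_sweeps):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes x_j's contribution.
            rho = sum(A[i][j] * residual[i] for i in range(m)) + col_sq[j] * x[j]
            new_xj = soft_threshold(rho, weights[j]) / col_sq[j]
            delta = new_xj - x[j]
            if delta != 0.0:
                for i in range(m):
                    residual[i] -= A[i][j] * delta
                x[j] = new_xj
    return x


def dc_reweighted_lasso(A, y, lam, eps=0.1, n_outer=5):
    """Outer loop: linearize the concave part of the penalty, then solve a weighted Lasso."""
    n = len(A[0])
    weights = [lam] * n  # first pass is a plain Lasso (an initialization choice made here)
    x = [0.0] * n
    for _ in range(n_outer):
        x = weighted_lasso(A, y, weights)
        # Derivative of the assumed log penalty at the current |x_j|.
        weights = [lam / (abs(v) + eps) for v in x]
    return x


# Toy problem with an identity dictionary, so the effect of the weights is easy to see.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, 0.1, 0.0]
x_lasso = weighted_lasso(A, y, [0.5, 0.5, 0.5])
x_dc = dc_reweighted_lasso(A, y, lam=0.5)
```

On this toy input both solutions have the same support, but the reweighted solution shrinks the large coefficient less than the plain Lasso does, which is the usual motivation for nonconvex penalties.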
Visual surveillance of dynamic objects, particularly vehicles on the road, has been, over the past decade, an active research topic in computer vision and intelligent transportation systems communities. In the context of traffic monitoring, important advances have been achieved in environment modeling, vehicle detection, tracking, and behavior analysis. This paper is a survey that addresses particularly the issues related to vehicle monitoring with cameras at road intersections. In fact, the latter has variable architectures and represents a critical area in traffic. Accidents at intersections are extremely dangerous, and most of them are caused by drivers' errors. Several projects have been carried out to enhance the safety of drivers in the special context of intersections. In this paper, we provide an overview of vehicle perception systems at road intersections and representative related data sets. The reader is then given an introductory overview of general vision-based vehicle monitoring approaches. Subsequently and above all, we present a review of studies related to vehicle detection and tracking in intersection-like scenarios. Regarding intersection monitoring, we distinguish and compare roadside (pole-mounted, stationary) and in-vehicle (mobile platforms) systems. Then, we focus on camera-based roadside monitoring systems, with special attention to omnidirectional setups. Finally, we present possible research directions that are likely to improve the performance of vehicle detection and tracking at intersections.
Deep learning has become a popular tool for medical image analysis, but the limited availability of training data remains a major challenge, particularly in the medical field where data acquisition can be costly and subject to privacy regulations. Data augmentation techniques offer a solution by artificially increasing the number of training samples, but these techniques often produce limited and unconvincing results. To address this issue, a growing number of studies have proposed the use of deep generative models to generate more realistic and diverse data that conform to the true distribution of the data. In this review, we focus on three types of deep generative models for medical image augmentation: variational autoencoders, generative adversarial networks, and diffusion models. We provide an overview of the current state of the art in each of these models and discuss their potential for use in different downstream tasks in medical imaging, including classification, segmentation, and cross-modal translation. We also evaluate the strengths and limitations of each model and suggest directions for future research in this field. Our goal is to provide a comprehensive review about the use of deep generative models for medical image augmentation and to highlight the potential of these models for improving the performance of deep learning algorithms in medical image analysis.
We leveraged the largely untapped resource of electronic health record data to address critical clinical and epidemiological questions about Coronavirus Disease 2019 (COVID-19). To do this, we formed an international consortium (4CE) of 96 hospitals across five countries (www.covidclinical.net). Contributors utilized the Informatics for Integrating Biology and the Bedside (i2b2) or Observational Medical Outcomes Partnership (OMOP) platforms to map to a common data model. The group focused on temporal changes in key laboratory test values. Harmonized data were analyzed locally and converted to a shared aggregate form for rapid analysis and visualization of regional differences and global commonalities. Data covered 27,584 COVID-19 cases with 187,802 laboratory tests. Case counts and laboratory trajectories were concordant with existing literature. Laboratory tests at the time of diagnosis showed hospital-level differences equivalent to country-level variation across the consortium partners. Despite the limitations of decentralized data generation, we established a framework to capture the trajectory of COVID-19 disease in patients and their response to interventions.