Mathématiques et Informatique pour la Complexité et les Systèmes
Facility: Gif-sur-Yvette, France
Research output, citation impact, and the most-cited recent papers from Mathématiques et Informatique pour la Complexité et les Systèmes. Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Mathématiques et Informatique pour la Complexité et les Systèmes
The mechanisms of action of and resistance to trastuzumab deruxtecan (T-DXd), an anti-HER2-drug conjugate for breast cancer treatment, remain unclear. The phase 2 DAISY trial evaluated the efficacy of T-DXd in patients with HER2-overexpressing (n = 72, cohort 1), HER2-low (n = 74, cohort 2) and HER2 non-expressing (n = 40, cohort 3) metastatic breast cancer. In the full analysis set population (n = 177), the confirmed objective response rate (primary endpoint) was 70.6% (95% confidence interval (CI) 58.3-81) in cohort 1, 37.5% (95% CI 26.4-49.7) in cohort 2 and 29.7% (95% CI 15.9-47) in cohort 3. The primary endpoint was met in cohorts 1 and 2. Secondary endpoints included safety. No new safety signals were observed. During treatment, HER2-expressing tumors (n = 4) presented strong T-DXd staining. Conversely, HER2 immunohistochemistry 0 samples (n = 3) presented little or no T-DXd staining (Pearson correlation coefficient r = 0.75, P = 0.053). Among patients with HER2 immunohistochemistry 0 metastatic breast cancer, 5 of 14 (35.7%, 95% CI 12.8-64.9) with ERBB2 expression below the median presented a confirmed objective response as compared to 3 of 10 (30%, 95% CI 6.7-65.2) with ERBB2 expression above the median. Although HER2 expression is a determinant of T-DXd efficacy, our study suggests that additional mechanisms may also be involved. (ClinicalTrials.gov identifier: NCT04132960.)
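The response rates above are reported with 95% confidence intervals on small counts. The abstract does not say which interval method was used, so the sketch below assumes the exact (Clopper-Pearson) binomial interval, computed by bisection with the standard library only. The function names are illustrative and do not come from the paper.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target: float) -> float:
    """Bisection on [0, 1] for an increasing function g, solving g(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided interval for k responders out of n patients (assumed method)."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper


# The two subgroup counts quoted in the abstract.
for k, n in [(5, 14), (3, 10)]:
    lo, hi = clopper_pearson(k, n)
    print(f"{k}/{n}: {k / n:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

With so few patients per subgroup the intervals are wide and overlap heavily, which is why the abstract treats the below-median and above-median response rates as comparable.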
Deep neural networks demonstrated their ability to provide remarkable performances on a wide range of supervised learning tasks (e.g., image classification) when trained on extensive collections of labeled data (e.g., ImageNet). However, creating such large datasets requires a considerable amount of resources, time, and effort. Such resources may not be available in many practical cases, limiting the adoption and the application of many deep learning methods. In a search for more data-efficient deep learning methods to overcome the need for large annotated datasets, there is a rising research interest in semi-supervised learning and its applications to deep neural networks to reduce the amount of labeled data required, by either developing novel methods or adopting existing semi-supervised learning frameworks for a deep learning setting. In this paper, we provide a comprehensive overview of deep semi-supervised learning, starting with an introduction to the field, followed by a summarization of the dominant semi-supervised approaches in deep learning.
The FEAST method for solving large sparse eigenproblems is equivalent to subspace iteration with an approximate spectral projector and implicit orthogonalization. This relation allows one to characterize the convergence of the method in terms of the error of a certain rational approximant to an indicator function. We propose improved rational approximants leading to FEAST variants with faster convergence, in particular when using rational approximants based on the work of Zolotarev. Numerical experiments demonstrate the possible computational savings, especially for pencils whose eigenvalues are not well separated and when the dimension of the search space is only slightly larger than the number of wanted eigenvalues. The new approach improves both convergence robustness and load balancing when FEAST runs on multiple search intervals in parallel.
The availability of patient cohorts with several types of omics data opens new perspectives for exploring the disease's underlying biological processes and developing predictive models. It also comes with new challenges in computational biology in terms of integrating high-dimensional and heterogeneous data in a fashion that captures the interrelationships between multiple genes and their functions. Deep learning methods offer promising perspectives for integrating multi-omics data. In this paper, we review the existing integration strategies based on autoencoders and propose a new customizable one whose principle relies on a two-phase approach. In the first phase, we adapt the training to each data source independently before learning cross-modality interactions in the second phase. By taking into account each source's singularity, we show that this approach succeeds at taking advantage of all the sources more efficiently than other strategies. Moreover, by adapting our architecture to the computation of Shapley additive explanations, our model can provide interpretable results in a multi-source setting. Using multiple omics sources from different TCGA cohorts, we demonstrate the performance of the proposed method for cancer on test cases for several tasks, such as the classification of tumor types and breast cancer subtypes, as well as survival outcome prediction. We show through our experiments the strong performance of our architecture on seven different datasets of various sizes and provide some interpretations of the results obtained. Our code is available at https://github.com/HakimBenkirane/CustOmics.
Recent deep generative models are able to provide photo-realistic images as well as visual or textual content embeddings useful to address various tasks of computer vision and natural language processing. Their usefulness is nevertheless often limited by the lack of control over the generative process or the poor understanding of the learned representation. To overcome these major issues, very recent work has shown the interest of studying the semantics of the latent space of generative models. In this paper, we propose to advance on the interpretability of the latent space of generative models by introducing a new method to find meaningful directions in the latent space of any generative model along which we can move to control precisely specific properties of the generated image like the position or scale of the object in the image. Our method does not require human annotations and is particularly well suited for the search of directions encoding simple transformations of the generated image, such as translation, zoom or color variations. We demonstrate the effectiveness of our method qualitatively and quantitatively, both for GANs and variational auto-encoders.
A limit order book is essentially a file on a computer that contains all orders sent to the market, along with their characteristics such as the sign of the order, price, quantity and a timestamp. The majority of organized electronic markets rely on limit order books to store the list of interests of market participants on their central computer. A limit order book contains all the information available on a specific market and it reflects the way the market moves under the influence of its participants. This book discusses several models of limit order books. It begins by discussing the data to assess their empirical properties, and then moves on to mathematical models in order to reproduce the observed properties. Finally, the book presents a framework for numerical simulations. It also covers important modelling techniques including agent-based modelling, and advanced modelling of limit order books based on Hawkes processes. The book also provides in-depth coverage of simulation techniques and introduces general, flexible, open source library concepts useful to readers studying trading strategies in order-driven markets.
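The description above (orders carrying a sign, price, quantity and timestamp, stored in a central book) can be made concrete with a small data-structure sketch. It is not taken from the book or from its open source library: the class names and the price-time priority matching rule are assumptions chosen for illustration.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Order:
    side: str        # "buy" or "sell" (the sign of the order)
    price: float
    quantity: int
    timestamp: int   # arrival rank, used to break ties at equal price


class LimitOrderBook:
    """Toy book with price-time priority (an assumed matching rule)."""

    def __init__(self):
        self._bids = []  # heap of (-price, timestamp, order): highest bid first
        self._asks = []  # heap of (price, timestamp, order): lowest ask first
        self._clock = count()

    def best_bid(self):
        return self._bids[0][2].price if self._bids else None

    def best_ask(self):
        return self._asks[0][2].price if self._asks else None

    def submit(self, side: str, price: float, quantity: int):
        """Match against the opposite side, then rest any unfilled remainder."""
        order = Order(side, price, quantity, next(self._clock))
        trades = self._match(order)
        if order.quantity > 0:
            book = self._bids if side == "buy" else self._asks
            key = -price if side == "buy" else price
            heapq.heappush(book, (key, order.timestamp, order))
        return trades

    def _match(self, incoming: Order):
        trades = []
        opposite = self._asks if incoming.side == "buy" else self._bids
        while incoming.quantity > 0 and opposite:
            best = opposite[0][2]
            if incoming.side == "buy":
                crosses = incoming.price >= best.price
            else:
                crosses = incoming.price <= best.price
            if not crosses:
                break
            traded = min(incoming.quantity, best.quantity)
            trades.append((best.price, traded))  # trade at the resting price
            incoming.quantity -= traded
            best.quantity -= traded
            if best.quantity == 0:
                heapq.heappop(opposite)
        return trades


book = LimitOrderBook()
book.submit("sell", 101.0, 5)
book.submit("sell", 100.5, 3)
book.submit("buy", 100.0, 4)
print(book.submit("buy", 101.0, 6))      # trades as (price, quantity) pairs
print(book.best_bid(), book.best_ask())  # remaining best quotes
```

Two heaps keep the best bid and best ask at the top, so reading the current quotes is constant-time and each insertion or removal is logarithmic in the number of resting orders.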
Active learning increases the effectiveness of labeling when only subsets of unlabeled datasets can be processed manually. To our knowledge, existing algorithms are designed under the assumption that datasets are balanced. However, many real-life datasets are actually imbalanced and we propose two adaptations of active learning to tackle imbalance. First, we modify acquisition functions to select samples by taking advantage of a deep model pretrained on a source domain. Second, we introduce a balancing step in the acquisition process to reduce the imbalance of the labeled subset. Evaluation is done with four imbalanced datasets using existing active learning methods and their modifications introduced here. Results show that our adaptations are useful as long as knowledge from the source domain is transferable to target domains.
In this study, we propose a 3D deep neural network called U-ReSNet, a joint framework that can accurately register and segment medical volumes. The proposed network learns to automatically generate linear and elastic deformation models, trained by minimizing the mean square error and the local cross correlation similarity metrics. In parallel, a coupled architecture is integrated, seeking to provide segmentation maps for anatomies or tissue patterns using an additional decoder part trained with the dice coefficient metric. U-ReSNet is trained in an end-to-end fashion, and owing to this joint optimization the generated network features are more informative, leading to promising results compared to other deep learning-based methods in the literature. We evaluated the proposed architecture using the publicly available OASIS 3 dataset, measuring the dice coefficient metric for both registration and segmentation tasks. Our promising results indicate the potential of our method, which is composed of a convolutional architecture that is extremely simple and light in terms of parameters.
BACKGROUND: Bariatric surgery is an effective therapeutic procedure for morbidly obese patients. The 2 most common interventions are sleeve gastrectomy (SG) and laparoscopic Roux-en-Y gastric bypass (LRYGB). OBJECTIVES: The aim of this study was to compare the long-term microbiome after SG and LRYGB surgery in obese patients. SETTING: University Hospital, France; University Hospital, United States; and University Hospital, Switzerland. METHODS: Eighty-nine and 108 patients who underwent SG and LRYGB, respectively, were recruited. Stools were collected before and 6 months after surgery. Microbial DNA was analyzed with shotgun metagenomic sequencing (SOLiD 5500 xl Wildfire). MSPminer, a novel tool to characterize new in silico biological entities, was used to identify 715 Metagenomic Species Pan-genomes. One hundred forty-eight functional modules were analyzed using GOmixer and the KEGG database. RESULTS: Both interventions resulted in a similar increase of Shannon's diversity index and gene richness of gut microbiota, in parallel with weight loss, but the changes of microbial composition were different. LRYGB led to higher relative abundance of aero-tolerant bacteria, such as Escherichia coli, and buccal species, such as Streptococcus and Veillonella spp. In contrast, anaerobes, such as Clostridium, were more abundant after SG, suggesting better conservation of anaerobic conditions in the gut. Enrichment of Akkermansia muciniphila was also observed after both surgeries. Function-level changes included higher potential for bacterial use of supplements, such as vitamin B12, B1, and iron upon LRYGB. CONCLUSION: Microbiota changes after bariatric surgery depend on the nature of the intervention. LRYGB induces greater taxonomic and functional changes in gut microbiota than SG. Possible long-term health consequences of these alterations remain to be established.
Image registration and segmentation are the two most studied problems in medical image analysis. Deep learning algorithms have recently gained a lot of attention due to their success and state of the art results in varieties of problems and communities. In this paper, we propose a novel, efficient, and multi-task algorithm that addresses the problems of image registration and brain tumor segmentation jointly. Our method exploits the dependencies between these tasks through a natural coupling of their interdependencies during inference. In particular, constraints in correspondences are relaxed within the registration objective function in the regions of tumors, that are automatically recovered resulting in tumor volume preservation. We evaluated the performance of our formulation both quantitatively and qualitatively for registration and segmentation problems on two publicly available datasets (BraTS 2018 and OASIS 3), reporting competitive results with other recent state of the art methods.
Deep learning belongs to the broader family of machine learning methods and currently provides state-of-the-art performance in a variety of fields, including medical applications. Deep learning architectures can be categorized into different groups depending on their components. However, most of them share similar modules and mathematical formulations. In this chapter, the basic concepts of deep learning will be presented to provide a better understanding of these powerful and broadly used algorithms. The analysis is structured around the main components of deep learning architectures, focusing on convolutional neural networks and autoencoders.
Background: Longitudinal follow-up of interstitial lung diseases (ILDs) at CT mainly relies on the evaluation of the extent of ILD, without accounting for lung shrinkage. Purpose: To develop a deep learning–based method to depict worsening of ILD based on lung shrinkage detection from elastic registration of chest CT scans in patients with systemic sclerosis (SSc). Materials and Methods: Patients with SSc evaluated between January 2009 and October 2017 who had undergone at least two unenhanced supine CT scans of the chest and pulmonary function tests (PFTs) performed within 3 months were retrospectively included. Morphologic changes on CT scans were visually assessed by two observers and categorized as showing improvement, stability, or worsening of ILD. Elastic registration between baseline and follow-up CT images was performed to obtain deformation maps of the whole lung. Jacobian determinants calculated from the deformation maps were given as input to a deep learning–based classifier to depict morphologic and functional worsening. For this purpose, the set was randomly split into training, validation, and test sets. Correlations between mean Jacobian values and changes in PFT measurements were evaluated with the Spearman correlation. Results: A total of 212 patients (median age, 53 years; interquartile range, 45–62 years; 177 women) were included as follows: 138 for the training set (65%), 34 for the validation set (16%), and 40 for the test set (21%). Jacobian maps demonstrated lung parenchyma shrinkage of the posterior lung bases in patients found to have worsened ILD at visual assessment. The classifier detected morphologic and functional worsening with an accuracy of 80% (32 of 40 patients; 95% confidence interval [CI]: 64%, 91%) and 83% (33 of 40 patients; 95% CI: 67%, 93%), respectively. Jacobian values correlated with changes in forced vital capacity (R = −0.38; 95% CI: −0.25, −0.49; P < .001) and diffusing capacity for carbon monoxide (R = −0.42; 95% CI: −0.27, −0.54; P < .001). Conclusion: Elastic registration of CT scans combined with a deep learning classifier aided in the diagnosis of morphologic and functional worsening of interstitial lung disease in patients with systemic sclerosis. © RSNA, 2020. Online supplemental material is available for this article. See also the editorial by Verschakelen in this issue.
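The method above feeds Jacobian determinants of a deformation map to a classifier. As a rough illustration of what that quantity measures, the sketch below estimates the determinant of a 2D mapping by central finite differences. The study works on 3D CT deformation fields. The function name, the 2D setting and the example mappings are assumptions made for brevity, and the abstract does not state in which direction (baseline to follow-up or the reverse) the deformation is defined.

```python
def jacobian_determinant(phi, x: float, y: float, h: float = 1e-5) -> float:
    """Determinant of the Jacobian of phi: (x, y) -> (x', y') at one point.

    A value below 1 means the mapping locally contracts area, a value above 1
    means it locally expands area, and exactly 1 means area is preserved.
    """
    x_px, y_px = phi(x + h, y)
    x_mx, y_mx = phi(x - h, y)
    x_py, y_py = phi(x, y + h)
    x_my, y_my = phi(x, y - h)
    dxp_dx = (x_px - x_mx) / (2 * h)
    dyp_dx = (y_px - y_mx) / (2 * h)
    dxp_dy = (x_py - x_my) / (2 * h)
    dyp_dy = (y_py - y_my) / (2 * h)
    return dxp_dx * dyp_dy - dxp_dy * dyp_dx


def identity(x, y):
    return (x, y)


def shrink(x, y):
    # Hypothetical deformation: 10% contraction along x, 20% along y.
    return (0.9 * x, 0.8 * y)


print(round(jacobian_determinant(identity, 1.0, 2.0), 6))  # area preserved
print(round(jacobian_determinant(shrink, 1.0, 2.0), 6))    # 0.9 * 0.8 of the area remains
```

Evaluating this quantity at every voxel of a registered scan pair yields the kind of map in which contracting regions stand out, such as the posterior lung bases mentioned in the results.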
Given a point $p$ and a set of points $S$, the kNN operation finds the $k$ closest points to $p$ in $S$. It is a computationally intensive task with a large range of applications such as knowledge discovery or data mining. However, as the volume and the dimension of data increase, only distributed approaches can perform such a costly operation in a reasonable time. Recent works have focused on implementing efficient solutions using the MapReduce programming model because it is suitable for distributed large-scale data processing. Although these works provide different solutions to the same problem, each one has particular constraints and properties. In this paper, we compare the different existing approaches for computing kNN on MapReduce, first theoretically, and then by performing an extensive experimental evaluation. To be able to compare solutions, we identify three generic steps for kNN computation on MapReduce: data pre-processing, data partitioning, and computation. We then analyze each step from load balancing, accuracy, and complexity aspects. Experiments in this paper use a variety of datasets, and analyze the impact of data volume, data dimension, and the value of k from many perspectives like time and space complexity, and accuracy. The experimental part brings new advantages and shortcomings that are discussed for each algorithm. To the best of our knowledge, this is the first paper that compares kNN computing methods on MapReduce both theoretically and experimentally with the same setting. Overall, this paper can be used as a guide to tackle kNN-based practical problems in the context of big data.
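The partition, compute and merge structure described above can be sketched in a few lines. This is a single-process imitation of the map and reduce phases, not any of the surveyed algorithms: the round-robin partitioning, the function names and the absence of a pre-processing step are simplifying assumptions.

```python
import heapq
from math import dist


def partition(points, n_parts):
    """Data partitioning step: round-robin split (an assumed, naive strategy)."""
    return [points[i::n_parts] for i in range(n_parts)]


def map_local_knn(p, chunk, k):
    """Map phase: each partition returns its own k nearest candidates to p."""
    return heapq.nsmallest(k, ((dist(p, s), s) for s in chunk))


def reduce_global_knn(candidates, k):
    """Reduce phase: merge the per-partition candidates and keep the k best."""
    merged = [pair for part in candidates for pair in part]
    return [s for _, s in heapq.nsmallest(k, merged)]


def knn(p, S, k, n_parts=4):
    chunks = partition(S, n_parts)
    candidates = [map_local_knn(p, chunk, k) for chunk in chunks]
    return reduce_global_knn(candidates, k)


S = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0), (0.5, 0.0), (9.0, 9.0)]
print(knn((0.0, 0.0), S, k=3, n_parts=2))
```

Because every partition forwards its own k best candidates, the merged result is exact: any true neighbour is among the k best of whichever partition holds it. The price is that up to k times the number of partitions candidates reach the reducer, which is the kind of communication and load-balancing trade-off the paper analyzes.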
Segmentation and accurate localization of nuclei in histopathological images is a very challenging problem, with most existing approaches adopting a supervised strategy. These methods usually rely on manual annotations that require a lot of time and effort from medical experts. In this study, we present a self-supervised approach for segmentation of nuclei for whole slide histopathology images. Our method works on the assumption that the size and texture of nuclei can determine the magnification at which a patch is extracted. We show that the identification of the magnification level for tiles can generate a preliminary self-supervision signal to locate nuclei. We further show that by appropriately constraining our model it is possible to retrieve meaningful segmentation maps as an auxiliary output to the primary magnification identification task. Our experiments show that with standard post-processing, our method can outperform other unsupervised nuclei segmentation approaches and report similar performance with supervised ones on the publicly available MoNuSeg dataset. Our code and models are available online to facilitate further research.
PURPOSE: The goal of this data challenge was to create a structured dynamic with the following objectives: (1) teach radiologists the new rules of the General Data Protection Regulation (GDPR), while building a large multicentric prospective database of ultrasound, computed tomography (CT) and MRI patient images; (2) build a network including radiologists, researchers, start-ups, large companies, and students from engineering schools; and (3) provide all French stakeholders working together during 5 data challenges with a secured framework, offering a realistic picture of the benefits and concerns in October 2018. MATERIALS AND METHODS: Relevant clinical questions were chosen by the Société Française de Radiologie. The challenge was designed to respect all French ethical and data protection constraints. Multidisciplinary teams with at least one radiologist, one engineering student, and a company and/or research lab were gathered using different networks, and clinical databases were created accordingly. RESULTS: Five challenges were launched: detection of meniscal tears on MRI, segmentation of renal cortex on CT, detection and characterization of liver lesions on ultrasound, detection of breast lesions on MRI, and characterization of thyroid cartilage lesions on CT. A total of 5,170 images were provided for the challenge within 4 months by 46 radiology services. Twenty-six multidisciplinary teams with 181 contestants worked for one month on the challenges. Three challenges (meniscal tears, renal cortex, and liver lesions) resulted in an accuracy >90%. The fourth challenge (breast) reached 82% and the last one (thyroid) 70%. CONCLUSION: These five challenges were able to gather a large community of radiologists, engineers, researchers, and companies in a very short period of time. The accurate results of three of the five modalities suggest that artificial intelligence is a promising tool in these radiology modalities.
Volume electron microscopy is an important imaging modality in contemporary cell biology. Identification of intracellular structures is a laborious process limiting the effective use of this potentially powerful tool. We resolved this bottleneck with automated segmentation of intracellular substructures in electron microscopy (ASEM), a new pipeline to train a convolutional neural network to detect structures of a wide range in size and complexity. We obtained dedicated models for each structure based on a small number of sparsely annotated ground truth images from only one or two cells. Model generalization was improved with a rapid, computationally effective strategy to refine a trained model by including a few additional annotations. We identified mitochondria, Golgi apparatus, endoplasmic reticulum, nuclear pore complexes, caveolae, clathrin-coated pits, and vesicles imaged by focused ion beam scanning electron microscopy. We uncovered a wide range of membrane-nuclear pore diameters within a single cell and derived morphological metrics from clathrin-coated pits and vesicles, consistent with the classical constant-growth assembly model.
Background: Computed tomography angiography (CTA) is one of the most commonly used imaging techniques for the management of vascular diseases. Here, we aimed to develop a hybrid method combining a feature-based expert system with a supervised deep learning (DL) algorithm to enable a fully automatic segmentation of the abdominal vascular tree. Methods: We proposed an algorithm based on the hybridization of a data-driven convolutional neural network and a knowledge-based model dedicated to vascular system segmentation. Using two distinct CTA datasets from patients to assess independence from the training dataset, the accuracy of the hybrid method for lumen and thrombus segmentation was compared to that of the feature-based expert system alone and to the ground truth provided by a human expert. Results: The hybrid approach demonstrated a better accuracy for lumen segmentation compared to the expert system alone (volume similarity: 0.8128 vs. 0.7912, p = 0.0006 and Dice similarity coefficient: 0.8266 vs. 0.7942, p < 0.0001). The accuracy for thrombus segmentation was also enhanced using the hybrid approach (volume similarity: 0.9404 vs. 0.9185, p = 0.0027 and Dice similarity coefficient: 0.8918 vs. 0.8654, p < 0.0001). Conclusions: By enabling a robust and fully automatic segmentation, the method could be used to develop real-time decision support to help in the management of vascular diseases.
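The two metrics quoted in the results can be illustrated on toy voxel sets. The abstract does not give formulas, so the sketch assumes the common definitions: Dice = 2|A ∩ B| / (|A| + |B|) and volume similarity = 1 − ||A| − |B|| / (|A| + |B|). The voxel coordinates are invented for the example.

```python
def dice(a, b) -> float:
    """Dice similarity coefficient between two sets of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty masks are treated as a perfect match
    return 2 * len(a & b) / (len(a) + len(b))


def volume_similarity(a, b) -> float:
    """Compares only the sizes of the two masks, ignoring where they lie."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 1 - abs(len(a) - len(b)) / (len(a) + len(b))


# Invented example: 4 predicted voxels, 5 reference voxels, 3 in common.
predicted = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)}
reference = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1), (2, 2, 2)}

print(f"Dice: {dice(predicted, reference):.4f}")
print(f"Volume similarity: {volume_similarity(predicted, reference):.4f}")
```

Volume similarity ignores overlap, so under these definitions it can never be lower than Dice for the same pair of masks. Reporting both, as the abstract does, separates errors in segmented volume from errors in location.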
Convergence of classical parallel iterations is detected by performing a reduction operation at each iteration in order to compute a residual error relative to a potential solution vector. To efficiently run asynchronous iterations, blocking communication requests are avoided, which makes it hard to isolate and handle any global vector. While some termination protocols were proposed for asynchronous iterations, only very few of them are based on global residual computation and guarantee effective convergence. But the most effective and efficient existing solutions feature two reduction operations, which constitutes an important factor of termination delay. In this paper, we present new, non-intrusive, protocols to compute a residual error under asynchronous iterations, requiring only one reduction operation. Various communication models show that some heuristics can even be introduced and formally evaluated. Extensive experiments with up to 5,600 processor cores confirm the practical effectiveness and efficiency of our approach.
Objectives: Lumacaftor-ivacaftor is a cystic fibrosis transmembrane conductance regulator (CFTR) modulator known to improve clinical status in people with cystic fibrosis (CF). This study aimed to assess lung structural changes after one year of lumacaftor-ivacaftor treatment, and to use unsupervised machine learning to identify morphological phenotypes of lung disease that are associated with response to lumacaftor-ivacaftor. Methods: Adolescents and adults with CF from the French multicenter real-world prospective observational study evaluating the first year of treatment with lumacaftor-ivacaftor were included if they had pretherapeutic and follow-up chest computed tomography (CT) scans available. CT scans were visually scored using a modified Bhalla score. k-means clustering was performed based on 120 radiomics features extracted from unenhanced pretherapeutic chest CT scans. Results: A total of 283 patients were included. The Bhalla score significantly decreased after 1 year of lumacaftor-ivacaftor (−1.40±1.53 points compared with pretherapeutic CT; p<0.001). This finding was related to a significant decrease in mucus plugging (−0.35±0.62 points; p<0.001), bronchial wall thickening (−0.24±0.52 points; p<0.001) and parenchymal consolidations (−0.23±0.51 points; p<0.001). Cluster analysis identified 3 morphological clusters. Patients from cluster C were more likely to experience an increase in percent predicted forced expiratory volume in 1 sec (ppFEV1) ≥5 under lumacaftor-ivacaftor than those in the other clusters (54% of responders versus 32% and 33%; p=0.01). Conclusion: One year of treatment with lumacaftor-ivacaftor was associated with a significant visual improvement of bronchial disease on chest CT. Radiomics features on pretherapeutic CT scans may help in predicting lung function response under lumacaftor-ivacaftor.