Polytechnique Montréal

University · Montreal, Quebec, Canada

Research output, citation impact, and the most-cited recent papers from Polytechnique Montréal (Canada). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works: 38.5K
Citations: 2.0M
h-index: 392
i10-index: 34.6K
Also known as: Polytechnique Montréal, École Polytechnique de Montréal
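
For readers unfamiliar with the two index figures above, here is a minimal sketch of how they are conventionally defined: the h-index is the largest h such that h works each have at least h citations, and the i10-index is the number of works with at least 10 citations. The function names and the sample citation counts are illustrative assumptions; how this index actually computes its figures is not stated on the page.

```python
def h_index(citations):
    """Largest h such that h works have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of works with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical per-work citation counts, for illustration only.
sample_counts = [25, 12, 10, 4, 3, 0]
print(h_index(sample_counts), i10_index(sample_counts))
```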

Top-cited papers from Polytechnique Montréal

Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
Kyunghyun Cho, Bart van Merriënboer, Çağlar Gülçehre, Dzmitry Bahdanau +3 more
2014 · 24.3K citations · doi:10.3115/v1/d14-1179

Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014.

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
Junyoung Chung, Çağlar Gülçehre, Kyunghyun Cho, Yoshua Bengio
2014 · arXiv (Cornell University) · 10.8K citations · doi:10.48550/arxiv.1412.3555

In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). Especially, we focus on more sophisticated units that implement a gating mechanism, such as a long short-term memory (LSTM) unit and a recently proposed gated recurrent unit (GRU). We evaluate these recurrent units on the tasks of polyphonic music modeling and speech signal modeling. Our experiments revealed that these advanced recurrent units are indeed better than more traditional recurrent units such as tanh units. Also, we found GRU to be comparable to LSTM.

Convolutional networks for images, speech, and time series
Yann LeCun, Yoshua Bengio
1998 · HAL (Le Centre pour la Communication Scientifique Directe) · 4.4K citations · doi:10.5555/303568.303704

GWTC-1: A Gravitational-Wave Transient Catalog of Compact Binary Mergers Observed by LIGO and Virgo during the First and Second Observing Runs
B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham +4 more
2019 · Physical Review X · 3.6K citations · doi:10.1103/physrevx.9.031040

We present the results from three gravitational-wave searches for coalescing compact binaries with component masses above $1\,M_\odot$ during the first and second observing runs of the advanced gravitational-wave detector network. During the first observing run (O1), from September 12, 2015 to January 19, 2016, gravitational waves from three binary black hole mergers were detected. The second observing run (O2), which ran from November 30, 2016 to August 25, 2017, saw the first detection of gravitational waves from a binary neutron star inspiral, in addition to the observation of gravitational waves from a total of seven binary black hole mergers, four of which we report here for the first time: GW170729, GW170809, GW170818, and GW170823. For all significant gravitational-wave events, we provide estimates of the source properties.
The detected binary black holes have total masses between $18.6^{+3.2}_{-0.7}\,M_\odot$ and $84.4^{+15.8}_{-11.1}\,M_\odot$ and range in distance between $320^{+120}_{-110}$ and $2840^{+1400}_{-1360}$ Mpc. No neutron star–black hole mergers were detected. In addition to highly significant gravitational-wave events, we also provide a list of marginal event candidates with an estimated false-alarm rate less than 1 per 30 days.
From these results over the first two observing runs, which include approximately one gravitational-wave detection per 15 days of data searched, we infer merger rates at the 90% confidence intervals of 110–3840 $\mathrm{Gpc}^{-3}\,\mathrm{y}^{-1}$ for binary neutron stars and 9.7–101 $\mathrm{Gpc}^{-3}\,\mathrm{y}^{-1}$ for binary black holes assuming fixed population distributions and determine a neutron star–black hole merger rate 90% upper limit of 610 $\mathrm{Gpc}^{-3}\,\mathrm{y}^{-1}$. Published by the American Physical Society 2019

Accumulation of Deficits as a Proxy Measure of Aging
Arnold B. Mitnitski, Alexander Mogilner, Kenneth Rockwood
2001 · The Scientific World JOURNAL · 2.9K citations · doi:10.1100/tsw.2001.58

This paper develops a method for appraising health status in elderly people. A frailty index was defined as the proportion of accumulated deficits (symptoms, signs, functional impairments, and laboratory abnormalities). It serves as an individual state variable, reflecting severity of illness and proximity to death. In a representative database of elderly Canadians we found that deficits accumulated at 3% per year, and show a gamma distribution, typical for systems with redundant components that can be used in case of failure of a given subsystem. Of note, the slope of the index is insensitive to the individual nature of the deficits, and serves as an important prognostic factor for life expectancy. The formula for estimating an individual's life span given the frailty index value is presented. For different patterns of cognitive impairments the average within-group index value increases with the severity of the cognitive impairment, and the relative variability of the index is significantly reduced. Finally, the statistical distribution of the frailty index sharply differs between well groups (gamma distribution) and morbid groups (normal distribution). This pattern reflects an increase in uncompensated deficits in impaired organisms, which would lead to illness of various etiologies, and ultimately to increased mortality. The accumulation of deficits is an example of a macroscopic variable, i.e., one that reflects general properties of aging at the level of the whole organism rather than any given functional deficiency. In consequence, we propose that it may be used as a proxy measure of aging.
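
The abstract's definition of the frailty index as the proportion of accumulated deficits can be sketched in a few lines. This is only an illustration of that one definition, assuming each deficit is recorded as a simple present/absent flag; the function name and the 40-item example are hypothetical, and the paper's life-span formula is not reproduced here.

```python
def frailty_index(deficits):
    """Proportion of assessed deficits that are present.

    `deficits` is a sequence of booleans, one per assessed item
    (symptom, sign, functional impairment, laboratory abnormality).
    The boolean encoding is an assumption made for this sketch.
    """
    if not deficits:
        raise ValueError("at least one deficit item must be assessed")
    present = sum(1 for d in deficits if d)
    return present / len(deficits)


# Hypothetical individual: 6 of 40 assessed deficits are present.
example = [True] * 6 + [False] * 34
print(frailty_index(example))
```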

Online and off-line handwriting recognition: a comprehensive survey
Réjean Plamondon, Sargur N. Srihari
2000 · IEEE Transactions on Pattern Analysis and Machine Intelligence · 2.5K citations · doi:10.1109/34.824821

Handwriting has continued to persist as a means of communication and recording information in day-to-day life even with the introduction of new technologies. Given its ubiquity in human transactions, machine recognition of handwriting has practical significance, as in reading handwritten notes in a PDA, in postal addresses on envelopes, in amounts in bank checks, in handwritten fields in forms, etc. This overview describes the nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms. Both the online case (which pertains to the availability of trajectory data during writing) and the off-line case (which pertains to scanned images) are considered. Algorithms for preprocessing, character and word recognition, and performance with practical systems are indicated. Other fields of application, such as signature verification, writer authentication, and handwriting learning tools, are also considered.

FitNets: Hints for Thin Deep Nets
Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang +2 more
2014 · arXiv (Cornell University) · 2.0K citations · doi:10.48550/arxiv.1412.6550

While depth tends to improve network performances, it also makes gradient-based training more difficult since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network could imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden layer. This allows one to train deeper students that can generalize better or run faster, a trade-off that is controlled by the chosen student capacity. For example, on CIFAR-10, a deep student network with almost 10.4 times fewer parameters outperforms a larger, state-of-the-art teacher network.

GWTC-2: Compact Binary Coalescences Observed by LIGO and Virgo during the First Half of the Third Observing Run
R. Abbott, T. D. Abbott, S. Abraham, F. Acernese +4 more
2021 · Physical Review X · 2.0K citations · doi:10.1103/physrevx.11.021053

We report on gravitational-wave discoveries from compact binary coalescences detected by Advanced LIGO and Advanced Virgo in the first half of the third observing run (O3a) between 1 April 2019 15:00 UTC and 1 October 2019 15:00 UTC. By imposing a false-alarm-rate threshold of two per year in each of the four search pipelines that constitute our search, we present 39 candidate gravitational-wave events. At this threshold, we expect a contamination fraction of less than 10%. Of these, 26 candidate events were reported previously in near-real time through gamma-ray coordinates network notices and circulars; 13 are reported here for the first time. The catalog contains events whose sources are black hole binary mergers up to a redshift of approximately 0.8, as well as events whose components cannot be unambiguously identified as black holes or neutron stars. For the latter group, we are unable to determine the nature based on estimates of the component masses and spins from gravitational-wave data alone.
The range of candidate event masses which are unambiguously identified as binary black holes (both objects $\geq 3\,M_\odot$) is increased compared to GWTC-1, with total masses from approximately $14\,M_\odot$ for GW190924_021846 to approximately $150\,M_\odot$ for GW190521. For the first time, this catalog includes binary systems with significantly asymmetric mass ratios, which had not been observed in data taken before April 2019. We also find that 11 of the 39 events detected since April 2019 have positive effective inspiral spins under our default prior (at 90% credibility), while none exhibit negative effective inspiral spin. Given the increased sensitivity of Advanced LIGO and Advanced Virgo, the detection of 39 candidate events in approximately 26 weeks of data (approximately 1.5 per week) is consistent with GWTC-1. Published by the American Physical Society 2021

FactSage thermochemical software and databases, 2010–2016
C. W. Bale, Ève Bélisle, Patrice Chartrand, Sergei A. Decterov +4 more
2016 · Calphad · 1.9K citations · doi:10.1016/j.calphad.2016.05.002

The FactSage computer package consists of a series of information, calculation and manipulation modules that enable one to access and manipulate compound and solution databases. With the various modules running under Microsoft Windows® one can perform a wide variety of thermochemical calculations and generate tables, graphs and figures of interest to chemical and physical metallurgists, chemical engineers, corrosion engineers, inorganic chemists, geochemists, ceramists, electrochemists, environmentalists, etc. This paper presents a summary of the developments in the FactSage thermochemical software and databases during the last six years. Particular emphasis is placed on the new databases and developments in calculating and manipulating phase diagrams.

BinaryConnect: Training Deep Neural Networks with binary weights during propagations
Matthieu Courbariaux, Yoshua Bengio, Jean‐Pierre David
2015 · PolyPublie (École Polytechnique de Montréal) · 1.8K citations · doi:10.48550/arxiv.1511.00363

Deep Neural Networks (DNN) have achieved state-of-the-art results in a wide range of tasks, with the best results obtained with large training sets and large models. In the past, GPUs enabled these breakthroughs because of their greater computational speed. In the future, faster computation at both training and test time is likely to be crucial for further progress and for consumer applications on low-power devices. As a result, there is much interest in research and development of dedicated hardware for Deep Learning (DL). Binary weights, i.e., weights which are constrained to only two possible values (e.g. -1 or 1), would bring great benefits to specialized DL hardware by replacing many multiply-accumulate operations by simple accumulations, as multipliers are the most space and power-hungry components of the digital implementation of neural networks. We introduce BinaryConnect, a method which consists in training a DNN with binary weights during the forward and backward propagations, while retaining precision of the stored weights in which gradients are accumulated. Like other dropout schemes, we show that BinaryConnect acts as regularizer and we obtain near state-of-the-art results with BinaryConnect on the permutation-invariant MNIST, CIFAR-10 and SVHN.
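
The core idea in the abstract, propagating with binary weights while accumulating gradients in stored real-valued weights, can be illustrated with a minimal sketch for a single linear unit. Everything here is an assumption made for illustration: the sign-based binarization, the squared-error loss, the learning rate, and the clipping of stored weights to [-1, 1] are not taken from the abstract, and this is not the authors' implementation.

```python
def binarize(weights):
    """Deterministic binarization to -1 or +1 (sign function; 0 maps to +1)."""
    return [1.0 if w >= 0 else -1.0 for w in weights]


def train_step(real_weights, x, target, lr=0.1):
    """One update for a single linear unit with squared-error loss.

    The forward and backward passes use the binary weights; the
    resulting gradient is applied to the stored real-valued weights.
    """
    binary_weights = binarize(real_weights)
    output = sum(w * xi for w, xi in zip(binary_weights, x))
    error = output - target
    # Gradient of 0.5 * error**2 with respect to each binary weight.
    grads = [error * xi for xi in x]
    updated = [w - lr * g for w, g in zip(real_weights, grads)]
    # Assumed for this sketch: keep stored weights inside [-1, 1].
    clipped = [max(-1.0, min(1.0, w)) for w in updated]
    return clipped, 0.5 * error ** 2


weights, loss = train_step([0.3, -0.2], x=[1.0, 2.0], target=0.0)
print(weights, loss)
```

Keeping the real-valued copy matters because individual gradient steps are usually too small to flip a binary weight on their own; they only change the sign once enough of them have accumulated.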

The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation
Simon Jégou, Michal Drozdzal, David Vázquez, Adriana Romero +1 more
2017 · 1.7K citations · doi:10.1109/cvprw.2017.156

State-of-the-art approaches for semantic image segmentation are built on Convolutional Neural Networks (CNNs). The typical segmentation architecture is composed of (a) a downsampling path responsible for extracting coarse semantic features, followed by (b) an upsampling path trained to recover the input image resolution at the output of the model and, optionally, (c) a post-processing module (e.g. Conditional Random Fields) to refine the model predictions. Recently, a new CNN architecture, Densely Connected Convolutional Networks (DenseNets), has shown excellent results on image classification tasks. The idea of DenseNets is based on the observation that if each layer is directly connected to every other layer in a feed-forward fashion then the network will be more accurate and easier to train. In this paper, we extend DenseNets to deal with the problem of semantic segmentation. We achieve state-of-the-art results on urban scene benchmark datasets such as CamVid and Gatech, without any further post-processing module or pretraining. Moreover, due to smart construction of the model, our approach has far fewer parameters than currently published best entries for these datasets.

Camera calibration with distortion models and accuracy evaluation
Juyang Weng, Paul R. Cohen, M. Herniou
1992 · IEEE Transactions on Pattern Analysis and Machine Intelligence · 1.7K citations · doi:10.1109/34.159901

A camera model that accounts for major sources of camera distortion, namely, radial, decentering, and thin prism distortions is presented. The proposed calibration procedure consists of two steps: (1) the calibration parameters are estimated using a closed-form solution based on a distortion-free camera model; and (2) the parameters estimated in the first step are improved iteratively through a nonlinear optimization, taking into account camera distortions. According to minimum variance estimation, the objective function to be minimized is the mean-square discrepancy between the observed image points and their inferred image projections computed with the estimated calibration parameters. The authors introduce a type of measure that can be used to directly evaluate the performance of calibration and compare calibrations among different systems. The validity and performance of the calibration procedure are tested with both synthetic data and real images taken by tele- and wide-angle lenses.

Guided-wave and leakage characteristics of substrate integrated waveguide
Feng Xu, Ke Wu
2005 · IEEE Transactions on Microwave Theory and Techniques · 1.7K citations · doi:10.1109/tmtt.2004.839303

The substrate integrated waveguide (SIW) technique makes it possible to fabricate a complete circuit, including planar circuitry, transitions, and rectangular waveguides, in planar form using a standard printed circuit board or other planar processing techniques. In this paper, guided-wave and mode characteristics of such an SIW periodic structure are studied in detail for the first time. A numerical multimode calibration procedure is proposed and developed with a commercial software package on the basis of a full-wave finite-element method for the accurate extraction of complex propagation constants of the SIW structure. Two different lengths of the SIW are numerically simulated under multimode excitation. By means of our proposed technique, the complex propagation constant of each SIW mode can accurately be extracted and the electromagnetic bandstop phenomena of periodic structures are also investigated. Experiments are made to validate our proposed technique. Simple design rules are provided and discussed.

Integrated microstrip and rectangular waveguide in planar form
Dominic Deslandes, Ke Wu
2001 · IEEE Microwave and Wireless Components Letters · 1.7K citations · doi:10.1109/7260.914305

Usually transitions from microstrip line to rectangular waveguide are made with three-dimensional complex mounting structures. In this paper, a new planar platform is developed in which the microstrip line and rectangular waveguide are fully integrated on the same substrate, and they are interconnected via a simple taper. Our experiments at 28 GHz show that an effective bandwidth of 12% at 20 dB return loss is obtained with an in-band insertion loss better than 0.3 dB. The new transition allows a complete integration of waveguide components on substrate with MICs and MMICs.

Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle
Peter E. Caines, Minyi Huang, Roland P. Malhamé
2006 · Communications in Information and Systems · 1.6K citations · doi:10.4310/cis.2006.v6.n3.a5

We consider stochastic dynamic games in large population conditions where multiclass agents are weakly coupled via their individual dynamics and costs. We approach this large population game problem by the so-called Nash Certainty Equivalence (NCE) Principle which leads to a decentralized control synthesis. The McKean-Vlasov NCE method presented in this paper has a close connection with the statistical physics of large particle systems: both identify a consistency relationship between the individual agent (or particle) at the microscopic level and the mass of individuals (or particles) at the macroscopic level. The overall game is decomposed into (i) an optimal control problem whose Hamilton-Jacobi-Bellman (HJB) equation determines the optimal control for each individual and which involves a measure corresponding to the mass effect, and (ii) a family of McKean-Vlasov (M-V) equations which also depend upon this measure. We designate the NCE Principle as the property that the resulting scheme is consistent (or soluble), i.e. the prescribed control laws produce sample paths which produce the mass effect measure. By construction, the overall closed-loop behaviour is such that each agent's behaviour is optimal with respect to all other agents in the game theoretic Nash sense.

GW190425: Observation of a Compact Binary Coalescence with Total Mass $\sim 3.4\,M_\odot$
B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham +4 more
2020 · The Astrophysical Journal Letters · 1.6K citations · doi:10.3847/2041-8213/ab75f5

On 2019 April 25, the LIGO Livingston detector observed a compact binary coalescence with signal-to-noise ratio 12.9. The Virgo detector was also taking data that did not contribute to detection due to a low signal-to-noise ratio, but were used for subsequent parameter estimation. The 90% credible intervals for the component masses range from to ( – if we restrict the dimensionless component spin magnitudes to be smaller than 0.05). These mass parameters are consistent with the individual binary components being neutron stars. However, both the source-frame chirp mass and the total mass of this system are significantly larger than those of any other known binary neutron star (BNS) system. The possibility that one or both binary components of the system are black holes cannot be ruled out from gravitational-wave data. We discuss possible origins of the system based on its inconsistency with the known Galactic BNS population. Under the assumption that the signal was produced by a BNS coalescence, the local rate of neutron star mergers is updated to 250–2810 .

USEtox - The UNEP-SETAC toxicity model: recommended characterisation factors for human toxicity and freshwater ecotoxicity in life cycle impact assessment
Ralph K. Rosenbaum, Till M. Bachmann, Lois Swirsky Gold, Mark A. J. Huijbregts +4 more
2008 · Radboud Repository (Radboud University) · 1.5K citations · doi:10.1007/s11367-008-0038-4

Review of substrate-integrated waveguide circuits and antennas
Maurizio Bozzi, Apostolos Georgiadis, Ke Wu
2011 · IET Microwaves Antennas & Propagation · 1.4K citations · doi:10.1049/iet-map.2010.0463

Substrate-integrated waveguide (SIW) technology represents an emerging and very promising candidate for the development of circuits and components operating in the microwave and millimetre-wave region. SIW structures are generally fabricated by using two rows of conducting cylinders or slots embedded in a dielectric substrate that connects two parallel metal plates, and permit the implementation of classical rectangular waveguide components in planar form, along with printed circuitry, active devices and antennas. This study aims to provide an overview of the recent advances in the modelling, design and technological implementation of SIW structures and components.

NICE: Non-linear Independent Components Estimation
Laurent Dinh, David Krueger, Yoshua Bengio
2014 · arXiv (Cornell University) · 1.3K citations · doi:10.48550/arxiv.1410.8516

We propose a deep learning framework for modeling complex high-dimensional densities called Non-linear Independent Component Estimation (NICE). It is based on the idea that a good representation is one in which the data has a distribution that is easy to model. For this purpose, a non-linear deterministic transformation of the data is learned that maps it to a latent space so as to make the transformed data conform to a factorized distribution, i.e., resulting in independent latent variables. We parametrize this transformation so that computing the Jacobian determinant and inverse transform is trivial, yet we maintain the ability to learn complex non-linear transformations, via a composition of simple building blocks, each based on a deep neural network. The training criterion is simply the exact log-likelihood, which is tractable. Unbiased ancestral sampling is also easy. We show that this approach yields good generative models on four image datasets and can be used for inpainting.
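
The abstract's claim that the Jacobian determinant and the inverse are trivial to compute can be illustrated with an additive coupling layer, the kind of building block the NICE paper composes. The sketch below is an assumption-laden illustration rather than the authors' code: the function `m` is a hypothetical stand-in for a deep network, and inputs are assumed to have even length so they split into two equal halves.

```python
import math


def m(x1):
    """Hypothetical stand-in for the deep network inside a coupling layer."""
    return [math.tanh(2.0 * v) + 0.5 for v in x1]


def coupling_forward(x):
    """Additive coupling: y1 = x1, y2 = x2 + m(x1).

    The Jacobian is triangular with ones on the diagonal, so its
    determinant is 1 and the log-determinant is 0.
    """
    if len(x) % 2 != 0:
        raise ValueError("this sketch assumes an even-length input")
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    y2 = [b + s for b, s in zip(x2, m(x1))]
    return x1 + y2


def coupling_inverse(y):
    """Exact inverse: x1 = y1, x2 = y2 - m(y1)."""
    if len(y) % 2 != 0:
        raise ValueError("this sketch assumes an even-length input")
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    x2 = [b - s for b, s in zip(y2, m(y1))]
    return y1 + x2


x = [0.1, -0.4, 1.5, 2.0]
y = coupling_forward(x)
print(coupling_inverse(y))
```

The inverse never has to invert `m` itself, which is why `m` can be an arbitrarily complex network. A single layer leaves the first half of the input unchanged, so in practice several layers would be stacked with the roles of the halves swapped between them.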

Machine learning for combinatorial optimization: A methodological tour d'horizon
Yoshua Bengio, Andrea Lodi, Antoine Prouvost
2021 · Archivio istituzionale della ricerca (Alma Mater Studiorum Università di Bologna) · 1.3K citations · doi:10.1016/j.ejor.2020.07.063

This paper surveys the recent attempts, both from the machine learning and operations research communities, at leveraging machine learning to solve combinatorial optimization problems. Given the hard nature of these problems, state-of-the-art algorithms rely on handcrafted heuristics for making decisions that are otherwise too expensive to compute or mathematically not well defined. Thus, machine learning looks like a natural candidate to make such decisions in a more principled and optimized way. We advocate for pushing further the integration of machine learning and combinatorial optimization and detail a methodology to do so. A main point of the paper is seeing generic optimization problems as data points and inquiring what is the relevant distribution of problems to use for learning on a given task.