Courant Institute of Mathematical Sciences
University, New York, New York, United States
Research output, citation impact, and the most-cited recent papers from Courant Institute of Mathematical Sciences (United States). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Courant Institute of Mathematical Sciences
Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a Structural Similarity Index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions. In $L^2(\mathbb{R})$, a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.
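The pyramidal algorithm mentioned in the abstract above can be illustrated with the simplest pair of quadrature mirror filters, the Haar filters. The sketch below is an assumption chosen for brevity, not the filters used in the paper; the function names are hypothetical, and the signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import math

def haar_step(signal):
    """One pyramid level: split into a coarser approximation and a detail signal."""
    s = 1.0 / math.sqrt(2.0)
    # Low-pass (sum) and high-pass (difference) filtering, each followed by downsampling by 2.
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail

def haar_pyramid(signal, levels):
    """Repeat haar_step on the approximation; details are stored finest scale first."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_pyramid by walking the details from coarsest to finest."""
    s = 1.0 / math.sqrt(2.0)
    current = list(approx)
    for detail in reversed(details):
        finer = []
        for a, d in zip(current, detail):
            finer.extend([s * (a + d), s * (a - d)])
        current = finer
    return current
```

Because the Haar basis is orthonormal, the coefficients carry the same energy as the input, and `haar_reconstruct` recovers the signal up to rounding error.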
The authors introduce an algorithm, called matching pursuit, that decomposes any signal into a linear expansion of waveforms that are selected from a redundant dictionary of functions. These waveforms are chosen in order to best match the signal structures. Matching pursuits are general procedures to compute adaptive signal representations. With a dictionary of Gabor functions a matching pursuit defines an adaptive time-frequency transform. They derive a signal energy distribution in the time-frequency plane, which does not include interference terms, unlike Wigner and Cohen class distributions. A matching pursuit isolates the signal structures that are coherent with respect to a given dictionary. An application to pattern extraction from noisy signals is described. They compare a matching pursuit decomposition with a signal expansion over an optimized wavepacket orthonormal basis, selected with the algorithm of Coifman and Wickerhauser (see IEEE Trans. Informat. Theory, vol. 38, Mar. 1992).
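The greedy selection described in the abstract above can be sketched in a few lines. This is a minimal illustration that assumes a finite dictionary of unit-norm atoms stored as plain lists; the function name, the fixed iteration count, and the stopping tolerance are hypothetical choices, not details taken from the paper.

```python
import math

def matching_pursuit(signal, dictionary, n_iter, tol=1e-12):
    """Greedily expand `signal` over unit-norm atoms of a redundant `dictionary`.

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    """
    residual = list(signal)
    expansion = []
    for _ in range(n_iter):
        # Inner product of the current residual with every atom.
        products = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        # Pick the atom that best matches the residual.
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        coeff = products[best]
        if abs(coeff) < tol:
            break  # nothing left that any atom can explain
        expansion.append((best, coeff))
        # Remove the selected component from the residual.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return expansion, residual

# A redundant dictionary in the plane: two axis vectors plus a diagonal atom.
inv_sqrt2 = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0], [0.0, 1.0], [inv_sqrt2, inv_sqrt2]]
expansion, residual = matching_pursuit([2.0, 1.0], atoms, n_iter=5)
```

With unit-norm atoms, each step removes an orthogonal projection, so the signal energy equals the sum of squared coefficients plus the residual energy.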
The structural similarity image quality paradigm is based on the assumption that the human visual system is highly adapted for extracting structural information from the scene, and therefore a measure of structural similarity can provide a good approximation to perceived image quality. This paper proposes a multiscale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions. We develop an image synthesis method to calibrate the parameters that define the relative importance of different scales. Experimental comparisons demonstrate the effectiveness of the proposed method.
A finite-difference method for solving the time-dependent Navier–Stokes equations for an incompressible fluid is introduced. This method uses the primitive variables, i.e. the velocities and the pressure, and is equally applicable to problems in two and three space dimensions. Test problems are solved, and an application to a three-dimensional convection problem is presented.
Dimensionality reduction involves mapping a set of high dimensional input points onto a low dimensional manifold so that "similar" points in input space are mapped to nearby points on the manifold. We present a method - called Dimensionality Reduction by Learning an Invariant Mapping (DrLIM) - for learning a globally coherent nonlinear function that maps the data evenly to the output manifold. The learning relies solely on neighborhood relationships and does not require any distance measure in the input space. The method can learn mappings that are invariant to certain transformations of the inputs, as is demonstrated with a number of experiments. Comparisons are made to other techniques, in particular LLE.
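Learning from neighborhood relationships alone, as in the abstract above, is commonly expressed as a pairwise loss on the mapped outputs. The sketch below uses a margin-based contrastive form: similar pairs are pulled together and dissimilar pairs are pushed apart until they are at least a margin away. The abstract does not state the loss, so the exact formula, the function name, and the default margin are assumptions for illustration.

```python
import math

def contrastive_loss(out_a, out_b, dissimilar, margin=1.0):
    """Pairwise loss on two mapped points (hypothetical helper).

    out_a, out_b: outputs of the learned mapping for the two inputs.
    dissimilar:   False if the inputs are neighbors, True otherwise.
    """
    # Euclidean distance in the output space; no input-space metric is needed.
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(out_a, out_b)))
    if dissimilar:
        # Only penalize dissimilar pairs that sit closer than the margin.
        return 0.5 * max(0.0, margin - dist) ** 2
    # Neighbors are penalized by their squared distance.
    return 0.5 * dist ** 2
```

The margin keeps the mapping from collapsing every point to one location, since dissimilar pairs contribute a penalty whenever they come too close.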
This paper is concerned with the mathematical structure of the immersed boundary (IB) method, which is intended for the computer simulation of fluid–structure interaction, especially in biological fluid dynamics. The IB formulation of such problems, derived here from the principle of least action, involves both Eulerian and Lagrangian variables, linked by the Dirac delta function. Spatial discretization of the IB equations is based on a fixed Cartesian mesh for the Eulerian variables, and a moving curvilinear mesh for the Lagrangian variables. The two types of variables are linked by interaction equations that involve a smoothed approximation to the Dirac delta function. Eulerian/Lagrangian identities govern the transfer of data from one mesh to the other. Temporal discretization is by a second-order Runge–Kutta method. Current and future research directions are pointed out, and applications of the IB method are briefly discussed.
We present a method for training a similarity metric from data. The method can be used for recognition or verification applications where the number of categories is very large and not known during training, and where the number of training samples for a single category is very small. The idea is to learn a function that maps input patterns into a target space such that the $L_1$ norm in the target space approximates the "semantic" distance in the input space. The method is applied to a face verification task. The learning process minimizes a discriminative loss function that drives the similarity metric to be small for pairs of faces from the same person, and large for pairs from different persons. The mapping from raw input to the target space is a convolutional network whose architecture is designed for robustness to geometric distortions. The system is tested on the Purdue/AR face database which has a very high degree of variability in the pose, lighting, expression, position, and artificial occlusions such as dark glasses and obscuring scarves.
The mathematical characterization of singularities with Lipschitz exponents is reviewed. Theorems that estimate local Lipschitz exponents of functions from the evolution across scales of their wavelet transform are reviewed. It is then proven that the local maxima of the wavelet transform modulus detect the locations of irregular structures and provide numerical procedures to compute their Lipschitz exponents. The wavelet transform of singularities with fast oscillations has a particular behavior that is studied separately. The local frequency of such oscillations is measured from the wavelet transform modulus maxima. It has been shown numerically that one- and two-dimensional signals can be reconstructed, with a good approximation, from the local maxima of their wavelet transform modulus. As an application, an algorithm is developed that removes white noises from signals by analyzing the evolution of the wavelet transform maxima across scales. In two dimensions, the wavelet transform maxima indicate the location of edges in images.
Geometric deep learning is an umbrella term for emerging techniques attempting to generalize (structured) deep neural models to non-Euclidean domains, such as graphs and manifolds. The purpose of this article is to overview different examples of geometric deep-learning problems and present available solutions, key difficulties, applications, and future research directions in this nascent field.
This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TF-IDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.
The programming of a proof procedure is discussed in connection with trial runs and possible improvements.
We propose a family of Markov chain Monte Carlo methods whose performance is unaffected by affine transformations of space. These algorithms are easy to construct and require little or no additional computational overhead. They should be particularly useful for sampling badly scaled distributions. Computational tests show that the affine invariant methods can be significantly faster than standard MCMC methods on highly skewed distributions.
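One widely used member of this family is the ensemble "stretch move", in which each walker is moved along the line through itself and another randomly chosen walker. The abstract above does not spell out any particular move, so the sketch below is an illustration under that assumption; the function names, the stretch parameter `a=2.0`, and the Gaussian test density are hypothetical choices.

```python
import math
import random

def log_gauss(x, scales):
    """Log-density (up to a constant) of an axis-aligned Gaussian with the given scales."""
    return -0.5 * sum((xi / si) ** 2 for xi, si in zip(x, scales))

def stretch_move_sampler(log_prob, walkers, n_steps, a=2.0, seed=0):
    """Ensemble sampler using the stretch move; returns walker positions after each sweep."""
    rng = random.Random(seed)
    walkers = [list(w) for w in walkers]
    dim = len(walkers[0])
    logp = [log_prob(w) for w in walkers]
    chain = []
    for _ in range(n_steps):
        for k in range(len(walkers)):
            # Pick a partner walker j != k uniformly at random.
            j = rng.randrange(len(walkers) - 1)
            if j >= k:
                j += 1
            # Draw z with density proportional to 1/sqrt(z) on [1/a, a].
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            # Propose a point on the line through walker j and walker k.
            proposal = [xj + z * (xk - xj) for xk, xj in zip(walkers[k], walkers[j])]
            logp_new = log_prob(proposal)
            # Acceptance probability min(1, z^(dim-1) * p(proposal) / p(current)).
            log_accept = (dim - 1) * math.log(z) + logp_new - logp[k]
            if rng.random() < math.exp(min(0.0, log_accept)):
                walkers[k], logp[k] = proposal, logp_new
        chain.append([list(w) for w in walkers])
    return chain
```

Since the proposal is built only from differences between walkers, rescaling the coordinates rescales the whole chain in the same way, which is why a badly scaled target such as `log_gauss(x, [128.0, 1.0 / 128.0])` is no harder for this sampler than an isotropic one.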
The geometrical theory of diffraction is an extension of geometrical optics which accounts for diffraction. It introduces diffracted rays in addition to the usual rays of geometrical optics. These rays are produced by incident rays which hit edges, corners, or vertices of boundary surfaces, or which graze such surfaces. Various laws of diffraction, analogous to the laws of reflection and refraction, are employed to characterize the diffracted rays. A modified form of Fermat’s principle, equivalent to these laws, can also be used. Diffracted wave fronts are defined, which can be found by a Huygens wavelet construction. There is an associated phase or eikonal function which satisfies the eikonal equation. In addition, complex or imaginary rays are introduced. A field is associated with each ray and the total field at a point is the sum of the fields on all rays through the point. The phase of the field on a ray is proportional to the optical length of the ray from some reference point. The amplitude varies in accordance with the principle of conservation of energy in a narrow tube of rays. The initial value of the field on a diffracted ray is determined from the incident field with the aid of an appropriate diffraction coefficient. These diffraction coefficients are determined from certain canonical problems. They all vanish as the wavelength tends to zero. The theory is applied to diffraction by an aperture in a thin screen, diffraction by a disk, etc., to illustrate it. Agreement is shown between the predictions of the theory and various other theoretical analyses of some of these problems. Experimental confirmation of the theory is also presented. The mathematical justification of the theory on the basis of electromagnetic theory is described. Finally, the applicability of this theory, or a modification of it, to other branches of physics is explained.
Scene labeling consists of labeling each pixel in an image with the category of the object it belongs to. We propose a method that uses a multiscale convolutional network trained from raw pixels to extract dense feature vectors that encode regions of multiple sizes centered on each pixel. The method alleviates the need for engineered features, and produces a powerful representation that captures texture, shape, and contextual information. We report results using multiple postprocessing methods to produce the final labeling. Among those, we propose a technique to automatically retrieve, from a pool of segmentation components, an optimal set of components that best explain the scene; these components are arbitrary, for example, they can be taken from a segmentation tree or from any family of oversegmentations. The system yields record accuracies on the SIFT Flow dataset (33 classes) and the Barcelona dataset (170 classes) and near-record accuracy on Stanford background dataset (eight classes), while being an order of magnitude faster than competing approaches, producing a $(320\times 240)$ image labeling in less than a second, including feature extraction.
A multiscale Canny edge detection is equivalent to finding the local maxima of a wavelet transform. The authors study the properties of multiscale edges through the wavelet theory. For pattern recognition, one often needs to discriminate different types of edges. They show that the evolution of wavelet local maxima across scales characterizes the local shape of irregular structures. Numerical descriptors of edge types are derived. The completeness of a multiscale edge representation is also studied. The authors describe an algorithm that reconstructs a close approximation of 1-D and 2-D signals from their multiscale edges. For images, the reconstruction errors are below visual sensitivity. As an application, a compact image coding algorithm that selects important edges and compresses the image data by factors over 30 has been implemented.
It has long been assumed that sensory neurons are adapted, through both evolutionary and developmental processes, to the statistical properties of the signals to which they are exposed. Attneave (1954) and Barlow (1961) proposed that information theory could provide a link between environmental statistics and neural responses through the concept of coding efficiency. Recent developments in statistical modeling, along with powerful computational tools, have enabled researchers to study more sophisticated statistical models for visual images, to validate these models empirically against large sets of data, and to begin experimentally testing the efficient coding hypothesis for both individual neurons and populations of neurons.
A numerical method for solving incompressible viscous flow problems is introduced. This method uses the velocities and the pressure as variables and is equally applicable to problems in two and three space dimensions. The principle of the method lies in the introduction of an artificial compressibility δ into the equations of motion, in such a way that the final results do not depend on δ. An application to thermal convection problems is presented.
Problems involving the classical linear partial differential equations of mathematical physics can be reduced to algebraic ones of a very much simpler structure by replacing the differentials by difference quotients on some (say rectilinear) mesh. This paper will undertake an elementary discussion of these algebraic problems, in particular of the behavior of the solution as the mesh width tends to zero. For present purposes we limit ourselves mainly to simple but typical cases, and treat them in such a way that the applicability of the method to more general difference equations and to those with arbitrarily many independent variables is made clear.
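The idea in the abstract above, replacing differentials by difference quotients on a mesh and watching the solution as the mesh width shrinks, can be illustrated on a one-dimensional model problem. The sketch below is an assumption chosen for brevity, not an example from the paper: it solves $-u'' = \pi^2 \sin(\pi x)$ on $[0, 1]$ with $u(0) = u(1) = 0$, whose exact solution is $\sin(\pi x)$, using the second-order central difference quotient. The function names are hypothetical.

```python
import math

def solve_poisson_1d(n):
    """Solve (-u[i-1] + 2*u[i] - u[i+1]) / h^2 = f(x[i]) on n subintervals of [0, 1]."""
    h = 1.0 / n
    xs = [i * h for i in range(1, n)]  # interior mesh points
    rhs = [h * h * math.pi ** 2 * math.sin(math.pi * x) for x in xs]
    m = n - 1
    # Thomas algorithm for the tridiagonal system with stencil (-1, 2, -1).
    c_prime = [0.0] * m
    d_prime = [0.0] * m
    c_prime[0] = -1.0 / 2.0
    d_prime[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c_prime[i - 1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (rhs[i] + d_prime[i - 1]) / denom
    # Back substitution.
    u = [0.0] * m
    u[-1] = d_prime[-1]
    for i in range(m - 2, -1, -1):
        u[i] = d_prime[i] - c_prime[i] * u[i + 1]
    return xs, u

def max_error(n):
    """Largest deviation of the discrete solution from sin(pi * x) on the mesh."""
    xs, u = solve_poisson_1d(n)
    return max(abs(ui - math.sin(math.pi * x)) for x, ui in zip(xs, u))
```

Comparing `max_error(20)` with `max_error(40)` shows the error falling by roughly a factor of four when the mesh width is halved, consistent with second-order convergence for this model problem.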
This book concerns the use of concepts from statistical physics in the description of financial systems. The authors illustrate the scaling concepts used in probability theory, critical phenomena, and fully developed turbulent fluids. These concepts are then applied to financial time series. The authors also present a stochastic model that displays several of the statistical properties observed in empirical data. Statistical physics concepts such as stochastic dynamics, short- and long-range correlations, self-similarity and scaling permit an understanding of the global behaviour of economic systems without first having to work out a detailed microscopic description of the system. Physicists will find the application of statistical physics concepts to economic systems interesting. Economists and workers in the financial world will find useful the presentation of empirical analysis methods and well-formulated theoretical tools that might help describe systems composed of a huge number of interacting subsystems.