NobleBlocks

Microsoft Research (India)

companyBengaluru, India

Research output, citation impact, and the most-cited recent papers from Microsoft Research (India) (India). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works
2.9K
Citations
217.9K
h-index
192
i10-index
2.4K
Also known as
Microsoft Research (India)

Top-cited papers from Microsoft Research (India)

Scalable Person Re-identification: A Benchmark
Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang +2 more
20154.6Kdoi:10.1109/iccv.2015.133

This paper contributes a new high quality dataset for person re-identification, named "Market-1501". Generally, current datasets: 1) are limited in scale, 2) consist of hand-drawn bboxes, which are unavailable under realistic settings, 3) have only one ground truth and one query image for each identity (close environment). To tackle these problems, the proposed Market-1501 dataset is featured in three aspects. First, it contains over 32,000 annotated bboxes, plus a distractor set of over 500K images, making it the largest person re-id dataset to date. Second, images in Market-1501 dataset are produced using the Deformable Part Model (DPM) as pedestrian detector. Third, our dataset is collected in an open system, where each identity has multiple images under each camera. As a minor contribution, inspired by recent advances in large-scale image search, this paper proposes an unsupervised Bag-of-Words descriptor. We view person re-identification as a special task of image search. In experiment, we show that the proposed descriptor yields competitive accuracy on VIPeR, CUHK03, and Market-1501 datasets, and is scalable on the large-scale 500k dataset.

Dryad
Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell +1 more
20072.5Kdoi:10.1145/1272996.1273005

Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad application combines computational "vertices" with communication "channels" to form a dataflow graph. Dryad runs the application by executing the vertices of this graph on a set of available computers, communicating as appropriate through flies, TCP pipes, and shared-memory FIFOs.

Nericell
Prashanth Mohan, Venkata N. Padmanabhan, Ramachandran Ramjee
20081.3Kdoi:10.1145/1460412.1460444

We consider the problem of monitoring road and traffic conditions in a city. Prior work in this area has required the deployment of dedicated sensors on vehicles and/or on the roadside, or the tracking of mobile phones by service providers. Furthermore, prior work has largely focused on the developed world, with its relatively simple traffic flow patterns. In fact, traffic flow in cities of the developing regions, which comprise much of the world, tends to be much more complex owing to varied road conditions (e.g., potholed roads), chaotic traffic (e.g., a lot of braking and honking), and a heterogeneous mix of vehicles (2-wheelers, 3-wheelers, cars, buses, etc.).

A More Credible Approach to Parallel Trends
Ashesh Rambachan, Jonathan Roth
2023· The Review of Economic Studies1.1Kdoi:10.1093/restud/rdad018

Abstract This paper proposes tools for robust inference in difference-in-differences and event-study designs where the parallel trends assumption may be violated. Instead of requiring that parallel trends holds exactly, we impose restrictions on how different the post-treatment violations of parallel trends can be from the pre-treatment differences in trends (“pre-trends”). The causal parameter of interest is partially identified under these restrictions. We introduce two approaches that guarantee uniformly valid inference under the imposed restrictions, and we derive novel results showing that they have desirable power properties in our context. We illustrate how economic knowledge can inform the restrictions on the possible violations of parallel trends in two economic applications. We also highlight how our approach can be used to conduct sensitivity analyses showing what causal conclusions can be drawn under various restrictions on the possible violations of the parallel trends assumption.

Explaining machine learning classifiers through diverse counterfactual explanations
Ramaravind Kommiya Mothilal, Amit Sharma, Chenhao Tan
20201.0Kdoi:10.1145/3351095.3372850

Post-hoc explanations of machine learning models are crucial for people to understand and act on algorithmic predictions. An intriguing class of explanations is through counterfactuals, hypothetical examples that show people how to obtain a different prediction. We posit that effective counterfactual explanations should satisfy two properties: feasibility of the counterfactual actions given user context and constraints, and diversity among the counterfactuals presented. To this end, we propose a framework for generating and evaluating a diverse set of counterfactual explanations based on determinantal point processes. To evaluate the actionability of counterfactuals, we provide metrics that enable comparison of counterfactual-based methods to other local explanation methods. We further address necessary tradeoffs and point to causal implications in optimizing for counterfactuals. Our experiments on four real-world datasets show that our framework can generate a set of counterfactuals that are diverse and well approximate local decision boundaries, outperforming prior approaches to generating diverse counterfactuals. We provide an implementation of the framework at https://github.com/microsoft/DiCE.

Zee
Anshul Rai, Krishna Chintalapudi, Venkata N. Padmanabhan, Rijurekha Sen
20121.0Kdoi:10.1145/2348543.2348580

Radio Frequency (RF) fingerprinting, based onWiFi or cellular signals, has been a popular approach to indoor localization. However, its adoption in the real world has been stymied by the need for sitespecific calibration, i.e., the creation of a training data set comprising WiFi measurements at known locations in the space of interest. While efforts have been made to reduce this calibration effort using modeling, the need for measurements from known locations still remains a bottleneck. In this paper, we present Zee -- a system that makes the calibration zero-effort, by enabling training data to be crowdsourced without any explicit effort on the part of users. Zee leverages the inertial sensors (e.g., accelerometer, compass, gyroscope) present in the mobile devices such as smartphones carried by users, to track them as they traverse an indoor environment, while simultaneously performing WiFi scans. Zee is designed to run in the background on a device without requiring any explicit user participation. The only site-specific input that Zee depends on is a map showing the pathways (e.g., hallways) and barriers (e.g., walls). A significant challenge that Zee surmounts is to track users without any a priori, user-specific knowledge such as the user's initial location, stride-length, or phone placement. Zee employs a suite of novel techniques to infer location over time: (a) placement-independent step counting and orientation estimation, (b) augmented particle filtering to simultaneously estimate location and user-specific walk characteristics such as the stride length,(c) back propagation to go back and improve the accuracy of ocalization in the past, and (d) WiFi-based particle initialization to enable faster convergence. We present an evaluation of Zee in a large office building.

Control-flow integrity
Martı́n Abadi, Mihai Budiu, Úlfar Erlingsson, Jay Ligatti
20051.0Kdoi:10.1145/1102120.1102165

Current software attacks often build on exploits that subvert machine-code execution. The enforcement of a basic safety property, Control-Flow Integrity (CFI), can prevent such attacks from arbitrarily controlling program behavior. CFI enforcement is simple, and its guarantees can be established formally even with respect to powerful adversaries. Moreover, CFI enforcement is practical: it is compatible with existing software and can be done efficiently using software rewriting in commodity systems. Finally, CFI provides a useful foundation for enforcing further security policies, as we demonstrate with efficient software implementations of a protected shadow call stack and of access control for memory regions.

Diversifying search results
Rakesh Agrawal, Sreenivas Gollapudi, Alan Halverson, Samuel Ieong
20091.0Kdoi:10.1145/1498759.1498766

We study the problem of answering ambiguous web queries in a setting where there exists a taxonomy of information, and that both queries and documents may belong to more than one category according to this taxonomy. We present a systematic approach to diversifying results that aims to minimize the risk of dissatisfaction of the average user. We propose an algorithm that well approximates this objective in general, and is provably optimal for a natural special case. Furthermore, we generalize several classical IR metrics, including NDCG, MRR, and MAP, to explicitly account for the value of diversification. We demonstrate empirically that our algorithm scores higher in these generalized metrics compared to results produced by commercial search engines.

Indoor localization without the pain
Krishna Chintalapudi, Anand Iyer, Venkata N. Padmanabhan
2010972doi:10.1145/1859995.1860016

While WiFi-based indoor localization is attractive, the need for a significant degree of pre-deployment effort is a key challenge. In this paper, we ask the question: can we perform indoor localization with no pre-deployment effort? Our setting is an indoor space, such as an office building or a mall, with WiFi coverage but where we do not assume knowledge of the physical layout, including the placement of the APs. Users carrying WiFi-enabled devices such as smartphones traverse this space in normal course. The mobile devices record Received Signal Strength (RSS) measurements corresponding to APs in their view at various (unknown) locations and report these to a localization server. Occasionally, a mobile device will also obtain and report a location fix, say by obtaining a GPS lock at the entrance or near a window. The centerpiece of our work is the EZ Localization algorithm, which runs on the localization server. The key intuition is that all of the observations reported to the server, even the many from unknown locations, are constrained by the physics of wireless propagation. EZ models these constraints and then uses a genetic algorithm to solve them. The results from our deployment in two different buildings are promising. Despite the absence of any explicit pre-deployment calibration, EZ yields a median localization error of 2m and 7m, respectively, in a small building and a large building, which is only somewhat worse than the 0.7m and 4m yielded by the best-performing but calibrationintensive Horus scheme [29] from prior work.

Robust Loss Functions under Label Noise for Deep Neural Networks
Aritra Ghosh, Himanshu Kumar, P. S. Sastry
2017· Proceedings of the AAAI Conference on Artificial Intelligence906doi:10.1609/aaai.v31i1.10894

In many applications of classifier learning, training data suffers from label noise. Deep networks are learned using huge training data where the problem of noisy labels is particularly relevant. The current techniques proposed for learning deep networks under label noise focus on modifying the network architecture and on algorithms for estimating true labels from noisy labels. An alternate approach would be to look for loss functions that are inherently noise-tolerant. For binary classification there exist theoretical results on loss functions that are robust to label noise. In this paper, we provide some sufficient conditions on a loss function so that risk minimization under that loss function would be inherently tolerant to label noise for multiclass classification problems. These results generalize the existing results on noise-tolerant loss functions for binary classification. We study some of the widely used loss functions in deep networks and show that the loss function based on mean absolute value of error is inherently robust to label noise. Thus standard back propagation is enough to learn the true classifier even under label noise. Through experiments, we illustrate the robustness of risk minimization with such loss functions for learning neural networks.

Design tradeoffs for SSD performance
Nitin Agrawal, Vijayan Prabhakaran, Ted Wobber, John D. Davis +2 more
2008885

Solid-state disks (SSDs) have the potential to revolutionize the storage system landscape. However, there is little published work about their internal organization or the design choices that SSD manufacturers face in pursuit of optimal performance. This paper presents a taxonomy of such design choices and analyzes the likely performance of various configurations using a trace-driven simulator and workload traces extracted from real systems. We find that SSD performance and lifetime is highly workloadsensitive, and that complex systems problems that normally appear higher in the storage stack, or even in distributed systems, are relevant to device firmware. 1

Low-rank matrix completion using alternating minimization
Prateek Jain, Praneeth Netrapalli, Sujay Sanghavi
2013875doi:10.1145/2488608.2488693

Alternating minimization represents a widely applicable and empirically successful approach for finding low-rank matrices that best fit the given data. For example, for the problem of low-rank matrix completion, this method is believed to be one of the most accurate and efficient, and formed a major component of the winning entry in the Netflix Challenge [17].

Multiple kernels for object detection
Andrea Vedaldi, Varun Gulshan, Manik Varma, Andrew Zisserman
2009767doi:10.1109/iccv.2009.5459183

Our objective is to obtain a state-of-the art object category detector by employing a state-of-the-art image classifier to search for the object in all possible image sub-windows. We use multiple kernel learning of Varma and Ray (ICCV 2007) to learn an optimal combination of exponential χ <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> kernels, each of which captures a different feature channel. Our features include the distribution of edges, dense and sparse visual words, and feature descriptors at different levels of spatial organization. Such a powerful classifier cannot be tested on all image sub-windows in a reasonable amount of time. Thus we propose a novel three-stage classifier, which combines linear, quasi-linear, and non-linear kernel SVMs. We show that increasing the non-linearity of the kernels increases their discriminative power, at the cost of an increased computational complexity. Our contributions include (i) showing that a linear classifier can be evaluated with a complexity proportional to the number of sub-windows (independent of the sub-window area and descriptor dimension); (ii) a comparison of three efficient methods of proposing candidate regions (including the jumping window classifier of Chum and Zisserman (CVPR 2007) based on proposing windows from scale invariant features); and (Hi) introducing overlap-recall curves as a mean to compare and optimize the performance of the intermediate pipeline stages. The method is evaluated on the PASCAL Visual Object Detection Challenge, and exceeds the performances of previously published methods for most of the classes.

PipeDream
Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri +4 more
2019745doi:10.1145/3341301.3359646

DNN training is extremely time-consuming, necessitating efficient multi-accelerator parallelization. Current approaches to parallelizing training primarily use intra-batch parallelization, where a single iteration of training is split over the available workers, but suffer from diminishing returns at higher worker counts. We present PipeDream, a system that adds inter-batch pipelining to intra-batch parallelism to further improve parallel training throughput, helping to better overlap computation with communication and reduce the amount of communication when possible. Unlike traditional pipelining, DNN training is bi-directional, where a forward pass through the computation graph is followed by a backward pass that uses state and intermediate data computed during the forward pass. Naïve pipelining can thus result in mismatches in state versions used in the forward and backward passes, or excessive pipeline flushes and lower hardware efficiency. To address these challenges, PipeDream versions model parameters for numerically correct gradient computations, and schedules forward and backward passes of different minibatches concurrently on different workers with minimal pipeline stalls. PipeDream also automatically partitions DNN layers among workers to balance work and minimize communication. Extensive experimentation with a range of DNN tasks, models, and hardware configurations shows that PipeDream trains models to high accuracy up to 5.3X faster than commonly used intra-batch parallelism techniques.

Research Approaches to Mobile Use in the Developing World: A Review of the Literature
Jonathan Donner
2008· The Information Society731doi:10.1080/01972240802019970

Abstract This paper reviews roughly 200 recent studies of mobile (cellular) phone use in the developing world, and identifies major concentrations of research. It categorizes studies along two dimensions. One dimension distinguishes studies of the determinants of mobile adoption from those that assess the impacts of mobile use, and from those focused on the interrelationships between mobile technologies and users. A secondary dimension identifies a subset of studies with a strong economic development perspective. The discussion considers the implications of the resulting review and typology for future research. Keywords: cellular telephonedeveloping countriesmobile telephonetelecommunications An earlier version of the paper was presented at the International Conference on Mobile Communication and Asian Modernities, City University of Hong Kong, 7–8 June 2005. The author thanks the Earth Institute at Columbia University for support during the writing of the early draft of paper, and to numerous readers—particularly the three anonymous reviewers—for their suggestions. Opinions and analysis are the author's, and not necessarily those of Microsoft Corporation. Notes ∗South Africa studies are included in the review despite upper-middle-income classification. ∗∗World totals N = 208, excludes some small geographies not appearing in the World Bank Classifications or in the ITU statistics. As of October 2007, at http://members.aol.com/leshaddon/MobileRefs.html and http://ist-socrates.berkeley.edu/∼nalinik/mobile.html. As of October 2007, http://www.scils.rutgers.edu/ci/cmcs/events lists many of these conferences.

DryadLINQ: a system for general-purpose distributed data-parallel computing using a high-level language
Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu +3 more
2008702doi:10.5555/1855741.1855742

DryadLINQ is a system and a set of language extensions that enable a new programming model for large scale dis-tributed computing. It generalizes previous execution en-vironments such as SQL, MapReduce, and Dryad in two ways: by adopting an expressive data model of strongly typed.NET objects; and by supporting general-purpose imperative and declarative operations on datasets within a traditional high-level programming language. A DryadLINQ program is a sequential program com-posed of LINQ expressions performing arbitrary side-effect-free transformations on datasets, and can be writ-ten and debugged using standard.NET development tools. The DryadLINQ system automatically and trans-parently translates the data-parallel portions of the pro-gram into a distributed execution plan which is passed to the Dryad execution platform. Dryad, which has been in continuous operation for several years on production clusters made up of thousands of computers, ensures ef-ficient, reliable execution of this plan. We describe the implementation of the DryadLINQ compiler and runtime. We evaluate DryadLINQ on a varied set of programs drawn from domains such as web-graph analysis, large-scale log mining, and machine learning. We show that excellent absolute performance can be attained—a general-purpose sort of 1012 Bytes of data executes in 319 seconds on a 240-computer, 960-disk cluster—as well as demonstrating near-linear scal-ing of execution time on representative applications as we vary the number of computers used for a job. 1

Distributed large-scale natural graph factorization
Amr Ahmed, Nino Shervashidze, Shravan Narayanamurthy, Vanja Josifovski +1 more
2013676doi:10.1145/2488388.2488393

Natural graphs, such as social networks, email graphs, or instant messaging patterns, have become pervasive through the internet. These graphs are massive, often containing hundreds of millions of nodes and billions of edges. While some theoretical models have been proposed to study such graphs, their analysis is still difficult due to the scale and nature of the data.

Naiad
Derek G. Murray, Frank McSherry, Rebecca Isaacs, Michael Isard +2 more
2013649doi:10.1145/2517349.2522738

Naiad is a distributed system for executing data parallel, cyclic dataflow programs. It offers the high throughput of batch processors, the low latency of stream processors, and the ability to perform iterative and incremental computations. Although existing systems offer some of these features, applications that require all three have relied on multiple platforms, at the expense of efficiency, maintainability, and simplicity. Naiad resolves the complexities of combining these features in one framework.

Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds
Raef Bassily, Adam Smith, Abhradeep Thakurta
2014646doi:10.1109/focs.2014.56

Convex empirical risk minimization is a basic tool in machine learning and statistics. We provide new algorithms and matching lower bounds for differentially private convex empirical risk minimization assuming only that each data point's contribution to the loss function is Lipschitz and that the domain of optimization is bounded. We provide a separate set of algorithms and matching lower bounds for the setting in which the loss functions are known to also be strongly convex. Our algorithms run in polynomial time, and in some cases even match the optimal nonprivate running time (as measured by oracle complexity). We give separate algorithms (and lower bounds) for (ε, 0)and (ε, δ)-differential privacy; perhaps surprisingly, the techniques used for designing optimal algorithms in the two cases are completely different. Our lower bounds apply even to very simple, smooth function families, such as linear and quadratic functions. This implies that algorithms from previous work can be used to obtain optimal error rates, under the additional assumption that the contributions of each data point to the loss function is smooth. We show that simple approaches to smoothing arbitrary loss functions (in order to apply previous techniques) do not yield optimal error rates. In particular, optimal algorithms were not previously known for problems such as training support vector machines and the high-dimensional median.

f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization
Sebastian Nowozin, Botond Cseke, Ryota Tomioka
2016· arXiv (Cornell University)639doi:10.48550/arxiv.1606.00709

Generative neural samplers are probabilistic models that implement sampling using feedforward neural networks: they take a random input vector and produce a sample from a probability distribution defined by the network weights. These models are expressive and allow efficient computation of samples and derivatives, but cannot be used for computing likelihoods or for marginalization. The generative-adversarial training method allows to train such models through the use of an auxiliary discriminative neural network. We show that the generative-adversarial approach is a special case of an existing more general variational divergence estimation approach. We show that any f-divergence can be used for training generative neural samplers. We discuss the benefits of various choices of divergence functions on training complexity and the quality of the obtained generative models.