NobleBlocks

Microsoft (Israel)

companyHerzliya, Israel

Research output, citation impact, and the most-cited recent papers from Microsoft (Israel) (Israel). Aggregated across the NobleBlocks index of 300M+ scholarly works.

Total works
548
Citations
19.2K
h-index
71
i10-index
379
Also known as
Microsoft (Israel)

Top-cited papers from Microsoft (Israel)

Accurate, Robust, and Flexible Real-time Hand Tracking
Toby Sharp, Cem Keskin, Duncan Robertson, Jonathan M. Taylor +4 more
2015428doi:10.1145/2702123.2702179

We present a new real-time hand tracking system based on a single depth camera. The system can accurately reconstruct complex hand poses across a variety of subjects. It also allows for robust tracking, rapidly recovering from any temporary failures. Most uniquely, our tracker is highly flexible, dramatically improving upon previous approaches which have focused on front-facing close-range scenarios. This flexibility opens up new possibilities for human-computer interaction with examples including tracking at distances from tens of centimeters through to several meters (for controlling the TV at a distance), supporting tracking using a moving depth camera (for mobile scenarios), and arbitrary camera placements (for VR headsets). These features are achieved through a new pipeline that combines a multi-layered discriminative reinitialization strategy for per-frame pose estimation, followed by a generative model-fitting stage. We provide extensive technical details and a detailed qualitative and quantitative analysis.

Approximate mechanism design without money
Ariel D. Procaccia, Moshe Tennenholtz
2009320doi:10.1145/1566374.1566401

The literature on algorithmic mechanism design is mostly concerned with game-theoretic versions of optimization problems to which standard economic money-based mechanisms cannot be applied efficiently. Recent years have seen the design of various truthful approximation mechanisms that rely on enforcing payments. In this paper, we advocate the reconsideration of highly structured optimization problems in the context of mechanism design. We argue that, in such domains, approximation can be leveraged to obtain truthfulness without resorting to payments. This stands in contrast to previous work where payments are ubiquitous, and (more often than not) approximation is a necessary evil that is required to circumvent computational complexity.

Tracking COVID-19 using online search
Vasileios Lampos, Maimuna S. Majumder, Elad Yom‐Tov, Michael Edelstein +4 more
2021· npj Digital Medicine165doi:10.1038/s41746-021-00384-w

Previous research has demonstrated that various properties of infectious diseases can be inferred from online search behaviour. In this work we use time series of online search query frequencies to gain insights about the prevalence of COVID-19 in multiple countries. We first develop unsupervised modelling techniques based on associated symptom categories identified by the United Kingdom's National Health Service and Public Health England. We then attempt to minimise an expected bias in these signals caused by public interest-as opposed to infections-using the proportion of news media coverage devoted to COVID-19 as a proxy indicator. Our analysis indicates that models based on online searches precede the reported confirmed cases and deaths by 16.7 (10.2-23.2) and 22.1 (17.4-26.9) days, respectively. We also investigate transfer learning techniques for mapping supervised models from countries where the spread of the disease has progressed extensively to countries that are in earlier phases of their respective epidemic curves. Furthermore, we compare time series of online search activity against confirmed COVID-19 cases or deaths jointly across multiple countries, uncovering interesting querying patterns, including the finding that rarer symptoms are better predictors than common ones. Finally, we show that web searches improve the short-term forecasting accuracy of autoregressive models for COVID-19 deaths. Our work provides evidence that online search data can be used to develop complementary public health surveillance methods to help inform the COVID-19 response in conjunction with more established approaches.

Speeding up the Xbox recommender system using a euclidean transformation for inner-product spaces
Yoram Bachrach, Yehuda Finkelstein, Ran Gilad-Bachrach, Liran Katzir +3 more
2014160doi:10.1145/2645710.2645741

A prominent approach in collaborative filtering based recommender systems is using dimensionality reduction (matrix factorization) techniques to map users and items into low-dimensional vectors. In such systems, a higher inner product between a user vector and an item vector indicates that the item better suits the user's preference. Traditionally, retrieving the most suitable items is done by scoring and sorting all items. Real world online recommender systems must adhere to strict response-time constraints, so when the number of items is large, scoring all items is intractable.

Approximately optimal mechanism design via differential privacy
Kobbi Nissim, Rann Smorodinsky, Moshe Tennenholtz
2012144doi:10.1145/2090236.2090254

We study the implementation challenge in an abstract interdependent values model and an arbitrary objective function. We design a generic mechanism that allows for approximate optimal implementation of insensitive objective functions in ex-post Nash equilibrium. If, furthermore, values are private then the same mechanism is strategy proof. We cast our results onto two specific models: pricing and facility location. The mechanism we design is optimal up to an additive factor of the order of magnitude of one over the square root of the number of agents and involves no utility transfers.

Applying Multiple Data Collection Tools to Quantify Human Papillomavirus Vaccine Communication on Twitter
Philip M. Massey, Amy Leader, Elad Yom‐Tov, Alexandra Budenz +2 more
2016· Journal of Medical Internet Research141doi:10.2196/jmir.6670

BACKGROUND: Human papillomavirus (HPV) is the most common sexually transmitted infection in the United States. There are several vaccines that protect against strains of HPV most associated with cervical and other cancers. Thus, HPV vaccination has become an important component of adolescent preventive health care. As media evolves, more information about HPV vaccination is shifting to social media platforms such as Twitter. Health information consumed on social media may be especially influential for segments of society such as younger populations, as well as ethnic and racial minorities. OBJECTIVE: The objectives of our study were to quantify HPV vaccine communication on Twitter, and to develop a novel methodology to improve the collection and analysis of Twitter data. METHODS: We collected Twitter data using 10 keywords related to HPV vaccination from August 1, 2014 to July 31, 2015. Prospective data collection used the Twitter Search API and retrospective data collection used Twitter Firehose. Using a codebook to characterize tweet sentiment and content, we coded a subsample of tweets by hand to develop classification models to code the entire sample using machine learning procedures. We also documented the words in the 140-character tweet text most associated with each keyword. We used chi-square tests, analysis of variance, and nonparametric equality of medians to test for significant differences in tweet characteristic by sentiment. RESULTS: A total of 193,379 English-language tweets were collected, classified, and analyzed. Associated words varied with each keyword, with more positive and preventive words associated with "HPV vaccine" and more negative words associated with name-brand vaccines. Positive sentiment was the largest type of sentiment in the sample, with 75,393 positive tweets (38.99% of the sample), followed by negative sentiment with 48,940 tweets (25.31% of the sample). Positive and neutral tweets constituted the largest percentage of tweets mentioning prevention or protection (20,425/75,393, 27.09% and 6477/25,110, 25.79%, respectively), compared with only 11.5% of negative tweets (5647/48,940; P<.001). Nearly one-half (22,726/48,940, 46.44%) of negative tweets mentioned side effects, compared with only 17.14% (12,921/75,393) of positive tweets and 15.08% of neutral tweets (3787/25,110; P<.001). CONCLUSIONS: Examining social media to detect health trends, as well as to communicate important health information, is a growing area of research in public health. Understanding the content and implications of conversations that form around HPV vaccination on social media can aid health organizations and health-focused Twitter users in creating a meaningful exchange of ideas and in having a significant impact on vaccine uptake. This area of research is inherently interdisciplinary, and this study supports this movement by applying public health, health communication, and data science approaches to extend methodologies across fields.

Predicting user adherence to behavioral eHealth interventions in the real world: examining which aspects of intervention design matter most
Amit Baumel, Elad Yom‐Tov
2018· Translational Behavioral Medicine137doi:10.1093/tbm/ibx037

Existing frameworks have identified a range of intervention design features that may facilitate adherence to eHealth interventions; however, empirical data are lacking on whether intervention design features can predict user adherence in the real world-where the public access available tools-and whether some design aspects of behavioral eHealth interventions are more important than others in predicting adherence. This study examined whether intervention design qualities predict user adherence to behavioral eHealth interventions in real-world use and which qualities matter the most. We correlated the online activities of users of 30 web-based behavioral interventions-collected from a proprietary data set of anonymized logs from consenting users of Microsoft Internet Explorer add-on-with interventions' quality ratings obtained by trained raters prior to empirical examination. The quality ratings included: Usability, Visual Design, User Engagement, Content, Therapeutic Persuasiveness (i.e., persuasive design and incorporation of behavior change techniques), and Therapeutic Alliance. We found Therapeutic Persuasiveness (i.e., the incorporation of persuasive design/behavior change principles) to be the most robust predictor of adherence (i.e., duration of use, number of unique sessions; 40 ≤ rs ≤ .58, ps ≤ .005), explaining 42% of the variance in user adherence in our regression model. Results indicated up to six times difference in the percentage of users utilizing the interventions for more than a minimum amount of time and sessions based on Therapeutic Persuasiveness. Findings suggest the importance of persuasive design and behavior change techniques incorporation during the design and evaluation of digital behavioral interventions.

Strategyproof Approximation of the Minimax on Networks
Noga Alon, Michal Feldman, Ariel D. Procaccia, Moshe Tennenholtz
2010· Mathematics of Operations Research132doi:10.1287/moor.1100.0457

We consider the problem of locating a facility on a network represented by a graph. A set of strategic agents have different ideal locations for the facility; the cost of an agent is the distance between its ideal location and the facility. A mechanism maps the locations reported by the agents to the location of the facility. We wish to design mechanisms that are strategyproof (SP) in the sense that agents can never benefit by lying and, at the same time, provide a small approximation ratio with respect to the minimax measure. We design a novel “hybrid” strategyproof randomized mechanism that provides a tight approximation ratio of 3/2 when the network is a circle (known as a ring in the case of computer networks). Furthermore, we show that no randomized SP mechanism can provide an approximation ratio better than 2 - o(1), even when the network is a tree, thereby matching a trivial upper bound of two.

Manual for Using Homomorphic Encryption for Bioinformatics
Nathan Dowlin, Ran Gilad-Bachrach, Kim Laine, Kristin Lauter +2 more
2017· Proceedings of the IEEE132doi:10.1109/jproc.2016.2622218

Biological data science is an emerging field facing multiple challenges for hosting, sharing, computing on, and interacting with large data sets. Privacy regulations and concerns about the risks of leaking sensitive personal health and genomic data add another layer of complexity to the problem. Recent advances in cryptography over the last five years have yielded a tool, homomorphic encryption, which can be used to encrypt data in such a way that storage can be outsourced to an untrusted cloud, and the data can be computed on in a meaningful way in encrypted form, without access to decryption keys. This paper introduces homomorphic encryption to the bioinformatics community, and presents an informal “manual” for using the Simple Encrypted Arithmetic Library (SEAL), which we have made publicly available for bioinformatic, genomic, and other research purposes.

Approximate Revenue Maximization with Multiple Items
Sergiu Hart, Noam Nisan
2012120doi:10.48550/arxiv.1204.1846

Maximizing the revenue from selling _more than one_ good (or item) to a single buyer is a notoriously difficult problem, in stark contrast to the one-good case. For two goods, we show that simple "one-dimensional" mechanisms, such as selling the goods separately, _guarantee_ at least 73% of the optimal revenue when the valuations of the two goods are independent and identically distributed, and at least $50\%$ when they are independent. For the case of $k&gt;2$ independent goods, we show that selling them separately guarantees at least a $c/\log^2 k$ fraction of the optimal revenue; and, for independent and identically distributed goods, we show that selling them as one bundle guarantees at least a $c/\log k$ fraction of the optimal revenue. Additional results compare the revenues from the two simple mechanisms of selling the goods separately and bundled, identify situations where bundling is optimal, and extend the analysis to multiple buyers.

mHealth app using machine learning to increase physical activity in diabetes and depression: clinical trial protocol for the DIAMANTE Study
Adrián Aguilera, Caroline Figueroa, Rosa Hernandez-Ramos, Urmimala Sarkar +4 more
2020· BMJ Open109doi:10.1136/bmjopen-2019-034723

INTRODUCTION: Depression and diabetes are highly disabling diseases with a high prevalence and high rate of comorbidity, particularly in low-income ethnic minority patients. Though comorbidity increases the risk of adverse outcomes and mortality, most clinical interventions target these diseases separately. Increasing physical activity might be effective to simultaneously lower depressive symptoms and improve glycaemic control. Self-management apps are a cost-effective, scalable and easy access treatment to increase physical activity. However, cutting-edge technological applications often do not reach vulnerable populations and are not tailored to an individual's behaviour and characteristics. Tailoring of interventions using machine learning methods likely increases the effectiveness of the intervention. METHODS AND ANALYSIS: In a three-arm randomised controlled trial, we will examine the effect of a text-messaging smartphone application to encourage physical activity in low-income ethnic minority patients with comorbid diabetes and depression. The adaptive intervention group receives messages chosen from different messaging banks by a reinforcement learning algorithm. The uniform random intervention group receives the same messages, but chosen from the messaging banks with equal probabilities. The control group receives a weekly mood message. We aim to recruit 276 adults from primary care clinics aged 18-75 years who have been diagnosed with current diabetes and show elevated depressive symptoms (Patient Health Questionnaire depression scale-8 (PHQ-8) >5). We will compare passively collected daily step counts, self-report PHQ-8 and most recent haemoglobin A1c from medical records at baseline and at intervention completion at 6-month follow-up. ETHICS AND DISSEMINATION: The Institutional Review Board at the University of California San Francisco approved this study (IRB: 17-22608). We plan to submit manuscripts describing our user-designed methods and testing of the adaptive learning algorithm and will submit the results of the trial for publication in peer-reviewed journals and presentations at (inter)-national scientific meetings. TRIAL REGISTRATION NUMBER: NCT03490253; pre-results.

Complexity of unweighted coalitional manipulation under some common voting rules
Lirong Xia, Michael Zuckerman, Ariel D. Procaccia, Vincent Conitzer +1 more
2009109

Understanding the computational complexity of manipulation in elections is arguably the most central agenda in Computational Social Choice. One of the influential variations of the the problem involves a coalition of manipulators trying to make a favorite candidate win the election. Although the complexity of the problem is well-studied under the assumption that the voters are weighted, there were very few successful attempts to abandon this strong assumption. In this paper, we study the complexity of the unweighted coalitional manipulation problem (UCM) under several prominent voting rules. Our main result is that UCM is NP-complete under the maximin rule; this resolves an enigmatic open question. We then show that UCM is NP-complete under the ranked pairs rule, even with respect to a single manipulator. Furthermore, we provide an extreme hardness-of-approximation result for an optimization version of UCM under ranked pairs. Finally, we show that UCM under the Bucklin rule is in P. 1

Estimating clustering coefficients and size of social networks via random walk
Stephen J. Hardiman, Liran Katzir
2013107doi:10.1145/2488388.2488436

Online social networks have become a major force in today's society and economy. The largest of today's social networks may have hundreds of millions to more than a billion users. Such networks are too large to be downloaded or stored locally, even if terms of use and privacy policies were to permit doing so. This limitation complicates even simple computational tasks. One such task is computing the clustering coefficient of a network. Another task is to compute the network size (number of registered users) or a subpopulation size.

Size-independent sample complexity of neural networks
Noah Golowich, Alexander Rakhlin, Ohad Shamir
2019· Information and Inference A Journal of the IMA104doi:10.1093/imaiai/iaz007

Abstract We study the sample complexity of learning neural networks by providing new bounds on their Rademacher complexity, assuming norm constraints on the parameter matrix of each layer. Compared to previous work, these complexity bounds have improved dependence on the network depth and, under some additional assumptions, are fully independent of the network size (both depth and width). These results are derived using some novel techniques, which may be of independent interest.

DiDi: Mitigating the Performance Impact of TLB Shootdowns Using a Shared TLB Directory
Carlos Villavieja, Vasileios Karakostas, Lluís Vilanova, Yoav Etsion +4 more
2011101doi:10.1109/pact.2011.65

Translation Look aside Buffers (TLBs) are ubiquitously used in modern architectures to cache virtual-to-physical mappings and, as they are looked up on every memory access, are paramount to performance scalability. The emergence of chip-multiprocessors (CMPs) with per-core TLBs, has brought the problem of TLB coherence to front stage. TLBs are kept coherent at the software-level by the operating system (OS). Whenever the OS modifies page permissions in a page table, it must initiate a coherency transaction among TLBs, a process known as a TLB shoot down. Current CMPs rely on the OS to approximate the set of TLBs caching a mapping and synchronize TLBs using costly Inter-Proceessor Interrupts (IPIs) and software handlers. In this paper, we characterize the impact of TLB shoot downs on multiprocessor performance and scalability, and present the design of a scalable TLB coherency mechanism. First, we show that both TLB shoot down cost and frequency increase with the number of processors and project that software-based TLB shoot downs would thwart the performance of large multiprocessors. We then present a scalable architectural mechanism that couples a shared TLB directory with load/store queue support for lightweight TLB invalidation, and thereby eliminates the need for costly IPIs. Finally, we show that the proposed mechanism reduces the fraction of machine cycles wasted on TLB shoot downs by an order of magnitude.

Compact name-independent routing with minimum stretch
Ittai Abraham, Cyril Gavoille, Dahlia Malkhi, Noam Nisan +1 more
2008· ACM Transactions on Algorithms98doi:10.1145/1367064.1367077

Given a weighted undirected network with arbitrary node names, we present a compact routing scheme, using a Õ (√n,) space routing table at each node, and routing along paths of stretch 3, that is, at most thrice as long as the minimum cost paths. This is optimal in a very strong sense. It is known that no compact routing using o ( n ) space per node can route with stretch below 3. Also, it is known that any stretch below 5 requires Ω(√ n ,)space per node.

The Impact of Delay Announcements on Hospital Network Coordination and Waiting Times
Jing Dong, Elad Yom‐Tov, Johan S. H. van Leeuwaarden
2018· Management Science94doi:10.1287/mnsc.2018.3048

We investigate the impact of delay announcements on the coordination within hospital networks using a combination of empirical observations and numerical experiments. We show that patients take delay information into account when choosing emergency service providers and that such information can help increase coordination in the network, leading to improvements in performance of the network, as measured by Emergency Department wait times. Our numerical results indicate that the level of coordination that can be achieved is limited by the patients ’ sensitivity to waiting, the load of the system, the heterogeneity among hospitals, and, importantly, the method hospital use to estimate delays. We show that delay estimators that are based on historical average may cause oscillation in the system and lead to higher average waiting times when patients are sensitive to delay. We provide empirical evidence which suggests that such oscillations occurs in hospital networks in the US.

Many-Core vs. Many-Thread Machines: Stay Away From the Valley
Zvika Guz, Evgeny Bolotin, Idit Keidar, Avinoam Kolodny +2 more
2009· IEEE Computer Architecture Letters94doi:10.1109/l-ca.2009.4

We study the tradeoffs between many-core machines like Intelpsilas Larrabee and many-thread machines like Nvidia and AMD GPGPUs. We define a unified model describing a superposition of the two architectures, and use it to identify operation zones for which each machine is more suitable. Moreover, we identify an intermediate zone in which both machines deliver inferior performance. We study the shape of this ldquoperformance valleyrdquo and provide insights on how it can be avoided.

Mechanisms for multi-level marketing
Yuval Emek, Ron Karidi, Moshe Tennenholtz, Aviv Zohar
201193doi:10.1145/1993574.1993606

Multi-level marketing is a marketing approach that motivates its participants to promote a certain product among their friends. The popularity of this approach increases due to the accessibility of modern social networks, however, it existed in one form or the other long before the Internet age began (the infamous Pyramid scheme that dates back at least a century is in fact a special case of multi-level marketing). This paper lays foundations for the study of reward mechanisms in multi-level marketing within social networks. We provide a set of desired properties for such mechanisms and show that they are uniquely satisfied by geometric reward mechanisms. The resilience of mechanisms to false-name manipulations is also considered; while geometric reward mechanisms fail against such manipulations, we exhibit other mechanisms which are false-name-proof.

Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
Oded Ovadia, Menachem Brief, Moshik Mishaeli, Oren Elisha
202490doi:10.18653/v1/2024.emnlp-main.15

Large language models (LLMs) encapsulate a vast amount of factual information within their pre-trained weights, as evidenced by their ability to answer diverse questions across different domains.However, this knowledge is inherently limited, relying heavily on the characteristics of the training data.Consequently, using external datasets to incorporate new information or refine the capabilities of LLMs on previously seen information poses a significant challenge.In this study, we compare two common approaches: unsupervised fine-tuning and retrieval-augmented generation (RAG).We evaluate both approaches on a variety of knowledge-intensive tasks across different topics.Our findings reveal that while unsupervised fine-tuning offers some improvement, RAG consistently outperforms it, both for existing knowledge encountered during training and entirely new knowledge.Moreover, we find that LLMs struggle to learn new factual information through unsupervised fine-tuning, and that exposing them to numerous variations of the same fact during training could alleviate this problem.