IBM (Canada)
Company · Markham, Ontario, Canada
Research output, citation impact, and the most-cited recent papers from IBM (Canada). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from IBM (Canada)
It is now well established that the device scaling predicted by Moore's Law is no longer a viable option for increasing the clock frequency of future uniprocessor systems at the rate that had been sustained during the last two decades. As a result, future systems are rapidly moving from uniprocessor to multiprocessor configurations, so as to use parallelism instead of frequency scaling as the foundation for increased compute capacity. The dominant emerging multiprocessor structure for the future is a Non-Uniform Cluster Computing (NUCC) system with nodes that are built out of multi-core SMP chips with non-uniform memory hierarchies, and interconnected in horizontally scalable cluster configurations such as blade servers. Unlike previous generations of hardware evolution, this shift will have a major impact on existing software. Current OO language facilities for concurrent and distributed programming are inadequate for addressing the needs of NUCC systems because they do not support the notions of non-uniform data access within a node, or of tight coupling of distributed nodes. We have designed a modern object-oriented programming language, X10, for high performance, high productivity programming of NUCC systems. A member of the partitioned global address space family of languages, X10 highlights the explicit reification of locality in the form of places; lightweight activities embodied in async, future, foreach, and ateach constructs; a construct for termination detection (finish); the use of lock-free synchronization (atomic blocks); and the manipulation of cluster-wide global data structures. We present an overview of the X10 programming model and language, experience with our reference implementation, and results from some initial productivity comparisons between the X10 and Java™ languages.
Resampling methods are commonly used for dealing with the class‐imbalance problem. Their advantage over other methods is that they are external and thus, easily transportable. Although such approaches can be very simple to implement, tuning them most effectively is not an easy task. In particular, it is unclear whether oversampling is more effective than undersampling and which oversampling or undersampling rate should be used. This paper presents an experimental study of these questions and concludes that combining different expressions of the resampling approach is an effective solution to the tuning problem. The proposed combination scheme is evaluated on imbalanced subsets of the Reuters‐21578 text collection and is shown to be quite effective for these problems.
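As a rough illustration of the two tuning knobs this abstract refers to, the sketch below applies random oversampling of the minority class and random undersampling of the majority class to a list of labels. The function name `resample` and the parameters `over_rate` and `under_rate` are hypothetical; this is not the paper's combination scheme, only a minimal picture of what an oversampling or undersampling rate controls.

```python
import random
from collections import Counter

def resample(labels, minority, over_rate=1.0, under_rate=1.0, seed=0):
    """Return the indices kept after random over- and undersampling.

    over_rate  : final minority size as a multiple of its original size (>= 1).
    under_rate : fraction of majority examples to keep (0 < under_rate <= 1).
    Both parameters are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]

    # Oversampling: duplicate randomly chosen minority examples.
    n_extra = int(round((over_rate - 1.0) * len(minority_idx)))
    extra = [rng.choice(minority_idx) for _ in range(n_extra)]

    # Undersampling: keep a random subset of the majority examples.
    n_keep = int(round(under_rate * len(majority_idx)))
    kept = rng.sample(majority_idx, n_keep)

    return sorted(minority_idx + extra + kept)

# A 90/10 imbalanced toy label set.
labels = ["neg"] * 90 + ["pos"] * 10
idx = resample(labels, "pos", over_rate=3.0, under_rate=0.5)
counts = Counter(labels[i] for i in idx)
print(counts["pos"], counts["neg"])
```

Because the resampling happens on the data rather than inside the learner, the same indices can be fed to any classifier, which is the "external and easily transportable" property the abstract mentions.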
This paper reports the results of a recent survey of user-centered design (UCD) practitioners. The survey involved over a hundred respondents who were CHI'2000 attendees or current UPA members. The paper identifies the most widely used methods and processes, the key factors that predict success, and the critical tradeoffs practitioners must make in applying UCD methods and processes. Results show that cost-benefit tradeoffs are a key consideration in the adoption of UCD methods. Measures of UCD effectiveness are lacking and rarely applied. There is also a major discrepancy between the commonly cited measures and the actually applied ones. These results have implications for the introduction, deployment, and execution of UCD projects.
User-Centered Design (UCD) is a multidisciplinary design approach based on the active involvement of users to improve the understanding of user and task requirements, and the iteration of design and evaluation. It is widely considered the key to product usefulness and usability, an effective approach to overcoming the limitations of traditional system-centered design. Much has been written in the research literature about UCD. As further proof of internationally endorsed best practice, UCD processes are also defined in ISO documents, including ISO 13407 and the associated technical report, ISO TR 18529. Increasingly, UCD has become part of the cultural vernacular of the executives and managers who drive technology development in companies of all sizes.
Similarity or distance measures are core components used by distance-based clustering algorithms to cluster similar data points into the same clusters, while dissimilar or distant data points are placed into different clusters. The performance of similarity measures is mostly addressed in two or three-dimensional spaces, beyond which, to the best of our knowledge, there is no empirical study that has revealed the behavior of similarity measures when dealing with high-dimensional datasets. To fill this gap, a technical framework is proposed in this study to analyze, compare and benchmark the influence of different similarity measures on the results of distance-based clustering algorithms. For reproducibility purposes, fifteen publicly available datasets were used for this study, and consequently, future distance measures can be evaluated and compared with the results of the measures discussed in this work. These datasets were classified as low and high-dimensional categories to study the performance of each measure against each category. This research should help the research community to identify suitable distance measures for datasets and also to facilitate a comparison and evaluation of the newly proposed similarity or distance measures with traditional ones.
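To make concrete how the choice of measure can change a clustering result, the sketch below assigns one point to the nearest of two centroids under three common measures. The abstract does not name the measures studied, so Euclidean, Manhattan, and cosine distance are assumptions chosen for illustration, and `nearest_centroid` is a hypothetical helper rather than part of the paper's framework.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - cosine similarity; depends only on direction, not magnitude.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest_centroid(point, centroids, dist):
    """Index of the centroid closest to `point` under the measure `dist`."""
    return min(range(len(centroids)), key=lambda k: dist(point, centroids[k]))

point = (3.0, 3.0)
centroids = [(10.0, 10.0), (2.0, 0.0)]

by_euclidean = nearest_centroid(point, centroids, euclidean)
by_manhattan = nearest_centroid(point, centroids, manhattan)
by_cosine = nearest_centroid(point, centroids, cosine_distance)
print(by_euclidean, by_manhattan, by_cosine)
```

Here the two magnitude-based measures pick the nearby centroid, while cosine distance picks the far one because it points in the same direction. The assignment step of a distance-based clustering algorithm inherits exactly this sensitivity.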
OpenMP has been very successful in exploiting structured parallelism in applications. With increasing application complexity, there is a growing need for addressing irregular parallelism in the presence of complicated control structures. This is evident in various efforts by the industry and research communities to provide a solution to this challenging problem. One of the primary goals of OpenMP 3.0 is to define a standard dialect to express and efficiently exploit unstructured parallelism. This paper presents the design of the OpenMP tasking model by members of the OpenMP 3.0 tasking sub-committee which was formed for this purpose. The paper summarizes the efforts of the sub-committee (spanning over two years) in designing, evaluating and seamlessly integrating the tasking model into the OpenMP specification. In this paper, we present the design goals and key features of the tasking model, including a rich set of examples and an in-depth discussion of the rationale behind various design choices. We compare a prototype implementation of the tasking model with existing models, and evaluate it on a wide range of applications. The comparison shows that the OpenMP tasking model provides expressiveness, flexibility, and huge potential for performance and scalability.
Several geometric active contour models have been proposed for segmentation in computer vision and image analysis. The essential idea is to evolve a curve (in 2D) or a surface (in 3D) under constraints from image forces so that it clings to features of interest in an intensity image. Recent variations on this theme take into account properties of enclosed regions and allow for multiple curves or surfaces to be simultaneously represented. However, it is still unclear how to apply these techniques to images of narrow elongated structures, such as blood vessels, where intensity contrast may be low and reliable region statistics cannot be computed. To address this problem, we derive the gradient flows which maximize the rate of increase of flux of an appropriate vector field through a curve (in 2D) or a surface (in 3D). The key idea is to exploit the direction of the vector field along with its magnitude. The calculations lead to a simple and elegant interpretation which is essentially parameter free and has the same form in both dimensions. We illustrate its advantages with several level-set-based segmentations of 2D and 3D angiography images of blood vessels.
Although numerous studies have measured the strength of visual grouping cues for controlled psychophysical stimuli, little is known about the statistical utility of these various cues for natural images. In this study, we conducted experiments in which human participants trace perceived contours in natural images. These contours are automatically mapped to sequences of discrete tangent elements detected in the image. By examining relational properties between pairs of successive tangents on these traced curves, and between randomly selected pairs of tangents, we are able to estimate the likelihood distributions required to construct an optimal Bayesian model for contour grouping. We employed this novel methodology to investigate the inferential power of three classical Gestalt cues for contour grouping: proximity, good continuation, and luminance similarity. The study yielded a number of important results: (1) these cues, when appropriately defined, are approximately uncorrelated, suggesting a simple factorial model for statistical inference; (2) moderate image-to-image variation of the statistics indicates the utility of general probabilistic models for perceptual organization; (3) these cues differ greatly in their inferential power, proximity being by far the most powerful; and (4) statistical modeling of the proximity cue indicates a scale-invariant power law in close agreement with prior psychophysics.
In the current business environment in which companies are under increasing pressure not only to increase revenue but also to respond quickly to changing market conditions, companies will be successful only if they transform themselves and become on demand businesses. In this paper we describe the changes needed to effect this transformation, and in particular, we describe the important role played by componentization and by service orientation. We discuss the way componentization enables a business to operate in a value net, a network of partnerships with customers and suppliers supported by real-time information flows and information technology systems. We also describe the need for service orientation to achieve seamless integration of business components. We illustrate these ideas with a case study from the rental car business. Finally, we describe IBM activities in this area and the resulting methods and tools that help businesses deal with these challenges.
This paper introduces the concept of letting an RDBMS optimizer optimize its own environment. In our project, we have used the DB2 optimizer to tackle the index selection problem, a variation of the knapsack problem. This paper discusses our implementation of index recommendation and the user interface, and provides measurements on the quality of the recommended indexes.
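The knapsack framing can be sketched as follows: each candidate index has an estimated benefit and a storage cost, and the goal is to pick a set that fits a disk budget. The greedy benefit-per-size heuristic below is a standard approximation for knapsack problems and is not claimed to be the method used in DB2; the index names, benefit figures, and sizes are invented for illustration.

```python
def recommend_indexes(candidates, budget):
    """Greedy knapsack heuristic over (name, benefit, size) tuples.

    Candidates are considered in decreasing benefit-per-size order and
    added while they still fit in the remaining storage budget.
    """
    chosen, used = [], 0
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    for name, benefit, size in ranked:
        if used + size <= budget:
            chosen.append(name)
            used += size
    return chosen, used

# Hypothetical candidates: (index name, estimated benefit, size in MB).
candidates = [
    ("idx_orders_date", 120, 40),
    ("idx_cust_region", 60, 10),
    ("idx_items_sku", 90, 45),
    ("idx_orders_cust", 30, 20),
]
chosen, used = recommend_indexes(candidates, budget=70)
print(chosen, used)
```

In a real advisor the benefit of each candidate would come from the optimizer's own cost estimates, which is the sense in which the optimizer "optimizes its own environment".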
In this paper we describe a framework for providing customers of Web services differentiated levels of service through the use of automated management and service level agreements (SLAs). The framework comprises the Web Service Level Agreement (WSLA) language, designed to specify SLAs in a flexible and individualized way, a system to provision resources based on service level objectives, a workload management system that prioritizes requests according to the associated SLAs, and a system to monitor compliance with the SLA. This framework was implemented as the utility computing services part of the IBM Emerging Technologies Tool Kit, which is publicly available on the IBM alphaWorks™ Web site.
Design patterns raise the abstraction level at which people design and communicate design of object-oriented software. However, the mechanics of implementing design patterns is left to the programmer. This paper describes the architecture and implementation of a tool that automates the implementation of design patterns. The user of the tool supplies application-specific information for a given pattern, from which the tool generates all the pattern-prescribed code automatically. The tool has a distributed architecture that lends itself to implementation with off-the-shelf components.
Established in 2005, YouTube has become the most successful Internet website providing a new generation of short video sharing service. Today, YouTube alone consumes as much bandwidth as did the entire Internet in the year 2000. Understanding the features of YouTube and similar video sharing sites is thus crucial to their sustainable development and to network traffic engineering. In this paper, using traces crawled in a 1.5-year span (from February 2007 to September 2008), we present an in-depth and systematic measurement study on the characteristics of YouTube videos. We find that YouTube videos have noticeably different statistics compared to traditional streaming videos, ranging from length, access pattern, to their active life span. The series of datasets also allow us to identify the growth trend of this fast evolving Internet site, which has seldom been explored before. We also look closely at the social networking aspect of YouTube, as this is a key driving force toward its success. In particular, we find that the links to related videos generated by uploaders' choices form a small-world network. This suggests that the videos have strong correlations with each other, and creates opportunities for developing novel caching and peer-to-peer distribution schemes to efficiently deliver videos to end users.
Virtually every commercial query optimizer chooses the best plan for a query using a cost model that relies heavily on accurate cardinality estimation. Cardinality estimation errors can occur due to the use of inaccurate statistics, invalid assumptions about attribute independence, parameter markers, and so on. Cardinality estimation errors may cause the optimizer to choose a sub-optimal plan. We present an approach to query processing that is extremely robust because it is able to detect and recover from cardinality estimation errors. We call this approach "progressive query optimization" (POP). POP validates cardinality estimates against actual values as measured during query execution. If there is significant disagreement between estimated and actual values, execution might be stopped and re-optimization might occur. Oscillation between optimization and execution steps can occur any number of times. A re-optimization step can exploit both the actual cardinality and partial results, computed during a previous execution step. Checkpoint operators (CHECK) validate the optimizer's cardinality estimates against actual cardinalities. Each CHECK has a condition that indicates the cardinality bounds within which a plan is valid. We compute this validity range through a novel sensitivity analysis of query plan operators. If the CHECK condition is violated, CHECK triggers re-optimization. POP has been prototyped in a leading commercial DBMS. An experimental evaluation of POP using TPC-H queries illustrates the robustness POP adds to query processing, while incurring only negligible overhead. A case-study applying POP to a real-world database and workload shows the potential of POP, accelerating complex OLAP queries by almost two orders of magnitude.
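The CHECK mechanism described above can be pictured with a toy model: a checkpoint compares the actual cardinality with the validity range of the current plan and signals re-optimization when the range is violated. Everything in this sketch is hypothetical (the `Reoptimize` exception, the two-plan cost model, and the threshold of 100 rows), and it omits the reuse of partial results and the sensitivity analysis that derives the validity range.

```python
class Reoptimize(Exception):
    """Raised by a checkpoint when the actual cardinality is out of range."""
    def __init__(self, actual_cardinality):
        super().__init__(f"cardinality {actual_cardinality} outside validity range")
        self.actual_cardinality = actual_cardinality

def check(rows, lower, upper):
    """Checkpoint: pass rows through only if their count is within bounds."""
    actual = len(rows)
    if not (lower <= actual <= upper):
        raise Reoptimize(actual)
    return rows

def choose_plan(cardinality):
    # Toy cost model (an assumption): small inputs favor a nested-loop join.
    return "nested_loop_join" if cardinality <= 100 else "hash_join"

def run_query(rows, estimated_cardinality, validity_range):
    plan = choose_plan(estimated_cardinality)
    try:
        check(rows, *validity_range)
    except Reoptimize as signal:
        # Re-plan using the cardinality observed during execution.
        plan = choose_plan(signal.actual_cardinality)
    return plan

plan_when_wrong = run_query(list(range(5000)), 50, (0, 100))
plan_when_right = run_query(list(range(40)), 50, (0, 100))
print(plan_when_wrong, plan_when_right)
```

When the estimate falls inside the validity range the original plan runs to completion, so the checkpoint costs only a row count, which matches the low-overhead behavior the abstract reports.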
UML includes special extensibility mechanisms, which are used to define domain-specific modeling languages that are based on UML. These mechanisms have been significantly improved in the latest versions of UML. Unfortunately, there is currently a dearth of published material on how to best exploit these capabilities and, consequently, many UML profiles are either invalid or of poor quality. In this paper, we first provide an overview of the new extensibility mechanisms of UML 2.1 and then describe a method for defining profiles that greatly increases the likelihood of producing technically correct, high-quality UML profiles.
A pathfinder version of CHIME (the Canadian Hydrogen Intensity Mapping Experiment) is currently being commissioned at the Dominion Radio Astrophysical Observatory (DRAO) in Penticton, BC. The instrument is a hybrid cylindrical interferometer designed to measure the large scale neutral hydrogen power spectrum across the redshift range 0.8 to 2.5. The power spectrum will be used to measure the baryon acoustic oscillation (BAO) scale across this poorly probed redshift range where dark energy becomes a significant contributor to the evolution of the Universe. The instrument revives the cylinder design in radio astronomy with a wide field survey as a primary goal. Modern low-noise amplifiers and digital processing remove the necessity for the analog beam forming that characterized previous designs. The Pathfinder consists of two cylinders 37m long by 20m wide oriented north-south for a total collecting area of 1,500 square meters. The cylinders are stationary with no moving parts, and form a transit instrument with an instantaneous field of view of ~100 degrees by 1-2 degrees. Each CHIME Pathfinder cylinder has a feedline with 64 dual polarization feeds placed every ~30 cm which Nyquist sample the north-south sky over much of the frequency band. The signals from each dual-polarization feed are independently amplified, filtered to 400-800 MHz, and directly sampled at 800 MSps using 8 bits. The correlator is an FX design, where the Fourier transform channelization is performed in FPGAs, which are interfaced to a set of GPUs that compute the correlation matrix. The CHIME Pathfinder is a 1/10th scale prototype version of CHIME and is designed to detect the BAO feature and constrain the distance-redshift relation. The lessons learned from its implementation will be used to inform and improve the final CHIME design.
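The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The input values below are taken from the abstract; the derived quantities rely on two standard facts, namely that a sample rate of fs covers a bandwidth of fs/2 per Nyquist zone and that an N-input correlation matrix has N(N+1)/2 independent entries including autocorrelations. Whether the Pathfinder correlator computes every such entry is an assumption here.

```python
# Values quoted in the abstract.
CYLINDERS = 2
LENGTH_M, WIDTH_M = 37, 20
FEEDS_PER_CYLINDER = 64
POLARIZATIONS = 2
SAMPLE_RATE_MSPS = 800
BAND_MHZ = (400, 800)

# Geometric collecting area; the abstract rounds this to 1,500 square meters.
collecting_area_m2 = CYLINDERS * LENGTH_M * WIDTH_M

# Each Nyquist zone spans half the sample rate. Zone k covers
# [(k - 1) * fs / 2, k * fs / 2], so the 400-800 MHz band is zone 2.
nyquist_bandwidth_mhz = SAMPLE_RATE_MSPS // 2
nyquist_zone = BAND_MHZ[0] // nyquist_bandwidth_mhz + 1
band_fits_zone = BAND_MHZ[1] <= nyquist_zone * nyquist_bandwidth_mhz

# Signal chains feeding the correlator, and independent matrix entries.
correlator_inputs = CYLINDERS * FEEDS_PER_CYLINDER * POLARIZATIONS
correlation_products = correlator_inputs * (correlator_inputs + 1) // 2

print(collecting_area_m2, nyquist_zone, correlator_inputs, correlation_products)
```

The band occupies exactly one Nyquist zone, which is why direct sampling at 800 MSps captures the full 400 MHz of bandwidth without a separate down-conversion stage.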
Legacy software systems are typically complex, geriatric, and difficult to change, having evolved over decades and having passed through many developers. Nevertheless, these systems are mature, heavily used, and constitute massive corporate assets. Migrating such systems to modern platforms is a significant challenge due to the loss of information over time. As a result, we embarked on a research project to design and implement an environment to support software migration. In particular, we focused on migrating legacy PL/I source code to C++, with an initial phase of looking at redocumentation strategies. Recent technologies such as reverse engineering tools and World Wide Web standards now make it possible to build tools that greatly simplify the process of redocumenting a legacy software system. In this paper we introduce the concept of a software bookshelf as a means to capture, organize, and manage information about a legacy software system. We distinguish three roles directly involved in the construction, population, and use of such a bookshelf: the builder, the librarian, and the patron. From these perspectives, we describe requirements for the bookshelf, as well as a generic architecture and a prototype implementation. We also discuss various parsing and analysis tools that were developed and integrated to assist in the recovery of useful information about a legacy system. In addition, we illustrate how a software bookshelf is populated with the information of a given software project and how the bookshelf can be used in a program-understanding scenario. Reported results are based on a pilot project that developed a prototype bookshelf for a software system consisting of approximately 300K lines of code written in a PL/I dialect.
Layered queues are a canonical form of extended queueing network for systems with nested multiple resource possession, in which successive depths of nesting define the layers. The model has been applied to most modern distributed systems, which use different kinds of client-server and master-slave relationships, and scales up well. The layered queueing network (LQN) model is described here in a unified fashion, including its many extensions to match the semantics of sophisticated practical distributed and parallel systems. These include efficient representation of replicated services, parallel and quorum execution, and dependability analysis under failure and reconfiguration. The full LQN model is defined here and its solver is described. A substantial case study of an air traffic control system shows errors (compared to simulation) of a few percent. The LQN model is compared to other models and solutions, and is shown to cover all their features.
This paper describes an end-to-end system implementation of the transactional memory (TM) programming model on top of the hardware transactional memory (HTM) of the Blue Gene/Q (BG/Q) machine. The TM programming model supports most C/C++ programming constructs on top of a best-effort HTM with the help of a complete software stack including the compiler, the kernel, and the TM runtime.