Apple (Germany)
Company, Munich, Germany
Research output, citation impact, and the most-cited recent papers from Apple (Germany). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Apple (Germany)
Bilateral filtering smooths images while preserving edges, by means of a nonlinear combination of nearby image values. The method is noniterative, local, and simple. It combines gray levels or colors based on both their geometric closeness and their photometric similarity, and prefers near values to distant values in both domain and range. In contrast with filters that operate on the three bands of a color image separately, a bilateral filter can enforce the perceptual metric underlying the CIE-Lab color space, and smooth colors and preserve edges in a way that is tuned to human perception. Also, in contrast with standard filtering, bilateral filtering produces no phantom colors along edges in color images, and reduces phantom colors where they appear in the original image.
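The weighting described in this abstract can be illustrated with a minimal sketch on a 1D signal of gray levels. The Gaussian weight functions, the parameter names (radius, sigma_domain, sigma_range), and the sample data are assumptions made for illustration; the abstract itself only states that values are combined by geometric closeness and photometric similarity.

```python
import math

def bilateral_filter_1d(signal, radius=2, sigma_domain=1.0, sigma_range=10.0):
    """Smooth a 1D list of gray levels while keeping large steps (edges) sharp."""
    out = []
    n = len(signal)
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Domain weight: geometric closeness of positions i and j.
            w_domain = math.exp(-((i - j) ** 2) / (2 * sigma_domain ** 2))
            # Range weight: photometric similarity of the two values.
            w_range = math.exp(-((signal[i] - signal[j]) ** 2) / (2 * sigma_range ** 2))
            w = w_domain * w_range
            weighted_sum += w * signal[j]
            weight_total += w
        # weight_total is never zero because the pixel itself has weight 1.
        out.append(weighted_sum / weight_total)
    return out

# Hypothetical noisy step edge: a dark region followed by a bright region.
noisy_step = [10, 12, 9, 11, 10, 200, 198, 201, 199, 200]
smoothed = bilateral_filter_1d(noisy_step)
```

Because the range weight is close to zero for values on the other side of the step, each side is averaged only with its own neighbors, so the noise is reduced while the edge stays in place.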
Traditionally, virtual reality systems use 3D computer graphics to model and render virtual environments in real-time. This approach usually requires laborious modeling and expensive special purpose rendering hardware. The rendering quality and scene complexity are often limited because of the real-time constraint. This paper presents a new approach which uses 360-degree cylindrical panoramic images to compose a virtual environment. The panoramic image is digitally warped on-the-fly to simulate camera panning and zooming. The panoramic images can be created with computer rendering, specialized panoramic cameras or by "stitching" together overlapping photographs taken with a regular camera. Walking in a space is currently accomplished by "hopping" to different panoramic points. The image-based approach has been used in the commercial product QuickTime VR, a virtual reality extension to Apple Computer's QuickTime digital multimedia framework. The paper describes the architecture, the file format, the authoring process and the interactive players of the VR system. In addition to panoramic viewing, the system includes viewing of an object from different directions and hit-testing through orientation-independent hot spots.
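The on-the-fly warping mentioned in this abstract can be sketched with standard cylindrical projection geometry. This is not the QuickTime VR implementation: the function name, the parameters, and the convention that the panorama spans exactly 360 degrees across its width are assumptions made for illustration.

```python
import math

def view_pixel_to_panorama(x, y, pan, focal_length, pano_width, pano_height):
    """Map a pixel of a planar view to coordinates in a cylindrical panorama.

    x, y are pixel offsets from the view center, pan is the camera pan angle
    in radians, and focal_length (in pixels) controls the zoom level.
    """
    # Assumed convention: the panorama width covers a full 360-degree turn.
    radius = pano_width / (2 * math.pi)
    # Horizontal angle of the viewing ray, wrapped into [0, 2*pi).
    azimuth = (pan + math.atan2(x, focal_length)) % (2 * math.pi)
    u = azimuth / (2 * math.pi) * pano_width
    # Height at which the ray meets the cylinder, measured from the mid row.
    v = pano_height / 2 + y * radius / math.hypot(x, focal_length)
    return u, v

center = view_pixel_to_panorama(0, 0, 0.0, 500, 3600, 800)
right_edge = view_pixel_to_panorama(500, 0, 0.0, 500, 3600, 800)
turned_around = view_pixel_to_panorama(0, 0, math.pi, 500, 3600, 800)
```

In this sketch, panning changes only the pan argument and zooming changes only focal_length, so a new view can be resampled from the same stored panorama.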
In this article we propose a theoretical framework of distributed representations and a methodology of representational analysis for the study of distributed cognitive tasks, that is, tasks that require the processing of information distributed across the internal mind and the external environment. The basic principle of distributed representations is that the representational system of a distributed cognitive task is a set of internal and external representations, which together represent the abstract structure of the task. The basic strategy of representational analysis is to decompose the representation of a hierarchical task into its component levels so that the representational properties at each level can be independently examined. The theoretical framework and the methodology are used to analyze the hierarchical structure of the Tower of Hanoi problem. Based on this analysis, four experiments are designed to examine the representational properties of the Tower of Hanoi. Finally, the nature of external representations is discussed.
Simulating Computer provides an introduction to simulation for computer and communication-system designers who want to analyze the performance of their designs. In it MacDougall describes a discrete-event simulation language called smpl, discusses simulation modeling with smpl (using a variety of models as examples), describes the design of smpl, and presents a C language implementation. The book's first part introduces smpl simulation operations using a queueing network simulation model; addresses the development, verification, and validation of simulation models (including hybrid modeling and the use of analytic models in verification); and describes how to estimate the accuracy of simulation results. A multiprocessor system model and a CSMA/CD LAN model are studied in detail to emphasize the joint use of simulation and analytic models and to further illustrate the use of smpl. Projects for the reader include a CPU pipeline model and a token ring LAN model. The implementation of smpl is the focus of the book's second part, which describes the design of smpl function and data structures and outlines a variety of extensions. This description, together with the C source listing provided, will allow the reader to implement smpl on any system. M. H. MacDougall is with the Advanced Computer Group at Apple Computer, where he is involved in the design of high-performance personal computers for the 1990s. Simulating Computer is included in the Computer Systems series, edited by Herb Schwetman.
A description is given of the Xerox 8010 Star information system, which was designed as an office automation system. The idea was that professionals in a business or organization would have workstations on their desks and would use them to produce, retrieve, distribute, and organize documentation, presentations, memos, and reports. All of the workstations in an organization would be connected via Ethernet and would share access to file servers, printers, etc. The distinctive features of Star are identified, and changes to the original design are examined. A history of Star development is included. Some lessons learned from designing Star are related.
Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality. To interface a highly sparse LiDAR point cloud with a region proposal network (RPN), most existing efforts have focused on hand-crafted feature representations, for example, a bird's eye view projection. In this work, we remove the need for manual feature engineering for 3D point clouds and propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single-stage, end-to-end trainable deep network. Specifically, VoxelNet divides a point cloud into equally spaced 3D voxels and transforms a group of points within each voxel into a unified feature representation through the newly introduced voxel feature encoding (VFE) layer. In this way, the point cloud is encoded as a descriptive volumetric representation, which is then connected to an RPN to generate detections. Experiments on the KITTI car detection benchmark show that VoxelNet outperforms the state-of-the-art LiDAR-based 3D detection methods by a large margin. Furthermore, our network learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists, based on only LiDAR.
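The first step described in this abstract, dividing a point cloud into equally spaced 3D voxels and grouping the points inside each one, can be sketched as follows. The function names, the voxel size, and the sample points are hypothetical. The centroid is only a hand-written placeholder for a per-voxel feature; it is not the learned VFE layer, whose details the abstract does not give.

```python
import math
from collections import defaultdict

def group_points_into_voxels(points, voxel_size):
    """Group (x, y, z) points by the integer index of the voxel containing them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        # floor() keeps negative coordinates in the correct voxel.
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        voxels[key].append((x, y, z))
    return dict(voxels)

def voxel_centroids(voxels):
    """Placeholder per-voxel feature: the centroid (NOT the learned VFE layer)."""
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in voxels.items()
    }

# Hypothetical toy point cloud.
points = [(0.1, 0.2, 0.0), (0.3, 0.1, 0.2), (1.2, 0.1, 0.0), (-0.1, 0.0, 0.0)]
voxels = group_points_into_voxels(points, voxel_size=0.5)
centroids = voxel_centroids(voxels)
```

Only occupied voxels are stored, which suits the highly sparse LiDAR point clouds the abstract refers to.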
It is difficult to introduce both novice and experienced procedural programmers to the anthropomorphic perspective necessary for object-oriented design. We introduce CRC cards, which characterize objects by class name, responsibilities, and collaborators, as a way of giving learners a direct experience of objects. We have found this approach successful in teaching novice programmers the concepts of objects, and in introducing experienced programmers to complicated existing designs.
It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model. We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model's training data. These extracted examples include (public) personally identifiable information (names, phone numbers, and email addresses), IRC conversations, code, and 128-bit UUIDs. Our attack is possible even though each of the above sequences is included in just one document in the training data. We comprehensively evaluate our extraction attack to understand the factors that contribute to its success. Worryingly, we find that larger models are more vulnerable than smaller models. We conclude by drawing lessons and discussing possible safeguards for training large language models.
Distributional shift is one of the major obstacles when transferring machine learning prediction systems from the lab to the real world. To tackle this problem, we assume that variation across training domains is representative of the variation we might encounter at test time, but also that shifts at test time may be more extreme in magnitude. In particular, we show that reducing differences in risk across training domains can reduce a model's sensitivity to a wide range of extreme distributional shifts, including the challenging setting where the input contains both causal and anti-causal elements. We motivate this approach, Risk Extrapolation (REx), as a form of robust optimization over a perturbation set of extrapolated domains (MM-REx), and propose a penalty on the variance of training risks (V-REx) as a simpler variant. We prove that variants of REx can recover the causal mechanisms of the targets, while also providing some robustness to changes in the input distribution ("covariate shift"). By appropriately trading-off robustness to causally induced distributional shifts and covariate shift, REx is able to outperform alternative methods such as Invariant Risk Minimization in situations where these types of shift co-occur.
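The variance penalty named in this abstract (V-REx) can be illustrated with a small numeric sketch. The abstract only states that the penalty acts on the variance of training risks; combining it with the summed risks through a weight beta, the function name, and the example risk values are assumptions made here for illustration.

```python
from statistics import pvariance

def vrex_objective(domain_risks, beta=1.0):
    """Total training risk plus a penalty on the variance of per-domain risks."""
    total_risk = sum(domain_risks)
    # Population variance of the risks across training domains.
    penalty = pvariance(domain_risks)
    return total_risk + beta * penalty

# Two hypothetical models with the same total risk across three domains.
balanced = vrex_objective([0.5, 0.5, 0.5], beta=10.0)
unbalanced = vrex_objective([0.2, 0.5, 0.8], beta=10.0)
```

Both sets of risks sum to 1.5, but the second is penalized for performing unevenly across domains, which is the sense in which reducing differences in risk is rewarded.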
There is a large literature explaining why AdaBoost is a successful classifier. The literature on AdaBoost focuses on classifier margins and boosting's interpretation as the optimization of an exponential likelihood function. These existing explanations, however, have been pointed out to be incomplete. A random forest is another popular ensemble method for which there is substantially less explanation in the literature. We introduce a novel perspective on AdaBoost and random forests that proposes that the two algorithms work for similar reasons. While both classifiers achieve similar predictive accuracy, random forests cannot be conceived as a direct optimization procedure. Rather, random forests is a self-averaging, interpolating algorithm which creates what we denote as a spiked-smooth classifier, and we view AdaBoost in the same light. We conjecture that both AdaBoost and random forests succeed because of this mechanism. We provide a number of examples to support this explanation. In the process, we question the conventional wisdom that suggests that boosting algorithms for classification require regularization or early stopping and should be limited to low complexity classes of learners, such as decision stumps. We conclude that boosting should be used like random forests: with large decision trees, without regularization or early stopping.
With recent progress in graphics, it has become more tractable to train models on synthetic images, potentially avoiding the need for expensive annotations. However, learning from synthetic images may not achieve the desired performance due to a gap between synthetic and real image distributions. To reduce this gap, we propose Simulated+Unsupervised (S+U) learning, where the task is to learn a model to improve the realism of a simulator's output using unlabeled real data, while preserving the annotation information from the simulator. We develop a method for S+U learning that uses an adversarial network similar to Generative Adversarial Networks (GANs), but with synthetic images as inputs instead of random vectors. We make several key modifications to the standard GAN algorithm to preserve annotations, avoid artifacts, and stabilize training: (i) a 'self-regularization' term, (ii) a local adversarial loss, and (iii) updating the discriminator using a history of refined images. We show that this enables generation of highly realistic images, which we demonstrate both qualitatively and with a user study. We quantitatively evaluate the generated images by training models for gaze estimation and hand pose estimation. We show a significant improvement over using synthetic images, and achieve state-of-the-art results on the MPIIGaze dataset without any labeled real data.
Most ideas come from previous ideas. The sixties, particularly in the ARPA community, gave rise to a host of notions about “human-computer symbiosis” through interactive time-shared computers, graphics screens and pointing devices. Advanced computer languages were invented to simulate complex systems such as oil refineries and semi-intelligent behavior. The soon to follow paradigm shift of modern personal computing, overlapping window interfaces, and object-oriented design came from seeing the work of the sixties as something more than a “better old thing”. That is, more than a better way: to do mainframe computing; for end-users to invoke functionality; to make data structures more abstract. Instead the promise of exponential growth in computing/$/volume demanded that the sixties be regarded as “almost a new thing” and to find out what the actual “new things” might be. For example, one would compute with a handheld “Dynabook” in a way that would not be possible on a shared mainframe; millions of potential users meant that the user interface would have to become a learning environment along the lines of Montessori and Bruner; and needs for large scope, reduction in complexity, and end-user literacy would require that data and control structures be done away with in favor of a more biological scheme of protected universal cells interacting only through messages that could mimic any desired behavior. Early Smalltalk was the first complete realization of these new points of view as parented by its many predecessors in hardware, language and user interface design. It became the exemplar of the new computing, in part, because we were actually trying for a qualitative shift in belief structures, a new Kuhnian paradigm in the same spirit as the invention of the printing press, and thus took highly extreme positions which almost forced these new styles to be invented.
Jianmo Ni, Chen Qu, Jing Lu, Zhuyun Dai, Gustavo Hernandez Abrego, Ji Ma, Vincent Zhao, Yi Luan, Keith Hall, Ming-Wei Chang, Yinfei Yang. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 2022.
This study examined smartphone user behaviors and their relation to self-reported smartphone addiction. Thirty-four users who did not own smartphones were given instrumented iPhones that logged all phone use over the course of the year-long study. At the conclusion of the study, users were asked to rate their level of addiction to the device. Sixty-two percent agreed or strongly agreed that they were addicted to their iPhones. These users showed differentiated smartphone use as compared to those users who did not indicate an addiction. Addicted users spent twice as much time on their phone and launched applications much more frequently (nearly twice as often) as compared to the non-addicted user. Mail, Messaging, Facebook and the Web drove this use. Surprisingly, games did not show any difference between addicted and non-addicted users. Addicted users showed significantly lower time-per-interaction than did non-addicted users for Mail, Facebook and Messaging applications. One addicted user reported that his addiction was problematic, and his use data was beyond three standard deviations from the upper hinge. This study provides unique insight, as use data in previous studies has been self-reported, and this study used a naturalistic, non-intervention data logging approach over the course of a year.
An algorithm for creating smooth spline surfaces over irregular meshes is presented. The algorithm is a generalization of quadratic B-splines; that is, if a mesh is (locally) regular, the resulting surface is equivalent to a B-spline. Otherwise, the resulting surface has a degree 3 or 4 parametric polynomial representation. A construction is given for representing the surface as a collection of tangent plane continuous triangular Bézier patches. The algorithm is simple, efficient, and generates aesthetically pleasing shapes.
Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes before providing answers. While these models demonstrate improved performance on reasoning benchmarks, their fundamental capabilities, scaling properties, and limitations remain insufficiently understood. Current evaluations primarily focus on established mathematical and coding benchmarks, emphasizing final answer accuracy. However, this evaluation paradigm often suffers from data contamination and does not provide insights into the reasoning traces’ structure and quality. In this work, we systematically investigate these gaps with the help of controllable puzzle environments that allow precise manipulation of compositional complexity while maintaining consistent logical structures. This setup enables the analysis of not only final answers but also the internal reasoning traces, offering insights into how LRMs “think”. Through extensive experimentation across diverse puzzles, we show that frontier LRMs face a complete accuracy collapse beyond certain complexities. Moreover, they exhibit a counterintuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget. By comparing LRMs with their standard LLM counterparts under equivalent inference compute, we identify three performance regimes: (1) low-complexity tasks where standard models surprisingly outperform LRMs, (2) medium-complexity tasks where additional thinking in LRMs demonstrates advantage, and (3) high-complexity tasks where both models experience complete collapse. We found that LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across puzzles. We also investigate the reasoning traces in more depth, studying the patterns of explored solutions and analyzing the models’ computational behavior, shedding light on their strengths, limitations, and ultimately raising crucial questions about their true reasoning capabilities.
INTRODUCTION: Accurate assessment of the corneal shape is important in cataract and refractive surgery, both in screening of candidates as well as for analyzing postoperative outcomes. Although corneal topography and tomography are widely used, it is common that these technologies are confused. The aim of this study was to present the current developments of these technologies and particularly distinguish between corneal topography and tomography. METHODS: The PubMed, Web of Science and Embase databases were the main resources used to investigate the medical literature. The following keywords were used in various combinations: cornea, corneal, topography, tomography, Scheimpflug, Pentacam, optical coherence tomography. RESULTS: Topography is the study of the shape of the corneal surface, while tomography allows a three-dimensional section of the cornea to be presented. Corneal topographers can be divided into large- and small-cone Placido-based devices, as well as devices with color-LEDs. For corneal tomography, scanning slit or Scheimpflug imaging and optical coherence tomography may be employed. In several devices, corneal topography and tomography have been successfully combined with tear-film analysis, aberrometry, optical biometry and anterior/posterior segment optical coherence tomography. CONCLUSION: There is a wide variety of imaging techniques to obtain corneal power maps. As different technologies are used, it is imperative that doctors involved in corneal surgery understand the science and clinical application of devices for corneal evaluation in depth.
PURPOSE: To optimize artificial intelligence (AI) algorithms to integrate Scheimpflug-based corneal tomography and biomechanics to enhance ectasia detection. DESIGN: Multicenter cross-sectional case-control retrospective study. METHODS: A total of 3886 unoperated eyes from 3412 patients had Pentacam and Corvis ST (Oculus Optikgeräte GmbH) examinations. The database included 1 eye randomly selected from 1680 normal patients (N) and from 1181 "bilateral" keratoconus (KC) patients, along with 551 normal topography eyes from patients with very asymmetric ectasia (VAE-NT), and their 474 unoperated ectatic (VAE-E) eyes. The current TBIv1 (tomographic-biomechanical index) was tested, and an optimized AI algorithm was developed for augmenting accuracy. RESULTS: The area under the receiver operating characteristic curve (AUC) of the TBIv1 for discriminating clinical ectasia (KC and VAE-E) was 0.999 (98.5% sensitivity; 98.6% specificity [cutoff: 0.5]), and for VAE-NT, 0.899 (76% sensitivity; 89.1% specificity [cutoff: 0.29]). A novel random forest algorithm (TBIv2), developed with 18 features in 156 trees using 10-fold cross-validation, had a significantly higher AUC (0.945; DeLong, P < .0001) for detecting VAE-NT (84.4% sensitivity and 90.1% specificity; cutoff: 0.43; DeLong, P < .0001) and a similar AUC for clinical ectasia (0.999; DeLong, P = .818; 98.7% sensitivity; 99.2% specificity [cutoff: 0.8]). Considering all cases, the TBIv2 had a higher AUC (0.985) than TBIv1 (0.974; DeLong, P < .0001). CONCLUSIONS: AI optimization to integrate Scheimpflug-based corneal tomography and biomechanical assessments augments accuracy for ectasia detection, characterizing ectasia susceptibility in the diverse VAE-NT group. Some patients with VAE may have true unilateral ectasia. Machine learning considering additional data, including epithelial thickness or other parameters from multimodal refractive imaging, will continuously enhance accuracy. 
NOTE: Publication of this article is sponsored by the American Ophthalmological Society.