Facility for Antiproton and Ion Research
Facility, Darmstadt, Hesse, Germany
Research output, citation impact, and the most-cited recent papers from Facility for Antiproton and Ion Research (Germany). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Facility for Antiproton and Ion Research
The recently introduced panoptic segmentation task has renewed our community's interest in unifying the tasks of instance segmentation (for thing classes) and semantic segmentation (for stuff classes). However, current state-of-the-art methods for this joint task use separate and dissimilar networks for instance and semantic segmentation, without performing any shared computation. In this work, we aim to unify these methods at the architectural level, designing a single network for both tasks. Our approach is to endow Mask R-CNN, a popular instance segmentation method, with a semantic segmentation branch using a shared Feature Pyramid Network (FPN) backbone. Surprisingly, this simple baseline not only remains effective for instance segmentation, but also yields a lightweight, top-performing method for semantic segmentation. In this work, we perform a detailed study of this minimally extended version of Mask R-CNN with FPN, which we refer to as Panoptic FPN, and show it is a robust and accurate baseline for both tasks. Given its effectiveness and conceptual simplicity, we hope our method can serve as a strong baseline and aid future research in panoptic segmentation.
Progress on object detection is enabled by datasets that focus the research community’s attention on open challenges. This process led us from simple images to complex scenes and from bounding boxes to segmentation masks. In this work, we introduce LVIS (pronounced ‘el-vis’): a new dataset for Large Vocabulary Instance Segmentation. We plan to collect 2.2 million high-quality instance segmentation masks for over 1000 entry-level object categories in 164k images. Due to the Zipfian distribution of categories in natural images, LVIS naturally has a long tail of categories with few training samples. Given that state-of-the-art deep learning methods for object detection perform poorly in the low-sample regime, we believe that our dataset poses an important and exciting new scientific challenge. LVIS is available at http://www.lvisdataset.org.
We report competitive results on object detection and instance segmentation on the COCO dataset using standard models trained from random initialization. The results are no worse than their ImageNet pre-training counterparts even when using the hyper-parameters of the baseline system (Mask R-CNN) that were optimized for fine-tuning pre-trained models, with the sole exception of increasing the number of training iterations so the randomly initialized models may converge. Training from random initialization is surprisingly robust; our results hold even when: (i) using only 10% of the training data, (ii) for deeper and wider models, and (iii) for multiple tasks and metrics. Experiments show that ImageNet pre-training speeds up convergence early in training, but does not necessarily provide regularization or improve final target task accuracy. To push the envelope we demonstrate 50.9 AP on COCO object detection without using any external data, a result on par with the top COCO 2017 competition results that used ImageNet pre-training. These observations challenge the conventional wisdom of ImageNet pre-training for dependent tasks and we expect these discoveries will encourage people to rethink the current de facto paradigm of 'pretraining and fine-tuning' in computer vision.
To understand the world, we humans constantly need to relate the present to the past, and put events in context. In this paper, we enable existing video models to do the same. We propose a long-term feature bank—supportive information extracted over the entire span of a video—to augment state-of-the-art video models that otherwise would only view short clips of 2-5 seconds. Our experiments demonstrate that augmenting 3D convolutional networks with a long-term feature bank yields state-of-the-art results on three challenging video datasets: AVA, EPIC-Kitchens, and Charades. Code is available online.
Sliding-window object detectors that generate bounding-box object predictions over a dense, regular grid have advanced rapidly and proven popular. In contrast, modern instance segmentation approaches are dominated by methods that first detect object bounding boxes, and then crop and segment these regions, as popularized by Mask R-CNN. In this work, we investigate the paradigm of dense sliding-window instance segmentation, which is surprisingly under-explored. Our core observation is that this task is fundamentally different than other dense prediction tasks such as semantic segmentation or bounding-box object detection, as the output at every spatial location is itself a geometric structure with its own spatial dimensions. To formalize this, we treat dense instance segmentation as a prediction task over 4D tensors and present a general framework called TensorMask that explicitly captures this geometry and enables novel operators on 4D tensors. We demonstrate that the tensor view leads to large gains over baselines that ignore this structure, and leads to results comparable to Mask R-CNN. These promising results suggest that TensorMask can serve as a foundation for novel advances in dense mask prediction and a more complete understanding of the task. Code will be made available.
Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song. Such components include voice, bass, drums and any other accompaniments. Contrary to many audio synthesis tasks where the best performances are achieved by models that directly generate the waveform, the state-of-the-art in source separation for music is to compute masks on the magnitude spectrum. In this paper, we compare two waveform domain architectures. We first adapt Conv-Tasnet, initially developed for speech source separation, to the task of music source separation. While Conv-Tasnet beats many existing spectrogram-domain methods, it suffers from significant artifacts, as shown by human evaluations. We propose instead Demucs, a novel waveform-to-waveform model, with a U-Net structure and bidirectional LSTM. Experiments on the MusDB dataset show that, with proper data augmentation, Demucs beats all existing state-of-the-art architectures, including Conv-Tasnet, with 6.3 SDR on average (and up to 6.8 with 150 extra training songs, even surpassing the IRM oracle for the bass source). Using recent development in model quantization, Demucs can be compressed down to 120MB without any loss of accuracy. We also provide human evaluations, showing that Demucs benefits from a large advantage in terms of the naturalness of the audio. However, it suffers from some bleeding, especially between the vocals and other source.
People enjoy food photography because they appreciate food. Behind each meal there is a story described in a complex recipe and, unfortunately, by simply looking at a food image we do not have access to its preparation process. Therefore, in this paper we introduce an inverse cooking system that recreates cooking recipes given food images. Our system predicts ingredients as sets by means of a novel architecture, modeling their dependencies without imposing any order, and then generates cooking instructions by attending to both image and its inferred ingredients simultaneously. We extensively evaluate the whole system on the large-scale Recipe1M dataset and show that (1) we improve performance w.r.t. previous baselines for ingredient prediction; (2) we are able to obtain high quality recipes by leveraging both image and ingredients; (3) our system is able to produce more compelling recipes than retrieval-based approaches according to human judgment. We make code and models publicly available.
Video-Text Retrieval has been a hot research topic with the growth of multimedia data on the internet. Transformer for video-text learning has attracted increasing attention due to its promising performance. However, existing cross-modal transformer approaches typically suffer from two major limitations: 1) Exploitation of the transformer architecture where different layers have different feature characteristics is limited; 2) End-to-end training mechanism limits negative sample interactions in a mini-batch. In this paper, we propose a novel approach named Hierarchical Transformer (HiT) for video-text retrieval. HiT performs Hierarchical Cross-modal Contrastive Matching in both feature-level and semantic-level, achieving multi-view and comprehensive retrieval results. Moreover, inspired by MoCo, we propose Momentum Cross-modal Contrast for cross-modal learning to enable large-scale negative sample interactions on-the-fly, which contributes to the generation of more precise and discriminative representations. Experimental results on the three major Video-Text Retrieval benchmark datasets demonstrate the advantages of our method.
BACKGROUND: The current study was conducted to evaluate the long-term results of irradiation with carbon ions in a raster scanning technique in patients with skull base chordomas. METHODS: Between 1998 and 2008, a total of 155 patients (76 men and 79 women) with a median age of 48 years (range, 15 years-85 years) were irradiated with carbon ions using a raster scan technique. The irradiation was performed at the Society for Heavy Ion Research in Darmstadt, Germany. The median total dose was 60 gray (relative biological effectiveness) at 3 gray (relative biological effectiveness) per fraction. The median boost planning target volume was 70 mL (range, 2 mL-294 mL). Local control (LC) and overall survival (OS) were evaluated using the Kaplan-Meier method, whereas long-term toxicity was evaluated via questionnaires. RESULTS: The median follow-up was 72 months (range, 12 months-165 months). All patients had residual macroscopic tumors at the initiation of radiotherapy. The authors observed 55 local recurrences during follow-up, as well as systemic disease progression in 4 patients. The resulting 3-year, 5-year, and 10-year LC rates were 82%, 72%, and 54%, respectively, whereas the 3-year, 5-year, and 10-year OS rates were 95%, 85%, and 75%, respectively. Age <48 years and a boost volume >75 mL were associated with a significantly improved LC and OS. Primary treatment resulted in a significantly better OS probability. No higher late toxicity could be detected after carbon ion treatment. CONCLUSIONS: Carbon ion therapy appears to be a safe and effective treatment for patients with skull base chordoma, resulting in high LC and OS rates.
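The abstract above reports local control and overall survival estimated with the Kaplan-Meier method. As a rough illustration of how that product-limit estimate is formed, here is a minimal Python sketch. The function name `kaplan_meier` and the four-subject example are hypothetical; none of the numbers come from the study.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times  -- follow-up time of each subject (any consistent unit)
    events -- True if the event (for example, local recurrence) was observed,
              False if the subject was censored at that time
    Returns a list of (time, survival_probability) at each time with an event.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    # Walk through the distinct follow-up times in ascending order.
    for t in sorted(set(times)):
        observed = sum(1 for ti, ev in zip(times, events) if ti == t and ev)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Made-up example: events at months 1, 3 and 4, one subject censored at month 2.
example_curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```

Censored subjects never lower the curve directly; they only shrink the risk set used for later event times, which is why the step at month 3 in the example is larger than the step at month 1.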
Transparent evaluations of FAIRness are increasingly required by a wide range of stakeholders, from scientists to publishers, funding agencies and policy makers. We propose a scalable, automatable framework to evaluate digital resources that encompasses measurable indicators, open source tools, and participation guidelines, which come together to accommodate domain relevant community-defined FAIR assessments. The components of the framework are: (1) Maturity Indicators - community-authored specifications that delimit a specific automatically-measurable FAIR behavior; (2) Compliance Tests - small Web apps that test digital resources against individual Maturity Indicators; and (3) the Evaluator, a Web application that registers, assembles, and applies community-relevant sets of Compliance Tests against a digital resource, and provides a detailed report about what a machine "sees" when it visits that resource. We discuss the technical and social considerations of FAIR assessments, and how this translates to our community-driven infrastructure. We then illustrate how the output of the Evaluator tool can serve as a roadmap to assist data stewards to incrementally and realistically improve the FAIRness of their resources.
Traditionally, tropospheric radical chemistry is discussed in terms of the daytime photochemically produced hydroxyl radical (OH). Radicals, however, are also important during nighttime: this is especially true for ozone and the nitrate radical (NO₃), which both act as key initiators of the degradation of alkenes such as biogenic monoterpenes. These reactions lead to the formation of peroxy radicals (HO₂ and RO₂) and hydroxyl radicals at night. We present recent observations of nighttime concentrations of NO₃, RO₂, HO₂, and OH by differential optical absorption spectroscopy (DOAS), matrix isolation electron spin resonance (MIESR), laser-induced fluorescence (LIF), and a chemical amplifier (CA) in the framework of the Berliner Ozonexperiment (BERLIOZ) campaign at Pabstthum, Germany, together with modeling studies of nocturnal radical chemistry. Modeled RO₂ mixing ratios reached 40 ppt while the measured ROₓ level went up to 22 ppt at the same time. Modeled and measured HO₂ mixing ratios were up to 6 and 4 ppt, respectively. In the case of OH, a nocturnal concentration of (1.85 ± 0.82) × 10⁵ cm⁻³ was measured during one night. At this time, the model yielded an OH level of (4.1 ± 0.7) × 10⁵ cm⁻³. This overestimation by the model could point to a missing nocturnal sink of OH. Nitrate radical reactions with terpenes were found responsible for producing 77% of the RO₂ radicals, 53% of the HO₂, and 36% of the OH radicals during night. Nighttime ozonolysis formed 12% of the RO₂, 47% of the HO₂, and 64% of the OH radicals. Another 11% of the RO₂ radicals were formed by OH–volatile organic compound (VOC) reactions. A positive linear correlation of RO₂ and NO₃ was observed and could be reproduced in model calculations originating from the loss of both radicals by reaction with NO and the NO₃-initiated RO₂ production. The contribution of nighttime OH to the atmosphere's oxidation capacity (oxidation rate of VOCs, CO, and CH₄) was found negligible (<0.5%).
The INHAND Project (International Harmonization of Nomenclature and Diagnostic Criteria for Lesions in Rats and Mice) is a joint initiative of the Societies of Toxicologic Pathology from Europe (ESTP), Great Britain (BSTP), Japan (JSTP), and North America (STP) to develop an internationally accepted nomenclature for proliferative and nonproliferative changes in rats and mice. The purpose of this publication is to provide a standardized nomenclature for classifying changes observed in the hematolymphoid organs, including the bone marrow, thymus, spleen, lymph nodes, mucosa-associated lymphoid tissues, and other lymphoid tissues (serosa-associated lymphoid clusters and tertiary lymphoid structures) with color photomicrographs illustrating examples of the lesions. Sources of material included histopathology databases from government, academia, and industrial laboratories throughout the world. Content includes spontaneous lesions as well as lesions induced by exposure to test materials. The nomenclature for these organs is divided into 3 terminologies: descriptive, conventional, and enhanced. Three terms are listed for each diagnosis. The rationale for this approach and guidance for its application to toxicologic pathology are described in detail below.
We review the development of High Energy Density Physics (HEDP) with intense heavy ion beams as a tool to induce extreme states of matter. The development of this field connects intimately to the advances in accelerator physics and technology. We will cover the generation of intense heavy ion beams starting from the ion source and follow the acceleration process and transport to the target. Intensity limitations and potential solutions to overcome these limitations are discussed. This is exemplified by citing examples from existing machines at the Gesellschaft für Schwerionenforschung (GSI-Darmstadt), the Institute of Theoretical and Experimental Physics in Moscow (ITEP-Moscow), and the Institute of Modern Physics (IMP-Lanzhou). Facilities under construction, such as the FAIR facility in Darmstadt and the High Intensity Accelerator Facility (HIAF) proposed for China, will be included. Developments elsewhere are covered where it seems appropriate along with a report of recent results and achievements.
The goal of MRI reconstruction is to restore a high fidelity image from partially observed measurements. This partial view naturally induces reconstruction uncertainty that can only be reduced by acquiring additional measurements. In this paper, we present a novel method for MRI reconstruction that, at inference time, dynamically selects the measurements to take and iteratively refines the prediction in order to best reduce the reconstruction error and, thus, its uncertainty. We validate our method on a large scale knee MRI dataset, as well as on ImageNet. Results show that (1) our system successfully outperforms active acquisition baselines; (2) our uncertainty estimates correlate with error maps; and (3) our ResNet-based architecture surpasses standard pixel-to-pixel models in the task of MRI reconstruction. The proposed method not only shows high-quality reconstructions but also paves the road towards more applicable solutions for accelerating MRI.
BACKGROUND: Cosmetic treatment of the forehead using neuromodulators is challenging. To avoid adverse events, the underlying anatomy has to be understood and thoughtfully targeted. Clinical observations indicate that eyebrow ptosis can be avoided if neuromodulators are injected in the upper forehead, despite the frontalis muscle being the primary elevator. METHODS: Twenty-seven healthy volunteers (11 men and 16 women) with a mean age of 37.5 ± 13.7 years (range, 22 to 73 years) and of diverse ethnicity (14 Caucasians, four African Americans, three Asians, and six of Middle Eastern descent) were enrolled. Skin displacement vector analyses were conducted on maximal frontalis muscle contraction to calculate magnitude and direction of forehead skin movement. RESULTS: In 100 percent of investigated volunteers, a bidirectional movement of the forehead skin was observed: the skin of the lower forehead moved cranially, whereas the skin of the upper forehead moved caudally. Both movements converged at a horizontal forehead line termed the line of convergence, or C-line. The position of the C-line relative to the total height of the forehead was 60.9 ± 10.2 percent in men and 60.6 ± 9.6 percent in women (p = 0.941). Independent of sex, the C-line was located at the second horizontal forehead line when counting from superior to inferior (men, n = 2; women, n = 2). No difference across ethnicities was detected. CONCLUSIONS: The identification of the C-line may potentially guide practitioners toward more predictable outcomes for forehead neuromodulator injections. Injections above the C-line could mitigate the risk of neuromodulator-induced brow ptosis.
We propose and study a task we name panoptic segmentation (PS). Panoptic segmentation unifies the typically distinct tasks of semantic segmentation (assign a class label to each pixel) and instance segmentation (detect and segment each object instance). The proposed task requires generating a coherent scene segmentation that is rich and complete, an important step toward real-world vision systems. While early work in computer vision addressed related image/scene parsing tasks, these are not currently popular, possibly due to lack of appropriate metrics or associated recognition challenges. To address this, we propose a novel panoptic quality (PQ) metric that captures performance for all classes (stuff and things) in an interpretable and unified manner. Using the proposed metric, we perform a rigorous study of both human and machine performance for PS on three existing datasets, revealing interesting insights about the task. The aim of our work is to revive the interest of the community in a more unified view of image segmentation. For more analysis and up-to-date results, please check the arXiv version of the paper: https://arxiv.org/abs/1801.00868.
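The abstract above names the panoptic quality (PQ) metric without stating it. In the paper, PQ for one class is the sum of IoUs over matched segment pairs divided by |TP| + ½|FP| + ½|FN|, where a predicted and a ground-truth segment count as matched when their IoU exceeds 0.5. The Python sketch below illustrates that definition only; the function name and its inputs are hypothetical, and it assumes the matching has already been done for a single class (the paper then averages over classes).

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Single-class PQ from already-matched segments.

    matched_ious        -- IoU of each matched (predicted, ground-truth) pair;
                           every value must lie in (0.5, 1.0]
    num_false_positives -- predicted segments left unmatched
    num_false_negatives -- ground-truth segments left unmatched
    """
    for iou in matched_ious:
        if not 0.5 < iou <= 1.0:
            raise ValueError("a match requires 0.5 < IoU <= 1.0")
    num_true_positives = len(matched_ious)
    denominator = (num_true_positives
                   + 0.5 * num_false_positives
                   + 0.5 * num_false_negatives)
    if denominator == 0:
        # No segments of this class at all; returning 0.0 is a convention
        # chosen for this sketch.
        return 0.0
    return sum(matched_ious) / denominator


# Two matches (IoU 0.8 and 0.6), one spurious prediction, one missed segment.
example_pq = panoptic_quality([0.8, 0.6], 1, 1)
```

The IoU threshold of 0.5 is what makes the matching unique: no two predicted segments can both overlap the same ground-truth segment by more than half.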
The Facility for Antiproton and Ion Research (FAIR) in Darmstadt, Germany, provides unique possibilities for a new generation of hadron-, nuclear- and atomic physics experiments. The future antiProton ANnihilations at DArmstadt (PANDA or $\overline{\mathrm{P}}$ANDA) experiment at FAIR will offer a broad physics programme, covering different aspects of the strong interaction. Understanding the latter in the non-perturbative regime remains one of the greatest challenges in contemporary physics. The antiproton–nucleon interaction studied with PANDA provides crucial tests in this area. Furthermore, the high-intensity, low-energy domain of PANDA allows for searches for physics beyond the Standard Model, e.g. through high precision symmetry tests. This paper takes into account a staged approach for the detector setup and for the delivered luminosity from the accelerator. The available detector setup at the time of the delivery of the first antiproton beams in the HESR storage ring is referred to as the Phase One setup. The physics programme that is achievable during Phase One is outlined in this paper.
We present our integration of post-quantum cryptography (PQC), more specifically of the post-quantum KEM scheme Kyber for key establishment and the post-quantum signature scheme SPHINCS+, into the embedded TLS library mbed TLS. We measure the performance of these post-quantum primitives on four different embedded platforms with three different ARM processors and an Xtensa LX6 processor. Furthermore, we compare the performance of our experimental PQC cipher suite to a classical TLS variant using elliptic curve cryptography (ECC). Post-quantum key establishment and signature schemes have been either integrated into TLS or ported to embedded devices before. However, to the best of our knowledge, we are the first to combine TLS, post-quantum schemes, and embedded systems and to measure and evaluate the performance of post-quantum TLS on embedded platforms. Our results show that post-quantum key establishment with Kyber performs well in TLS on embedded devices compared to ECC variants. The use of SPHINCS+ signatures comes with certain challenges in terms of signature size and signing time, which mainly affect the use of embedded systems as PQC-TLS servers but do not necessarily prevent embedded systems from acting as PQC-TLS clients.
While contrastive approaches of self-supervised learning (SSL) learn representations by minimizing the distance between two augmented views of the same data point (positive pairs) and maximizing views from different data points (negative pairs), recent non-contrastive SSL (e.g., BYOL and SimSiam) show remarkable performance without negative pairs, with an extra learnable predictor and a stop-gradient operation. A fundamental question arises: why do these methods not collapse into trivial representations? We answer this question via a simple theoretical study and propose a novel approach, DirectPred, that directly sets the linear predictor based on the statistics of its inputs, without gradient training. On ImageNet, it performs comparably with more complex two-layer non-linear predictors that employ BatchNorm and outperforms a linear predictor by 2.5% in 300-epoch training (and 5% in 60-epoch). DirectPred is motivated by our theoretical study of the nonlinear learning dynamics of non-contrastive SSL in simple linear networks. Our study yields conceptual insights into how non-contrastive SSL methods learn, how they avoid representational collapse, and how multiple factors, like predictor networks, stop-gradients, exponential moving averages, and weight decay all come into play. Our simple theory recapitulates the results of real-world ablation studies in both STL-10 and ImageNet. Code is released at https://github.com/facebookresearch/luckmatters/tree/master/ssl.
Certain conditions require a delay in the coefficient update of the least mean square (LMS) and normalized least mean square (NLMS) algorithms. This paper presents an in-depth analysis of these modified versions for the important case of spherically invariant random processes (SIRPs), which are known as an excellent model for speech signals. Some derived bounds and the predicted dynamic behavior of the algorithms are found to correspond very well to simulation results and a real time implementation on a fixed-point signal processor. A modification of the algorithm is proposed to assure the well known properties of the LMS and NLMS algorithms.
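The abstract above does not give the update equation, so the following Python sketch assumes the textbook form of the delayed LMS update, w(n+1) = w(n) + μ·e(n−D)·x(n−D), where D is the update delay. It is an illustration only, not the paper's analysis or its proposed modification, and the function names, step size, and test system are all hypothetical.

```python
import random
from collections import deque


def fir_filter(coefficients, inputs):
    """Filter `inputs` with a fixed FIR filter (zero initial state)."""
    taps = [0.0] * len(coefficients)
    outputs = []
    for x in inputs:
        taps = [x] + taps[:-1]  # shift the newest sample into the tap line
        outputs.append(sum(c * t for c, t in zip(coefficients, taps)))
    return outputs


def delayed_lms(inputs, desired, num_taps, step_size, delay):
    """Adapt FIR weights with LMS, applying each update `delay` samples late.

    With delay == 0 this reduces to the ordinary LMS algorithm.
    """
    weights = [0.0] * num_taps
    taps = [0.0] * num_taps
    pending = deque()  # (error, tap vector) pairs waiting to be applied
    for x, d in zip(inputs, desired):
        taps = [x] + taps[:-1]
        output = sum(w * t for w, t in zip(weights, taps))
        pending.append((d - output, taps))
        if len(pending) > delay:
            # Apply the update computed `delay` samples ago.
            old_error, old_taps = pending.popleft()
            weights = [w + step_size * old_error * t
                       for w, t in zip(weights, old_taps)]
    return weights


# Identify a made-up 3-tap system from white Gaussian input, update delay of 2.
rng = random.Random(0)
signal = [rng.gauss(0.0, 1.0) for _ in range(5000)]
true_system = [0.5, -0.3, 0.1]
estimate = delayed_lms(signal, fir_filter(true_system, signal),
                       num_taps=3, step_size=0.01, delay=2)
```

The small step size is deliberate: delaying the update narrows the range of step sizes for which the adaptation stays stable, so a value that works for ordinary LMS may diverge once a delay is introduced.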