
National Institute of Informatics
Facility, Tokyo, Japan
Research output, citation impact, and the most-cited recent papers from National Institute of Informatics (Japan). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from National Institute of Informatics
Summary: MEGAHIT is an NGS de novo assembler for assembling large and complex metagenomics data in a time- and cost-efficient manner. It finished assembling a soil metagenomics dataset with 252 Gbps in 44.1 and 99.6 h on a single computing node with and without a graphics processing unit, respectively. MEGAHIT assembles the data as a whole, i.e. no pre-processing like partitioning and normalization was needed. When compared with previous methods on assembling the soil data, MEGAHIT generated a threefold larger assembly, with longer contig N50 and average contig length; furthermore, 55.8% of the reads were aligned to the assembly, giving a fourfold improvement. Availability and implementation: The source code of MEGAHIT is freely available at https://github.com/voutcn/megahit under GPLv3 license. Contact: rb@l3-bioinfo.com or twlam@cs.hku.hk Supplementary information: Supplementary data are available at Bioinformatics online.
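The contig N50 mentioned in the abstract above is a standard assembly statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch follows; the function name `n50` and the sample lengths are illustrative assumptions and are not taken from MEGAHIT.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (hypothetical helper)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


if __name__ == "__main__":
    # Illustrative lengths only, not data from the paper.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```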
Linear optics with photon counting is a prominent candidate for practical quantum computing. The protocol by Knill, Laflamme, and Milburn [2001, Nature (London) 409, 46] explicitly demonstrates that efficient scalable quantum computing with single photons, linear optical elements, and projective measurements is possible. Subsequently, several improvements on this protocol have started to bridge the gap between theoretical scalability and practical implementation. The original theory and its improvements are reviewed, and a few examples of experimental two-qubit gates are given. The use of realistic components, the errors they induce in the computation, and how these errors can be corrected is discussed.
One of the challenges in the study of generative adversarial networks is the instability of their training. In this paper, we propose a novel weight normalization technique called spectral normalization to stabilize the training of the discriminator. Our new normalization technique is computationally light and easy to incorporate into existing implementations. We tested the efficacy of spectral normalization on the CIFAR10, STL-10, and ILSVRC2012 datasets, and we experimentally confirmed that spectrally normalized GANs (SN-GANs) are capable of generating images of better or equal quality relative to the previous training stabilization techniques.
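Spectral normalization rescales a weight matrix by its largest singular value, so the layer's spectral norm becomes 1. One common way to estimate that singular value is power iteration. The sketch below is a pure-Python illustration under that assumption: the helper names are hypothetical, and it runs many iterations from a fresh random vector, which may differ from how a training loop would amortize the estimate.

```python
import math
import random


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def l2_normalize(v, eps=1e-12):
    """Scale v to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(w, n_iter=50, seed=0):
    """Estimate the largest singular value of w by power iteration (n_iter >= 1)."""
    rng = random.Random(seed)
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in w])
    wt = [list(col) for col in zip(*w)]  # transpose of w
    for _ in range(n_iter):
        v = l2_normalize(matvec(wt, u))
        u = l2_normalize(matvec(w, v))
    # sigma is approximated by u^T W v
    return sum(ui * wi for ui, wi in zip(u, matvec(w, v)))


def spectrally_normalize(w, n_iter=50):
    """Return w divided by its estimated spectral norm."""
    sigma = spectral_norm(w, n_iter)
    return [[x / sigma for x in row] for row in w]


if __name__ == "__main__":
    weights = [[3.0, 0.0], [0.0, 1.0]]  # illustrative matrix
    print(spectral_norm(weights))
    print(spectrally_normalize(weights))
```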
This paper presents a method to automatically and efficiently detect face tampering in videos, and particularly focuses on two recent techniques used to generate hyper-realistic forged videos: Deepfake and Face2Face. Traditional image forensics techniques are usually not well suited to videos due to the compression that strongly degrades the data. Thus, this paper follows a deep learning approach and presents two networks, both with a low number of layers to focus on the mesoscopic properties of images. We evaluate those fast networks on both an existing dataset and a dataset we have constituted from online videos. The tests demonstrate a very successful detection rate with more than 98% for Deepfake and 95% for Face2Face.
Lancelets (‘amphioxus’) are the modern survivors of an ancient chordate lineage, with a fossil record dating back to the Cambrian period. Here we describe the structure and gene content of the highly polymorphic ∼520-megabase genome of the Florida lancelet Branchiostoma floridae, and analyse it in the context of chordate evolution. Whole-genome comparisons illuminate the murky relationships among the three chordate groups (tunicates, lancelets and vertebrates), and allow not only reconstruction of the gene complement of the last common chordate ancestor but also partial reconstruction of its genomic organization, as well as a description of two genome-wide duplications and subsequent reorganizations in the vertebrate lineage. These genome-scale events shaped the vertebrate genome and provided additional genetic variation for exploitation during vertebrate evolution. This issue sees the publication of the draft genome sequence of an animal that has been studied by biologists for many years as a model for a primitive chordate. The amphioxus or lancelet is a small worm-like creature, usually to be found buried in sand on the sea floor. Comparative analysis of the genome of the Florida lancelet, Branchiostoma floridae, reveals 17 ancestral chordate linkage groups conserved in the modern amphioxus and vertebrate genomes despite more than half a billion years of independent evolution. From this it is possible to make a virtual reconstruction of the 17 chromosomes of the last common chordate ancestor. This reconstruction confirms that two rounds of whole genome duplication have occurred during evolution of the jawed vertebrate lineage. And it illuminates the murky relationships between the three chordate groups, the tunicates, lancelets and vertebrates. The cover shows four adult amphioxus collected in Apalachee Bay, Florida, with anterior towards the top and dorsal to the right. Yellow ovals are gonads. (Photo by Nicholas Putnam, DOE Joint Genome Institute.)
In the past decade, a two-dimensional matter-light system called the microcavity exciton-polariton has emerged as a new promising candidate of Bose-Einstein condensation (BEC) in solids. Many pieces of important evidence of polariton BEC have been established recently in GaAs and CdTe microcavities at the liquid helium temperature, opening a door to rich many-body physics inaccessible in experiments before. Technological progress also made polariton BEC at room temperatures promising. In parallel with experimental progresses, theoretical frameworks and numerical simulations are developed, and our understanding of the system has greatly advanced. In this article, recent experiments and corresponding theoretical pictures based on the Gross-Pitaevskii equations and the Boltzmann kinetic simulations for a finite-size BEC of polaritons are reviewed.
The medaka fish (Oryzias latipes) is a popular pet in Japan and more recently a laboratory model organism for developmental genetics and evolutionary biology. Now the medaka's genome has been sequenced and analysed by a large Japanese consortium. Cichlids and stickleback, which are emerging model systems for understanding the genetic basis of vertebrate speciation, are evolutionarily closer to medaka than zebrafish, so the medaka's genome sequence will yield valuable insights into 400 million years of vertebrate genome evolution. The medaka fish (Oryzias latipes) has long been a popular pet in Japan and more recently a laboratory model organism; it now has its genome sequenced and analysed by a Japanese consortium. Teleosts comprise more than half of all vertebrate species and have adapted to a variety of marine and freshwater habitats. Their genome evolution and diversification are important subjects for the understanding of vertebrate evolution. Although draft genome sequences of two pufferfishes have been published, analysis of more fish genomes is desirable. Here we report a high-quality draft genome sequence of a small egg-laying freshwater teleost, medaka (Oryzias latipes). Medaka is native to East Asia and an excellent model system for a wide range of biology, including ecotoxicology, carcinogenesis, sex determination and developmental genetics. In the assembled medaka genome (700 megabases), which is less than half of the zebrafish genome, we predicted 20,141 genes, including ∼2,900 new genes, using 5′-end serial analysis of gene expression tag information. We found single nucleotide polymorphisms (SNPs) at an average rate of 3.42% between the two inbred strains derived from two regional populations; this is the highest SNP rate seen in any vertebrate species.
Analyses based on the dense SNP information show a strict genetic separation of 4 million years (Myr) between the two populations, and suggest that differential selective pressures acted on specific gene categories. Four-way comparisons with the human, pufferfish (Tetraodon), zebrafish and medaka genomes revealed that eight major interchromosomal rearrangements took place in a remarkably short period of ∼50 Myr after the whole-genome duplication event in the teleost ancestor and afterwards, intriguingly, the medaka genome preserved its ancestral karyotype for more than 300 Myr.
Ancient polyploidization events have shaped diverse eukaryotic genomes, including two rounds of whole-genome duplication at the base of the vertebrate radiation. While polyploidy is rare in amniotes, presumably owing to constraints on sex chromosome dosage [...] Polyploidy provides raw material for evolutionary diversification because gene duplicates [...] To explore the origins and consequences of tetraploidy in the African clawed frog, we sequenced the Xenopus laevis genome and compared it to the related diploid X. tropicalis genome. We characterize the allotetraploid origin of X. laevis by partitioning its genome into two homoeologous subgenomes, marked by distinct families of 'fossil' transposable elements. On the basis of the activity of these elements and the age of hundreds of unitary pseudogenes, we estimate that the two diploid progenitor species diverged around 34 million years ago (Ma) and combined to form an allotetraploid around 17-18 Ma. More than 56% of all genes were retained in two homoeologous copies. Protein function, gene expression, and the amount of conserved flanking sequence all correlate with retention rates. The subgenomes have evolved asymmetrically, with one chromosome set more often preserving the ancestral state and the other experiencing more gene loss, deletion, rearrangement, and reduced gene expression.
We describe a generalization of the cluster-state model of quantum computation to continuous-variable systems, along with a proposal for an optical implementation using squeezed-light sources, linear optics, and homodyne detection. For universal quantum computation, a nonlinear element is required. This can be satisfied by adding to the toolbox any single-mode non-Gaussian measurement, while the initial cluster state itself remains Gaussian. Homodyne detection alone suffices to perform an arbitrary multimode Gaussian transformation via the cluster state. We also propose an experiment to demonstrate cluster-based error reduction when implementing Gaussian operations.
This paper describes a framework that serves as a reference for classifying user interfaces supporting multiple targets, or multiple contexts of use in the field of context-aware computing. In this framework, a context of use is decomposed into three facets: the end users of the interactive system, the hardware and software computing platform with which the users have to carry out their interactive tasks and the physical environment where they are working. Therefore, a context-sensitive user interface is a user interface that exhibits some capability to be aware of the context (context awareness) and to react to changes of this context. This paper attempts to provide a unified understanding of context-sensitive user interfaces rather than a prescription of various ways or methods of tackling different steps of development. Rather, the framework structures the development life cycle into four levels of abstraction: task and concepts, abstract user interface, concrete user interface and final user interface. These levels are structured with a relationship of reification going from an abstract level to a concrete one and a relationship of abstraction going from a concrete level to an abstract one. Most methods and tools can be more clearly understood and compared relative to each other against the levels of this framework. In addition, the framework expresses when, where and how a change of context is considered and supported in the context-sensitive user interface thanks to a relationship of translation. The notion of plastic user interfaces is also introduced, defined, and exemplified in the field of multi-target user interfaces. These user interfaces support some adaptation to changes of the context of use while preserving a predefined set of usability properties.
Recent advances in media generation techniques have made it easier for attackers to create forged images and videos. State-of-the-art methods enable the real-time creation of a forged version of a single video obtained from a social network. Although numerous methods have been developed for detecting forged images and videos, they are generally targeted at certain domains and quickly become obsolete as new kinds of attacks appear. The method introduced in this paper uses a capsule network to detect various kinds of spoofs, from replay attacks using printed images or recorded videos to computer-generated videos using deep convolutional neural networks. It extends the application of capsule networks beyond their original intention to the solving of inverse graphics problems.
High-temperature superconductivity often emerges in the proximity of a symmetry-breaking ground state. For superconducting iron arsenides, in addition to the antiferromagnetic ground state, a small structural distortion breaks the crystal's $C_4$ rotational symmetry in the underdoped part of the phase diagram. We reveal that the representative iron arsenide $\mathrm{Ba(Fe_{1-x}Co_x)_2As_2}$ develops a large electronic anisotropy at this transition via measurements of the in-plane resistivity of detwinned single crystals, with the resistivity along the shorter b axis $\rho_b$ being greater than $\rho_a$. The anisotropy reaches a maximum value of ~2 for compositions in the neighborhood of the beginning of the superconducting dome. For temperatures well above the structural transition, uniaxial stress induces a resistivity anisotropy, indicating a substantial nematic susceptibility.
The analysis and optimization of complex systems can be reduced to mathematical problems collectively known as combinatorial optimization. Many such problems can be mapped onto ground-state search problems of the Ising model, and various artificial spin systems are now emerging as promising approaches. However, physical Ising machines have suffered from limited numbers of spin-spin couplings because of implementations based on localized spins, resulting in severe scalability problems. We report a 2000-spin network with all-to-all spin-spin couplings. Using a measurement and feedback scheme, we coupled time-multiplexed degenerate optical parametric oscillators to implement maximum cut problems on arbitrary graph topologies with up to 2000 nodes. Our coherent Ising machine outperformed simulated annealing in terms of accuracy and computation time for a 2000-node complete graph.
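The mapping from maximum cut to an Ising ground-state search mentioned above can be illustrated on a tiny graph. The sketch assumes the convention H(s) = Σ w_ij s_i s_j over edges with spins s_i = ±1, so that cut(s) = (W − H(s)) / 2 where W is the total edge weight. The exhaustive search and the function names are illustrative stand-ins and have nothing to do with the optical hardware.

```python
from itertools import product


def ising_energy(edges, spins):
    """H(s) = sum over edges (i, j, w) of w * s_i * s_j (assumed sign convention)."""
    return sum(w * spins[i] * spins[j] for i, j, w in edges)


def cut_value(edges, spins):
    """Total weight of edges whose endpoints carry opposite spins."""
    return sum(w for i, j, w in edges if spins[i] != spins[j])


def brute_force_ground_state(edges, n):
    """Exhaustively find a minimum-energy spin configuration (small n only)."""
    return min(product((-1, 1), repeat=n), key=lambda s: ising_energy(edges, s))


if __name__ == "__main__":
    # Illustrative 4-node ring with unit weights.
    ring = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
    best = brute_force_ground_state(ring, 4)
    print(best, ising_energy(ring, best), cut_value(ring, best))
```

Because the cut weight and the energy differ only by a constant and a factor of −1/2, minimizing the energy maximizes the cut.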
Quantum error correction (QEC) and fault-tolerant quantum computation represent one of the most vital theoretical aspects of quantum information processing. It was well known from the early developments of this exciting field that the fragility of coherent quantum systems would be a catastrophic obstacle to the development of large-scale quantum computers. The introduction of quantum error correction in 1995 showed that active techniques could be employed to mitigate this fatal problem. However, quantum error correction and fault-tolerant computation is now a much larger field and many new codes, techniques, and methodologies have been developed to implement error correction for large-scale quantum algorithms. In response, we have attempted to summarize the basic aspects of quantum error correction and fault-tolerance, not as a detailed guide, but rather as a basic introduction. The development in this area has been so pronounced that many in the field of quantum information, specifically researchers who are new to quantum information or people focused on the many other important issues in quantum computation, have found it difficult to keep up with the general formalisms and methodologies employed in this area. Rather than introducing these concepts from a rigorous mathematical and computer science framework, we instead examine error correction and fault-tolerance largely through detailed examples, which are more relevant to experimentalists today and in the near future.
Background: Moyamoya disease is an idiopathic vascular disorder of intracranial arteries. Its susceptibility locus has been mapped to 17q25.3 in Japanese families, but the susceptibility gene is unknown.
Taking the pulse of optimization. Finding the optimum solution of multiparameter or multifunctional problems is important across many disciplines, but it can be computationally intensive. Many such problems defined as computationally difficult can be mathematically mapped onto the so-called Ising problem, which looks at finding the minimum energy configuration for an array of coupled spins. Inagaki et al. and McMahon et al. show that an optical processing approach based on a network of coupled optical pulses in a ring fiber can be used to model and optimize large-scale Ising systems. Such a scalable architecture could help to optimize solutions to a wide range of complex problems. Science, this issue pp. 603 and 614
To improve the quality of computation experience for mobile devices, mobile-edge computing (MEC) is a promising paradigm by providing computing capabilities in close proximity within a sliced radio access network (RAN), which supports both traditional communication and MEC services. Nevertheless, the design of computation offloading policies for a virtual MEC system remains challenging. Specifically, whether to execute a computation task at the mobile device or to offload it for MEC server execution should adapt to the time-varying network dynamics. This paper considers MEC for a representative mobile user in an ultradense sliced RAN, where multiple base stations (BSs) are available to be selected for computation offloading. The problem of solving an optimal computation offloading policy is modeled as a Markov decision process, where our objective is to maximize the long-term utility performance whereby an offloading decision is made based on the task queue state, the energy queue state as well as the channel qualities between mobile user and BSs. To break the curse of high dimensionality in state space, we first propose a double deep Q-network (DQN)-based strategic computation offloading algorithm to learn the optimal policy without knowing a priori knowledge of network dynamics. Then motivated by the additive structure of the utility function, a Q-function decomposition technique is combined with the double DQN, which leads to a novel learning algorithm for the solving of stochastic computation offloading. Numerical experiments show that our proposed learning algorithms achieve a significant improvement in computation offloading performance compared with the baseline policies.
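The double DQN named in the abstract above separates action selection from action evaluation: the online network picks the greedy next action and the target network scores it. A minimal sketch of that bootstrap target follows. The function name, argument names, and numbers are illustrative assumptions, and the paper's actual state (task queue, energy queue, channel qualities) and Q-function decomposition are not modelled here.

```python
def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Compute y = r + gamma * Q_target(s', argmax_a Q_online(s', a)).

    q_online_next and q_target_next are per-action value lists for the next
    state, produced by the online and target networks respectively.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Select the action with the online network's estimates ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... but evaluate it with the target network's estimates.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Illustrative values only.
    print(double_dqn_target(1.0, 0.9, [0.2, 0.8, 0.5], [1.0, 2.0, 3.0]))
```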
We show how to construct a near-deterministic CNOT gate using several single-photon sources, linear optics, photon-number-resolving quantum nondemolition detectors, and feed forward. This gate does not require the use of massively entangled states common to other implementations and is very efficient on resources with only one ancilla photon required. The key element of this gate is nondemolition detectors that use a weak cross-Kerr nonlinearity effect to conditionally generate a phase shift on a coherent probe if a photon is present in the signal mode. These potential phase shifts can then be measured using highly efficient homodyne detection.
Significant progress has been made with deep neural networks recently. Sharing trained models of deep neural networks has been very important to the rapid progress of research and development of these systems. At the same time, it is necessary to protect the rights to shared trained models. To this end, we propose to use digital watermarking technology to protect intellectual property and detect intellectual property infringement in the use of trained models. First, we formulate a new problem: embedding watermarks into deep neural networks. Second, we propose a general framework for embedding a watermark in model parameters, using a parameter regularizer. Our approach does not impair the performance of networks into which a watermark is placed because the watermark is embedded while training the host network. Finally, we perform comprehensive experiments to reveal the potential of watermarking deep neural networks as the basis of this new research effort. We show that our framework can embed a watermark during the training of a deep neural network from scratch, and during fine-tuning and distilling, without impairing its performance. The embedded watermark does not disappear even after fine-tuning or parameter pruning; the watermark remains complete even after 65% of parameters are pruned.
ASVspoof, now in its third edition, is a series of community-led challenges which promote the development of countermeasures to protect automatic speaker verification (ASV) from the threat of spoofing. Advances in the 2019 edition include: (i) a consideration of both logical access (LA) and physical access (PA) scenarios and the three major forms of spoofing attack, namely synthetic, converted and replayed speech; (ii) spoofing attacks generated with state-of-the-art neural acoustic and waveform models; (iii) an improved, controlled simulation of replay attacks; (iv) use of the tandem detection cost function (t-DCF) that reflects the impact of both spoofing and countermeasures upon ASV reliability. Even if ASV remains the core focus, in retaining the equal error rate (EER) as a secondary metric, ASVspoof also embraces the growing importance of fake audio detection. ASVspoof 2019 attracted the participation of 63 research teams, with more than half of these reporting systems that improve upon the performance of two baseline spoofing countermeasures. This paper describes the 2019 database, protocols and challenge results. It also outlines major findings which demonstrate the real progress made in protecting against the threat of spoofing and fake audio.