National Institute for Japanese Language and Linguistics
Other · Tokyo, Japan
Research output, citation impact, and the most-cited recent papers from National Institute for Japanese Language and Linguistics (Japan). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from National Institute for Japanese Language and Linguistics
This paper presents a deep-learning method for distinguishing computer-generated graphics from real photographic images. The proposed method uses a Convolutional Neural Network (CNN) with a custom pooling layer to optimize the feature extraction scheme of the current best-performing algorithms. Local estimates of class probabilities are computed and aggregated to predict the label of the whole picture. We evaluate our work on recent photo-realistic computer graphics and show that it outperforms state-of-the-art methods for both local and full image classification.
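The aggregation step mentioned in the abstract above (local class-probability estimates combined into one label for the whole picture) can be illustrated with a minimal sketch. The abstract does not say which aggregation rule the authors use; the plain mean, the 0.5 threshold, and the function name `aggregate_patch_probs` below are assumptions chosen only for illustration.

```python
def aggregate_patch_probs(patch_probs, threshold=0.5):
    """Combine per-patch probabilities of the 'computer-generated' class.

    Assumption: a plain arithmetic mean followed by a fixed threshold.
    The paper's actual aggregation rule may differ.
    """
    if not patch_probs:
        raise ValueError("at least one patch probability is required")
    mean_p = sum(patch_probs) / len(patch_probs)
    label = "computer-generated" if mean_p >= threshold else "photographic"
    return label, mean_p


# Hypothetical patch scores, as a patch-level classifier might emit them.
label, score = aggregate_patch_probs([0.9, 0.8, 0.4])
print(label, round(score, 3))
```

Averaging lets a few misclassified patches be outvoted by the rest of the image, which is one plausible reason to classify locally before deciding globally.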
The Balanced Corpus of Contemporary Written Japanese (BCCWJ) is Japan’s first 100-million-word balanced corpus. It consists of three subcorpora (publication subcorpus, library subcorpus, and special-purpose subcorpus) and covers a wide range of text registers including books in general, magazines, newspapers, governmental white papers, best-selling books, an internet bulletin-board, a blog, school textbooks, minutes of the National Diet, publicity newsletters of local governments, laws, and poetry verses. A random sampling technique is utilized whenever possible in order to maximize the representativeness of the corpus. The corpus is annotated in terms of dual POS analysis, document structure, and bibliographical information. The BCCWJ is currently accessible in three different ways, including Chunagon, a web-based interface to the dual POS analysis data. Lastly, results of some pilot evaluation of the corpus with respect to the textual diversity are reported. The analyses include POS distribution, word-class distribution, entropy of orthography, sentence length, and variation of the adjective predicate. High textual diversity is observed in all these analyses.
This paper examines the role of retracted tongue root ([RTR]) harmony in Northeast Asian areal and genetic relationships. Recent research has suggested that at least three of the families grouped together as Altaic by Poppe (1960) – Korean, Mongolic, and Tungusic (KMT) – should be reconstructed with [RTR] vowel harmony. In this paper we reinforce this conclusion, arguing specifically against proposals that [RTR] harmony is secondary, or that [ATR] is the dominant feature. We also argue against the proposal of Starostin et al. (2003) that specific proto-families such as proto-Tungusic should be reconstructed without vowel harmony. We then compare the status of [RTR] harmony in Northeast Asia to the status of tongue root harmony in the Central Sudanic Zone, extending our discussion to the vowel harmony found in Chukchi, Yukaghir, Nivkh, and Ainu. We discuss whether KMT-style [RTR] harmony should be viewed as an innovation or a retention, and examine the particular issue of the Korean vowel inventory.
This article explains that Japanese uses a variety of prosodic mechanisms to mark focal prominence, including local pitch range expansion, prosodic restructuring to set off the focal constituent, postfocal subordination, and prominence-lending boundary pitch movements, but (notably) not manipulation of accent. It also discusses the Japanese intonation system within the Autosegmental-Metrical (AM) model of intonational phonology. The article then describes four phenomena that are the locus of lively discussion and controversy in the further development of this AM framework account, before addressing the larger implications which these phenomena have for the development of a tenable general theory of the role of prosody in the marking of discourse prominence. There is a rich variety of prominence-marking mechanisms even when morphosyntactic mechanisms such as scrambling are ignored. The generalization across English and Japanese languages predicts that there should also be complementary patterns of focus projection within the VP in transitive clauses.
Superconducting fault current limiters for electric power systems have been researched. A magnetic shielding type superconducting fault current limiter is developed in the authors' research on superconducting fault current limiters. This limiter consists of a copper primary winding, a superconducting cylinder, an iron core and a control coil. The superconducting cylinder has a Bi2212 thick film on a MgO substrate. The control coil consists of some metallic rings, and the fault level can be adjusted by changing the number of the rings. To design a prototype limiter, the AC magnetic shielding and loss characteristics of small models were measured. The prototype limiter is 6600 V in rated voltage and 400 A in rated current. The superconducting cylinder is 0.45 m in diameter and about 1 m in height. Only the superconducting cylinder was designed to be cooled by liquid nitrogen. The experimentally manufactured limiter is about 1.3 m in width, about 0.6 m in depth and about 2 m in height.
As Internet users increasingly post images to express their daily sentiment and emotions, the analysis of sentiments in user-generated images is of increasing importance for developing several applications. Most conventional methods of image sentiment analysis focus on the design of visual features, and the use of text associated with the images has not been sufficiently investigated. This paper proposes a novel approach that exploits latent correlations among multiple views: visual and textual views, and a sentiment view constructed using SentiWordNet. In the proposed method, we find a latent embedding space in which correlations among the three views are maximized. The projected features in the latent space are used to train a sentiment classifier, which considers the complementary information from different views. Results of experiments conducted on Flickr and Instagram images show that our approach achieves better sentiment classification accuracy than methods that use a single modality only and the state-of-the-art method that jointly uses multiple modalities.
We clarified that the common necessary condition for generating a dynamic gait results from the requirement to restore mechanical energy through studies on passive dynamic walking mechanisms. This paper proposes a novel method of generating a dynamic gait that can be found in the mechanism of a swing inspired by the principle of parametric excitation using telescopic leg actuation. We first introduce a simple underactuated biped model with telescopic legs and semicircular feet and propose a law to control the telescopic leg motion. We found that a high-speed dynamic bipedal gait can easily be generated by only pumping the swing leg mass. We then conducted parametric studies by adjusting the control and physical parameters and determined how well the basic gait performed by introducing some performance indexes. Improvements in energy efficiency by using an elastic-element effect were also numerically investigated. Further, we theoretically proved that semicircular feet have a mechanism that decreases the energy dissipated by heel-strike collisions. We provide insights throughout this paper into how zero-moment-point-free robots can generate a novel biped gait.
This paper presents a case study of sound symbolism, cases in which certain sounds tend to be associated with particular meanings. We used the corpus of all Japanese Pokémon names available as of October 2016. We tested the effects of voiced obstruents, mora counts, and vowel quality on Pokémon characters' size, weight, strength parameters, and evolution levels. We found that the number of voiced obstruents in Pokémon names correlates positively with size, weight, evolution levels, and general strength parameters, except for speed. We argue that this result is compatible with the frequency code hypothesis of Ohala. The number of moras in Pokémon names correlates positively with size, weight, evolution levels, and all strength parameters. Vowel height is also shown to have an influence on size and weight: Pokémon characters with initial high vowels tend to be smaller and lighter, although the effect size is not very large. Not only does this paper offer a new case study of sound symbolism, it provides evidence that sound symbolism is at work when naming proper nouns.
There are productive (-re/-e/-te, -yar/-ar) and non-productive (-V, -ke, -ka) causatives in Ainu. Non-productive causatives have traditionally been regarded as ‘transitives’, but proposing a revision of Tamura’s model of verbal structure in Ainu, this paper argues that they can be regarded as direct causatives, though causatives in -V do not have the same derivational status and should rather be regarded as lexical causatives. A cross-dialectal comparison shows that the causative function of -ka is gradually being replaced by the productive causative -re/-e/-te, which came to be used as a default causative marker of direct/indirect causation. Three of five causative morphemes originated in the verbs ‘make’ and ‘do’; all of them have also been attested with the denominal causative function, which suggests the following general pattern for the development of causatives in Ainu: do/make > denom > dir caus > (indir caus).
In general, algorithms for order-3 CANDECOMP/PARAFAC (CP), also coined canonical polyadic decomposition (CPD), are easy to implement and can be extended to higher order CPD. Unfortunately, the algorithms become computationally demanding, and they are often not applicable to higher order and relatively large scale tensors. In this paper, by exploiting the uniqueness of CPD and the relation of a tensor in Kruskal form and its unfolded tensor, we propose a fast approach to deal with this problem. Instead of directly factorizing the high order data tensor, the method decomposes an unfolded tensor with lower order, e.g., order-3 tensor. On the basis of the order-3 estimated tensor, a structured Kruskal tensor, of the same dimension as the data tensor, is then generated, and decomposed to find the final solution using fast algorithms for the structured CPD. In addition, strategies to unfold tensors are suggested and practically verified in the paper.
The present study examined cerebral representations of Japanese long and short vowel categories with near-infrared spectroscopy (NIRS) by measuring the hemodynamic changes. Results showed that NIRS could capture phoneme-specific information. The left side of the auditory area showed large hemodynamic changes only for contrasting stimuli between which the phonemic boundary was estimated, but not for stimuli differing by an equal duration but belonging to the same phoneme category. Left dominance in phoneme processing was also confirmed for the across-category stimuli. These findings indicate that the Japanese vowel contrast based only on duration differences is dealt with in the same language-dominant hemisphere as other spectrally varying phonemic categories, and that the cortical activities related to its processing can be detected with NIRS.
This paper proposes a new efficient multichannel nonnegative matrix factorization (NMF) method. Recently, multichannel NMF (MNMF) has been proposed as a means of solving the blind source separation problem. This method estimates a mixing system of sources and attempts to separate them in a blind fashion. However, this method is strongly dependent on its initial values because there are no constraints in the spatial models. To solve this problem, we introduce a rank-1 spatial model into MNMF. The proposed method estimates a demixing matrix while representing sources using NMF bases and can be optimized by the update rules of independent vector analysis and conventional single-channel NMF. Experimental results show the efficacy of the proposed method in terms of robustness and convergence speed.
We describe a time-of-arrival (ToA)-based localization system for smartphones. In this system, the transmitter emits modulated light-emitting diode (LED) light and sound waves, and the smartphone captures them. The smartphone measures the time of flight of sound waves and localizes its position using multilateration. The LED light is used for visible light communication, which also conveys the time reference of the sound emission. Using the time reference, we can synchronize the transmitter and the receiver, which makes ToA-based localization possible. The precision of time synchronization is the key factor for localization based on ToA. Hence, we have proposed the SyncSync method using a modulated LED light and a smartphone video camera, which enables ToA localization by measuring the time of flight of sound waves. This method gives better results than those of time-difference-of-arrival localization. However, we had to use a dedicated light synchronization device for our method. Visible light communication (VLC) is becoming a popular application of smartphones. If VLC demodulation could be used for time synchronization in acoustic localization, VLC and indoor localization would be integrated into a single application. In this paper, we examined the feasibility of VLC time synchronization for localization. Then, ToA-based localization was performed using a smartphone application. The standard deviation of the 3D localization was around 100 mm in a dark room, which is sufficiently precise for practical applications.
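The multilateration step described above can be sketched as follows. This is not the authors' implementation: it assumes four sound sources at known, non-coplanar positions, a fixed speed of sound of 343 m/s, and noise-free times of flight that are already synchronized. The names `locate` and `solve3` are hypothetical. Subtracting the range equation of the first source from the others yields a linear system in the unknown position.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate source geometry (e.g. coplanar sources)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def locate(sources, flight_times, c=SPEED_OF_SOUND):
    """Estimate a 3D receiver position from four times of flight.

    For each source i: |x - p_i|^2 = d_i^2 with d_i = c * t_i.
    Subtracting the equation for source 0 gives the linear relation
    2 (p_i - p_0) . x = |p_i|^2 - |p_0|^2 - d_i^2 + d_0^2.
    """
    d = [c * t for t in flight_times]
    p0 = sources[0]
    a, b = [], []
    for p, di in zip(sources[1:4], d[1:4]):
        a.append([2.0 * (p[k] - p0[k]) for k in range(3)])
        b.append(sum(v * v for v in p) - sum(v * v for v in p0)
                 - di * di + d[0] * d[0])
    return solve3(a, b)


# Hypothetical room layout (metres) and a known receiver position for checking.
sources = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 3.0)]
true_pos = (1.0, 2.0, 1.0)
times = [math.dist(true_pos, s) / SPEED_OF_SOUND for s in sources]
print(locate(sources, times))
```

Because each distance is the time of flight multiplied by the speed of sound, a timing error of 1 ms corresponds to roughly 0.34 m of range error, which illustrates why the abstract calls synchronization precision the key factor.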
Face sketch synthesis is a crucial issue in digital entertainment and law enforcement. It can bridge the considerable texture discrepancy between face photos and sketches. Most of the current face sketch synthesis approaches directly learn the relationship between the photos and sketches, and it is very difficult for them to generate the individual specific features, which we call rare characteristics. In this paper, we propose a novel face sketch synthesis approach through residual learning. In contrast to traditional approaches, which aim to reconstruct a sketch image directly (i.e., learn the mapping relationship between the photo and sketch), we aim to predict the residual image by learning the mapping relationship between the photo and residual, i.e., the difference between the photo and sketch, given an observed photo. This technique will render optimizing the residual mapping easier than optimizing the original mapping and deriving rare characteristic information. We also introduce a joint dictionary learning algorithm by preserving the local geometry structure of a data space. Through the learned joint dictionary, we transform the face sketch synthesis from an image space to a new and compact space; the new and compact space is spanned by learned dictionary atoms, where the manifold assumption can be further guaranteed. Results show that the proposed method demonstrates an impressive performance in the face sketch synthesis task on three public face sketch datasets and various real-world photos. These results are derived by comparing the proposed method with several state-of-the-art techniques, including certain recently proposed deep learning-based approaches.
This chapter describes the prosodic systems of Japanese and Korean, the two major languages spoken in the Asian Pacific Rim. It covers both lexical and post-lexical prosody of these languages, with the main focus on word accent and intonation. As for word accent, Japanese and Korean both exhibit various pitch accent systems ranging from those with multiple accent patterns to those in which pitch plays no distinctive role at the lexical level. The intonation systems of these two languages are also diverse, ranging from those with purely phrasal or boundary tones to those where intonational tones are constrained by lexical pitch accent patterns. However, both languages have an accentual phrase as the smallest prosodic unit defined by intonation. Similarities and differences between the prosodic systems of these two languages, including various regional dialects in each language, are analysed in the framework of the autosegmental-metrical model of intonational phonology.
Service composition provides a means of customized and flexible integration of service functionalities. Quality-of-service (QoS) optimization algorithms select services to adapt workflows to the non-functional requirements of the user. With an increasing number of services in a workflow, previous approaches fail to achieve sufficient reliability. Moreover, expensive ad-hoc replanning is required to deal with service failures. The major problem with such sequential application of planning and replanning is that it ignores the potential costs during the initial planning, so they are hidden from the decision maker. Our idea to overcome this problem is to compute a QoS-optimized selection of service clusters that includes a sufficient number of backup services for each service employed. These backup services should be sufficiently distributed to prevent a task failure in case of, e.g., a network failure. To support the decision maker in the selection task, our multi-objective approach considers the possible repair costs directly in the initial composition. Our graphical user interface visualizes the resulting QoS of the workflow and the location of the services to enable the decision maker to select compositions in line with risk preferences. We prove the benefits of our approach in our detailed evaluation.
This paper discusses the phonology of tone in Koshikijima Japanese, an endangered dialect spoken on the Koshikijima Islands in the south of Japan. Particularly interesting is the relationship between word-level and sentence-level phonology, as well as the interaction between the mora and the syllable.
The languages of Northeast Asia show evidence of dispersal from south to north, consistent with the hypothesis that agriculture spread north and east from the vicinity of Liaoning, beginning with the millets approximately 5500 BP. Wet rice agriculture in Korea and Japan results from a later spread, also beginning in Shandong, crossing via the Liaodong peninsula and reaching the Korean peninsula around 1500 BCE. This dispersal is associated with the Mumun archaeological culture after 1500 BCE in the Korean peninsula and the Yayoi culture after 950 BCE in the Japanese archipelago. From a linguistic standpoint, it is associated with the entry of the Japonic language family, first into the Korean peninsula, subsequently into the Japanese archipelago. The arrival of Koreanic is associated with the advent of the Korean-style bronze dagger culture and a temporary hiatus in wet rice agriculture sites around 300 BCE. Both Koreanic and Japonic are relatively shallow language families, with Koreanic the shallower of the two, consistent with the chronology above. The gap between the earliest linguistically motivated dates for these language families and the archaeological events is the result of a linguistic founders effect, providing further evidence for demic diffusion as a source for their distribution.
Computer-based automatically generated text is used in various applications (e.g., text summarization, machine translation) and has come to play an important role in daily life. However, computer-generated text may produce confusing information due to translation errors and inappropriate wording caused by faulty language processing, which could be a critical issue in presidential elections and product advertisements. Previous methods for detecting computer-generated text typically estimate text fluency, but this may not be useful in the near future due to the development of neural-network-based natural language generation that produces wording close to human-crafted wording. A different approach to detecting computer-generated text is thus needed. We hypothesize that human-crafted wording is more consistent than that of a computer. For instance, Zipf's law states that the most frequent word in human-written text has approximately twice the frequency of the second most frequent word, nearly three times that of the third most frequent word, and so on. We found that this is not true in the case of computer-generated text. We hence propose a method to identify computer-generated text on the basis of statistics. First, the word distribution frequencies are compared with the corresponding Zipfian distributions to extract the frequency features. Next, complex phrase features are extracted because human-generated text contains more complex phrases than computer-generated text. Finally, the higher consistency of the human-generated text is quantified both at the sentence level using phrasal verbs and at the paragraph level using coreference resolution relationships, which are integrated into consistency features. The combination of the frequencies, the complex phrases, and the consistency features was evaluated for 100 English books written originally in English and 100 English books translated from Finnish. The results show that our method achieves better performance (accuracy = 98.0%; equal error rate = 2.9%) compared with the most suitable method for books using parsing tree feature extraction. Evaluation using two other languages (French and Dutch) showed similar results. The proposed method thus works consistently in various languages.
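The Zipf comparison described in this abstract can be illustrated with a small sketch. It is not the authors' feature extractor: whitespace tokenization, the `top_k` cutoff, the relative-deviation score, and the function name `zipf_deviation` are all assumptions. The idea is only to measure how far the observed rank-frequency curve departs from the ideal f(r) = f(1) / r.

```python
from collections import Counter


def zipf_deviation(text, top_k=10):
    """Mean relative deviation of the top-k word counts from an ideal Zipf curve.

    Under Zipf's law the word of rank r has a count of about f1 / r, where f1
    is the count of the most frequent word. Returns 0.0 for a perfect fit.
    Assumption: lowercased, whitespace-split tokens (no real tokenizer).
    """
    words = text.lower().split()
    counts = [c for _, c in Counter(words).most_common(top_k)]
    if not counts:
        return 0.0
    f1 = counts[0]
    devs = [abs(c - f1 / r) / (f1 / r) for r, c in enumerate(counts, start=1)]
    return sum(devs) / len(devs)


# A toy text whose counts (6, 3, 2) follow f1 / r exactly, and a flat one.
zipf_like = " ".join(["a"] * 6 + ["b"] * 3 + ["c"] * 2)
flat = "a b c"
print(zipf_deviation(zipf_like), zipf_deviation(flat))
```

A score like this would be one input among several; the abstract also combines complex-phrase and consistency features, which this sketch does not attempt.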
This paper proposes a backup resource allocation model that provides a probabilistic protection for primary physical machines in a cloud provider to minimize the required total capacity. When any random failure occurs, workloads are transferred to preplanned and dedicated backup physical machines for prompt recovery. In the proposed model, a probabilistic protection guarantee is introduced to prevent the cloud provider from capacity overbooking. We apply robust optimization in our model to formulate the backup resource allocation problem as an integer linear programming problem. A simulated annealing heuristic is adopted to solve the same optimization problem when the cloud provider is large. Finally, the results reveal that the required backup capacity depends on the reliability of primary physical machines. Specifically, the more the resources in primary physical machines share backup capacity when the failure probabilities of primary physical machines are sufficiently small, the less capacity is required for backup resource allocation.