Institute of Linguistics
Facility: Beijing, China
Research output, citation impact, and the most-cited recent papers from Institute of Linguistics (China). Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Institute of Linguistics
The temporal properties of speech appear to play a more important role in linguistic contrasts than has hitherto been appreciated. Therefore, a new framework for describing the acoustic structure of speech based purely on temporal aspects has been developed. From this point of view, speech can be said to be comprised of three main temporal features, based on dominant fluctuation rates: envelope, periodicity, and fine-structure. Each feature has distinct acoustic manifestations, auditory and perceptual correlates, and roles in linguistic contrasts. The applicability of this three-featured temporal system is discussed in relation to hearing-impaired and normal listeners.
This paper discusses the inseparability of culture and language, presents three new metaphors relating to culture and language, and explores cultural content in specific language items through a survey of word associations. The survey was designed for native Chinese speakers (NCS) in Chinese, as well as for native English speakers (NES) in English (see Appendix). The words and expressions associated by NCS convey Chinese culture, and those associated by NES convey English culture. The intimate relationship between language and culture is strikingly illustrated by the survey, which confirms the view that language and culture cannot exist without each other.
An individual's words often reveal their political ideology. Existing automated techniques to identify ideology from text focus on bags of words or wordlists, ignoring syntax. Taking inspiration from recent work in sentiment analysis that successfully models the compositional aspect of language, we apply a recursive neural network (RNN) framework to the task of identifying the political position evinced by a sentence. To show the importance of modeling subsentential elements, we crowdsource political annotations at a phrase and sentence level. Our model outperforms existing models on our newly annotated dataset and an existing dataset.
Emotion is an important element in expressive speech synthesis. Unlike traditional discrete emotion simulations, this paper attempts to synthesize emotional speech by using "strong", "medium", and "weak" classifications. This paper tests three models: a linear modification model (LMM), a Gaussian mixture model (GMM), and a classification and regression tree (CART) model. The linear modification model makes direct modification of sentence F0 contours and syllabic durations from acoustic distributions of emotional speech, such as F0 topline, F0 baseline, durations, and intensities. Further analysis shows that emotional speech is also related to stress and linguistic information. Unlike the linear modification method, the GMM and CART models try to map the subtle prosody distributions between neutral and emotional speech. While the GMM just uses the features, the CART model integrates linguistic features into the mapping. A pitch target model which is optimized to describe Mandarin F0 contours is also introduced. For all conversion methods, a deviation of perceived expressiveness (DPE) measure is created to evaluate the expressiveness of the output speech. The results show that the LMM gives the worst results among the three methods. The GMM method is more suitable for a small training set, while the CART method gives the better emotional speech output if trained with a large context-balanced corpus. The methods discussed in this paper indicate ways to generate emotional speech in speech synthesis. The objective and subjective evaluation processes are also analyzed. These results support the use of a neutral semantic content text in databases for emotional speech synthesis.
Previous neuroimaging studies have suggested that developmental dyslexia has a different neural basis in Chinese and English populations because of known differences in the processing demands of the Chinese and English writing systems. Here, using functional magnetic resonance imaging, we provide the first direct statistically based investigation into how the effect of dyslexia on brain activation is influenced by the Chinese and English writing systems. Brain activation for semantic decisions on written words was compared in English dyslexics, Chinese dyslexics, English normal readers and Chinese normal readers, while controlling for all other experimental parameters. By investigating the effects of dyslexia and language in one study, we show common activation in Chinese and English dyslexics despite different activation in Chinese versus English normal readers. The effect of dyslexia in both languages was observed as less than normal activation in the left angular gyrus and in left middle frontal, posterior temporal and occipitotemporal regions. Differences in Chinese and English normal reading were observed as increased activation for Chinese relative to English in the left inferior frontal sulcus; and increased activation for English relative to Chinese in the left posterior superior temporal sulcus. These cultural differences were not observed in dyslexics who activated both left inferior frontal sulcus and left posterior superior temporal sulcus, consistent with the use of culturally independent strategies when reading is less efficient. By dissociating the effect of dyslexia from differences in Chinese and English normal reading, our results reconcile brain activation results with a substantial body of behavioural studies showing commonalities in the cognitive manifestation of dyslexia in Chinese and English populations. They also demonstrate the influence of cognitive ability and learning environment on a common neural system for reading.
Now that metaphor is recognized as being pervasive in language, it is argued that more attention should be given to the teaching of strategies for comprehending and generating metaphors in L2. In this article we report on a translation exercise undertaken by advanced Polish learners of English which revealed ways in which metaphorical expressions vary between the two languages, and the problems this raises for learners. It is suggested that awareness-raising through discussion and comparison of metaphors in L1 and L2 is a useful approach to help learners to understand and appropriately produce metaphors. This is followed by some sample teaching materials which have been designed to encourage learners to investigate and compare metaphors in L1 and L2.
This article surveys seven ELT texts that are organized around the teaching of functions in order to explicate several problems evident in their presentation of speech acts. A specific speech act sequence, that of complaint/commiseration, is the focus of the analysis. This speech behaviour is highlighted in order to demonstrate the mismatch between data from spontaneous speech, and data that is contrived through the native speaker intuitions of textbook developers. A first problem is that intuition about speech act realization often differs greatly from the way in which naturalistic speech patterns out. Second, it is demonstrated that important information on underlying social strategies of speech acts is often overlooked entirely. A sample lesson on complaining/commiserating based on spontaneous speech is offered, to draw a contrast with the lessons on complaining presented in the texts surveyed.
Many experimental studies over the last two decades have suggested that groups of children who suffer significant delay in reading also show a weakness in phoneme discrimination and identification. In order to look further at the relation between type of reading deficit, auditory acuity, and speech discrimination, a group of 13 children with specific reading difficulty (SRD), 12 chronological-age controls, and 12 reading-age controls were tested on a battery of speech-perceptual, psychoacoustic, and reading tests. A sub-group of children with SRD were poor at speech discrimination tests, whereas the rest of the SRD group performed within norms. For this sub-group, discrimination performance was particularly poor for consonant contrasts differing in a single feature that was not acoustically salient, and problems were encountered with nasal and fricative contrasts as well as with stop contrasts. These children did not differ from controls in their performance on non-speech psychoacoustic tasks. An evaluation is made of the reported phonemic awareness skills of beginning readers with regard to speech-processing issues which may help in understanding what factors are important in reading development.
This article examines the argumentative talk of a preadolescent girls' peer group demonstrating both the co‐construction of microinteractional identities as well as the coproduction of macro‐social identity categories, such as race, class, and gender. Activities of social aggression are performed through embodied styling and stancetaking in the midst of oppositional moves towards a “tagalong” girl. Through transmodal stylization girls openly mock an African American working‐class girl using talk associated with wealthy white “Valley Girls,” while simultaneously producing gestures associated with working‐class black “Ghetto Girls.” Through the use of different communicative modalities girls simultaneously index multiple culturally salient representations. [stance, style, peer group, conflict talk, identity]
The bouba/kiki effect, the association of the nonce word bouba with a round shape and kiki with a spiky shape, is a type of correspondence between speech sounds and visual properties with potentially deep implications for the evolution of spoken language. However, there is debate over the robustness of the effect across cultures and the influence of orthography. We report an online experiment that tested the bouba/kiki effect across speakers of 25 languages representing nine language families and 10 writing systems. Overall, we found strong evidence for the effect across languages, with bouba eliciting more congruent responses than kiki. Participants who spoke languages with Roman scripts were only marginally more likely to show the effect, and analysis of the orthographic shape of the words in different scripts showed that the effect was no stronger for scripts that use rounder forms for bouba and spikier forms for kiki. These results confirm that the bouba/kiki phenomenon is rooted in crossmodal correspondence between aspects of the voice and visual shape, largely independent of orthography. They provide the strongest demonstration to date that the bouba/kiki effect is robust across cultures and writing systems. This article is part of the theme issue ‘Voice modulation: from origin and mechanism to social impact (Part II)’.
In the photogrammetry field, interest in region detectors, which are widely used in Computer Vision, is quickly increasing due to the availability of new techniques. Images acquired by Mobile Mapping Technology, Oblique Photogrammetric Cameras or Unmanned Aerial Vehicles do not observe normal acquisition conditions. Feature extraction and matching techniques, which are traditionally used in photogrammetry, are usually inefficient for these applications as they are unable to provide reliable results under extreme geometrical conditions (convergent taking geometry, strong affine transformations, etc.) and for bad-textured images. A performance analysis of the SIFT technique in aerial and close-range photogrammetric applications is presented in this paper. The goal is to establish the suitability of the SIFT technique for automatic tie point extraction and approximate DSM (Digital Surface Model) generation. First, the performances of the SIFT operator have been compared with those provided by feature extraction and matching techniques used in photogrammetry. All these techniques have been implemented by the authors and validated on aerial and terrestrial images. Moreover, an auto-adaptive version of the SIFT operator has been developed, in order to improve the performances of the SIFT detector in relation to the texture of the images. The Auto-Adaptive SIFT operator (A² SIFT) has been validated on several aerial images, with particular attention to large scale aerial images acquired using mini-UAV systems.
We propose a diagnostic method for probing specific information captured in vector representations of sentence meaning, via simple classification tasks with strategically constructed sentence sets. We identify some key types of semantic information that we might expect to be captured in sentence composition, and illustrate example classification tasks for targeting this information.
The origins of the Indo-European language family are hotly disputed. Bayesian phylogenetic analyses of core vocabulary have produced conflicting results, with some supporting a farming expansion out of Anatolia ~9000 years before present (yr B.P.), while others support a spread with horse-based pastoralism out of the Pontic-Caspian Steppe ~6000 yr B.P. Here we present an extensive database of Indo-European core vocabulary that eliminates past inconsistencies in cognate coding. Ancestry-enabled phylogenetic analysis of this dataset indicates that few ancient languages are direct ancestors of modern clades and produces a root age of ~8120 yr B.P. for the family. Although this date is not consistent with the Steppe hypothesis, it does not rule out an initial homeland south of the Caucasus, with a subsequent branch northward onto the steppe and then across Europe. We reconcile this hybrid hypothesis with recently published ancient DNA evidence from the steppe and the northern Fertile Crescent.
To date, research examining adherence to genetic counseling principles has focused on specific counseling activities such as the giving or withholding of information and responding to client requests for advice. We audiotaped 43 prenatal genetic counseling sessions and used data-driven, qualitative, sociolinguistic methodologies to investigate how language choices facilitate or hinder the counseling process. Transcripts of each session were prepared for sociolinguistic analysis of the emergent discourse that included studying conversational style, speaker-listener symmetry, directness, and other interactional patterns. Analysis of our data demonstrates that: 1) indirect speech, marked by the use of hints, hedges, and other politeness strategies, facilitates rapport and mitigates the tension between a client-centered relationship and a counselor-driven agenda; 2) direct speech, or speaking literally, is an effective strategy for providing information and education; and 3) confusion exists between the use of indirect speech and the intent to provide nondirective counseling, especially when facilitating client decision-making. Indirect responses to client questions, such as those that include the phrases "some people" or "most people," helped to maintain counselor neutrality; however, this well-intended indirectness, used to preserve client autonomy, may have obstructed direct explorations of client needs. We argue that the genetic counseling process requires increased flexibility in the use of direct and indirect speech and provide new insights into how "talk" affects the work of genetic counselors.
This paper presents a set of classification experiments for identifying depression in posts gathered from social media platforms. In addition to the data gathered previously by other researchers, we collect additional data from the social media platform Reddit. Our experiments show promising results for identifying depression from social media texts. More importantly, however, we show that the choice of corpora is crucial in identifying depression and can lead to misleading conclusions in case of poor choice of data.
There is a consensus that the incidental vocabulary learning stemming from reading is an essential complement to the explicit teaching of vocabulary (e.g., Coady & Huckin, 1997; Schmitt & McCarthy, 1997). A major reason for this consensus is that the number of words necessary for effective language use is greater than that which can be taught easily (although see Meara, 1995, 1998, for a rebuttal to the limitations of vocabulary teaching). Estimates for the number of words required vary from about 2,000 for everyday oral ability (Schonell, Meddleton, & Shaw, 1956) to 10,000 or more for reading academic texts (Hazenberg
This paper argues that Mandarin Chinese clauses exhibit the finite/nonfinite contrast, and, based on this discovery, shows that the EPP is the driving force for A‐movement. The evidence is the raising of arguments from the TP complements of different kinds of modals. It is argued that the epistemic modals in Mandarin Chinese take a finite TP complement, whereas the modal hui ‘will’ and the root modals take a nonfinite TP complement. Though the epistemic modals take a finite TP complement, they nonetheless permit subject‐to‐subject raising. This phenomenon can be accounted for if we assume that the EPP drives A‐movement, and agreement blocks it. Mandarin Chinese does not have grammatical features; as a consequence, the subject of a finite clause does not perform checking of grammatical features, and thus is free to raise. This phenomenon, therefore, is evidence against the checking‐based theory of A‐movement. If feature checking is involved in raising in Mandarin Chinese sentences, extra assumptions must be made, with a heavy burden of proof on the checking‐based theory.
The goal of this experiment is to find the most important phonetic features of Dutch accent-lending pitch movements, in terms of shape, pitch level and alignment with the segmental structure. Time pressure is used as a heuristic method to isolate important phonetic aspects of pitch movements, assuming that under time pressure the speaker will preserve those aspects. In a production experiment, accent-lending rises ('1') and falls ('A') were realized under various types of time pressure. The pitch rise is time-compressed under all pressure types, which would mean that the shape of the rise is relatively unimportant. The segmental alignment of the rise proved to be more important: the onset of the rise is synchronized with the syllable onset. For the fall no fixed synchronization point was found, but its shape was relatively invariant, indicating that shape rather than exact timing is the more important feature of the fall.
Four experiments exploring the effects of the coherence of a mental representation of material on reasoning performance are presented. Each employs a simple task that allows most subjects at some stage to solve the problem. We postulate that the crucial factor influencing performance is a unified representation of the material. In Experiment 1 we use an authorization of a kind familiar in daily life, and in Experiments 2, 3 and 4 we use sentences describing simple objects in different ways. In each case performance was enhanced when the material could be given a unified representation.
While it may well be the case that proto‐Volta had a CHVH system which has been reduced to a seven‐vowel system in some daughter languages, there is good reason to doubt whether this reduction process is as common as has been supposed. There are also strong reasons for questioning the claim that the loss of /i/ and /u/ from CHVH systems is motivated by a greater degree of articulatory effort involved in the production of these sounds. In place of this articulation‐based account, I have argued for an alternative explanation, based on the auditory similarity of these vowels to neighboring vowels in the system. This alternative view is able to account for some additional empirical facts that are anomalous under the articulatory view, including the fact that /ɪ/ and /ʊ/ are often extremely common in nine-vowel languages, that [i] and [u] may survive allophonically in languages in which they have been lost as independent phonemes, and the existence of seven-vowel systems in which /e, o/, rather than /i, u/, have been lost. Notes: I would like to thank Peggy MacEachern, Russ Schuh, Keith Snider, Robert Stockwell, and an anonymous reviewer for their helpful remarks and suggestions on this paper.