Session Poster VII:

Poster VII

Type: poster
Chair: Matthias Jilka, Jacques Koreman
Date: Friday - August 10, 2007
Time: 10:00
Room: Poster Area

 

Poster VII-1 Effects of phoneme repetition in spoken utterance generation
Markus Damian, University of Bristol
Nicolas Dumay, University of Bristol
Paper File
  The degree of phonological advance planning in spoken production was investigated with a paradigm in which speakers performed speeded naming responses to coloured line drawings of objects. Colours and object names were chosen such that a phoneme matched, or mismatched, between adjective and noun. A facilitatory effect of repeated phoneme was demonstrated, which was found not only when the phoneme occupied the word-initial position (“green goat”), but also in the central (“black pan”) or word-final (“black monk”) position. These results imply that speakers planned the phonological content of the entire phrase before starting their articulation. A facilitatory effect was additionally found when the repeated phoneme occupied a different position within each word (“green flag”). The latter result suggests that the spoken production system represents segments independently of their position within a word.
Poster VII-3 EMA AND THE CRUX OF CALIBRATION
Andreas Zierdt, Institute of Phonetics and Speech Processing, LMU Munich
Paper File
  Electromagnetic articulography has been a well-established technology for many years. The new AG500 system even allows the investigation of articulatory movements in three dimensions. Still, calibration is crucial for obtaining reliable and accurate data. After a brief look at the mathematical background and a review of previous methods, the present Circal device is discussed. Due to its construction, a basic calibration problem probably remains, since the Circal neglects sensor orientations outside of the x/y-plane. Depending on the sensor's actual position and orientation, a suboptimal calibration can have a mild or dramatic influence on the position calculation, which might even fail. Several approaches to overcoming the calibration problem are conceivable; three types are discussed, which can be characterized as mechanical, physical, and mathematical solutions. Finally, current work on a mathematical solution is briefly presented.
Poster VII-5 SOUND DELETION IN COLLOQUIAL PERSIAN
Shahrbano Suzanne ASSADI, Laboratoire de Phonétique et Phonologie (UMR7018) CNRS/ Sorbonne Nouvelle
Paper File
  Among the differences that distinguish colloquial Persian from the formal variety is the deletion of sounds. This study is based on a corpus of 20 minutes of conversation with three native speakers from Tehran. The results show that 6% of sounds are deleted in colloquial Persian. Consonants are more likely to be deleted than vowels. Among consonants, the dentals are deleted most frequently, and their deletion occurs at the end of the syllable, especially in consonant clusters with the same place of articulation. Deletion affects grammatical words more than lexical ones.
Poster VII-7 Lingual Co-occurrence Constraints in Babbling: An Acoustical Study
Christine L. Matyear, The University of Texas at Austin
Paper File
  Acoustical measurements of F2 transitions of 370 babbled CV sequences showed that places of articulation for consonantal closure correlate highly with vocalic tongue frontness/backness. This finding is interpreted as a confirmation of the MacNeilage & Davis Frame-Content theory of speech development.
Poster VII-9 DYNAMIC PHONETIC DETAIL IN LEXICAL REPRESENTATIONS
Christo Kirov, Department of Linguistics, New York University
Adamantios Gafos, Department of Linguistics, New York University
Paper File
  A dynamical model of phonetic detail is presented. The model is compared to an exemplar-based model, which has been shown to offer an account of (presumed) frequency-dependent lenition processes. The dynamical model is shown to account for the same lenition patterns. However, there is a key difference. In contrast to the exemplar model, the dynamical model is inherently temporal. This provides a handle on the temporal dimension of assembling phonological representations.
Poster VII-11 THE PHONETIC EVOLUTION OF REDUPLICATED EXPRESSIONS: REDUPLICATION, LEXICAL TONES AND PROSODY IN NA (NAXI)
Alexis Michaud, Langues et Civilisations à Tradition Orale, CNRS/ Sorbonne/ Sorbonne Nouvelle
Jacqueline Vaissière, Laboratoire de Phonétique et Phonologie, CNRS/ Sorbonne Nouvelle
Paper File Additional Files
  In Na, a Sino-Tibetan language with lexical tones, some reduplication schemes involve tone change, whereas others consist in full reduplication without tonal change. The synchronic coexistence of these two sets allows for an experimental comparison, which leads to a simple explanation. All the reduplication schemes of Na appear to originate in total reduplication without tone change; the schemes which now involve tone change result from a later evolution: the phonologisation of the effect of intonational boundaries on pitch. A High tone in final position within the reduplicated compound is lowered to Mid; an initial Low tone is raised, also to Mid. We also consider the historical conditions under which the allophonic variation of lexical tones could be reinterpreted as a difference of tonal categories.
Poster VII-13 Affective speech gating
Ioulia Grichkovtsova, CRISCO, Université de Caen
Anne Lacheret, MoDyCo, Université Paris X
Michel Morel, CRISCO, Université de Caen
Virginie Beaucousin, GIN, CNRS UMR 6194, GIP Cyceron
Nathalie Tzourio-Mazoyer, GIN, CNRS UMR 6194, GIP Cyceron
Paper File
  This study tested the hypothesis that emotions may be identified earlier than attitudes in the flow of speech. The gating paradigm was chosen to investigate whether such differentiation between emotions and attitudes was possible. Perception test results included the following variables: the identification point, the isolation point and the confusion matrices. Acoustic analysis was conducted and linked to the perception results. The analysis of the results separated anger and sadness from the other affective states studied. Interestingly, happiness followed the identification pattern found for attitudes. Directions for future work are presented.
Poster VII-15 DIFFERENTIAL HEIGHT SPECIFICATION IN FRONT VOWELS FOR GERMAN SPEAKERS AND TURKISH-GERMAN BILINGUALS: AN ELECTROENCEPHALOGRAPHIC STUDY
Silvia Lipski, University of Konstanz
Aditi Lahiri, University of Konstanz
Carsten Eulitz, University of Konstanz
Paper File
  Despite similar phonetics, phonological analyses suggest a differential tongue height specification of the vowels /i/ and /e/ in Turkish and German. This was tested by use of the mismatch negativity (MMN), an automatic change detection response of the brain, which was recorded for Turkish-German bilinguals and German listeners. Our results support the predictions about the differential specification of tongue height features, i.e. in Turkish /e/ is specified for [LOW] and not underspecified as in German; whereas /i/ is underspecified for height in Turkish and specified for [HIGH] in German.
Poster VII-17 SEX-SPECIFIC DIFFERENCES IN f0 AND VOWEL SPACE
Adrian P. Simpson, Friedrich-Schiller-Universität Jena
Christine Ericsdotter, Friedrich-Schiller-Universität Jena
Paper File
  It has been suggested that the larger area of the average female acoustic vowel space is a consequence of compensating for poorer harmonic sampling of the spectral envelope resulting from a higher f0. This predicts that there should be variation in vowel space size within any group of males or females representing sufficient interindividual range of average f0. Inspired by this, the present paper examines whether there is a correlation between a speaker's f0 and the size of the speaker's F2xF1 vowel space. A highly significant correlation between f0 and vowel space size is found in the female group of a sample of 87 German students. However, no such correlation is found between f0 and the Euclidean distance between same-speaker tokens of /e:/ and /a:/.
Poster VII-19 Spectral and durational properties of vowels in Kunwinjku
Janet Fletcher, University of Melbourne
Hywel Stoakes, University of Melbourne
Deborah Loakes, University of Melbourne
Andrew Butcher, Flinders University
Paper File
  In this paper we investigate the spectral properties of vowels in a Northern Australian language, Kunwinjku. The language illustrates typical vowel dispersion patterns of other languages of the region, and of 5-vowel languages in general. The spectral properties of vowels suggest a system of sufficient dispersion, with phonemic close vowels being realized in the close/mid-close region of the vowel space and with a general anchoring of the system by an open central vowel. Vowel height also interacts with vowel segment duration, with open vowels being generally longer than relatively close vowels.
Poster VII-21 On the acoustic characteristics of French schwa
Cécile Fougeron, Laboratoire de Phonétique et Phonologie, UMR7018 CNRS-Sorbonne Nouvelle
Cédric Gendrot, Laboratoire de Phonétique et Phonologie, UMR7018 CNRS-Sorbonne Nouvelle
Audrey Bürki, Laboratoire de Phonétique et Phonologie, UMR7018 CNRS-Sorbonne Nouvelle and Laboratoire de Psycholinguistique Expérimentale, Université de Genève
Paper File
  This paper presents an acoustic study on the phonetic properties of French schwa based on the analysis of a large corpus of radio broadcast news. In order to address the question of whether the optional status of schwa correlates with a specific phonetic nature, optional French schwa is compared to its neighboring full front rounded vowels /2/ and /9/, and to obligatory schwas. While optional schwa overlaps with the acoustic space of its neighbors (being closer to /2/), it differs from both /2/ and /9/ in terms of aperture, degree of rounding, duration and variability of F2. Optional schwas differ from obligatory schwas by a greater aperture.
Poster VII-23 Sentence-domain effects on tonal alignment in Italian?
Caterina Petrone, Laboratoire Parole et Langage (LPL) & Université de Provence
D. Robert Ladd, University of Edinburgh
Paper File
  In a production experiment, we investigated sentence-domain effects on the alignment of Italian accents, and found that the nuclear peak is aligned earlier in long sentences than in short sentences. These findings are superficially contrary to traditional “time-pressure” explanations for variability in tonal alignment and raise some questions about the domain of pitch gestures. When the effects of sentence duration on speaking rate are taken into account, however, our results may be consistent with much previous work.
Poster VII-25 PHONOLOGICAL CONTEXT EFFECTS FOR VOICING AND DEVOICING IN FRENCH
Isabelle Darcy, University of Tuebingen
Frank Kügler, University of Potsdam
Paper File
  We examine occurrences of categorical assimilation (neutralizations) in French, voiced and unvoiced word-final obstruents, and their perception in different phonological contexts. We first show the categorical nature of the alternation, supported in Exp. 2 by perceptual categorization data. In Exp. 3, the interpretation of this first percept appears to be corrected in certain contexts, inducing compensation. We argue that context effects are phonological in this case, rather than auditory or phonetic. We conclude that linguistic knowledge of alternations is necessary in compensation for categorical assimilation.
Poster VII-27 The effect of incredulity and particle on the intonation of yes/no questions in Taiwan Mandarin
Yu-Ying Chuang, Graduate Institute of Linguistics, National Taiwan University
Yi-Hsuan Huang, Graduate Institute of Linguistics, National Taiwan University
Janice Fon, Graduate Institute of Linguistics, National Taiwan University
Paper File
  This study explored the effect of incredulity and particle on the intonation of yes/no questions in Taiwan Mandarin. Two types of questions were examined: those with and those without the question particle ma. Results showed that, to convey incredulity, overall pitch was raised and pitch range was enlarged. Moreover, questions without particles were significantly higher in pitch and larger in pitch range than questions with particles. This led to the conclusion that the degree of incredulity expressed in questions with ma might not be as great as that in questions without ma.
Poster VII-29 Tone and Quantity in the Limburgian Dialect of Neerpelt
Jörg Peters, Radboud University Nijmegen
Paper File
  The Limburgian dialect of Neerpelt is located in the northwestern corner of an area whose dialects are known for having a lexical tone contrast. It is not clear whether Neerpelt still belongs to the tonal dialects of Limburg, and there are other dialects in northwestern Limburg using a quantity contrast in place of the tonal contrast. To examine whether the dialect of Neerpelt has a tonal contrast, two reading tasks were carried out using tonal minimal pairs from other Limburgian dialects as target words in different prosodic contexts. The results suggest that the dialect of Neerpelt has both pitch differences which cannot be reduced to durational differences and durational differences which cannot be reduced to a quantity contrast. We conclude that the dialect of Neerpelt has a lexical tone contrast comparable to the contrast in other tonal dialects of Limburg.
Poster VII-31 The Continuum of Speech Rhythm: Computational Testing of Speech Rhythm of Large Corpora from Natural Chinese and English Speech
Matthew Benton, The University of Texas at Arlington
Liz Dockendorf, The University of Texas at Arlington
Wenhua Jin, The University of Texas at Arlington
Yang Liu, The University of Texas at Dallas
Jerold Edmondson, The University of Texas at Arlington
Paper File Additional Files
  Past research on the dichotomy of language rhythm classes (stress- vs. syllable-timing) has typically been performed on constructed speech data, e.g. "The North Wind and the Sun" text. Our research goes beyond the previously established speech rhythm studies by combining: (1) a data set of 175 minutes of audio from large corpora of natural English and Chinese speech and (2) natural language processing techniques to compute phonetic segment-statistics. Our findings generally agree with the previous result that Chinese and English fall into distinct rhythm categories. However, when individual speaker data were considered in our analysis, an overlapping continuum across both languages was shown to exist. These results indicate that using "ideal" data to measure speech rhythm does not fully explain the division between languages.
Poster VII-33 Perception and Production in Pitch Accent System of Korean
Jungsun Kim, Indiana University
Kenneth de Jong, Indiana University
Paper File Additional Files
  This research investigates dialectal variation in the pitch accent system of Korean. Specifically, this paper focuses on how speakers of a non-lexical pitch accent dialect are influenced by a lexical pitch accent dialect. In three experiments, participants from two dialectal regions produced pitch accent minimal pairs, and imitated and identified continua spanning pitch accent categories. Results show a general correlation between productions, imitations and identifications in Kyungsang Korean speakers, and clear cases of divergence in Cholla speakers. Identification patterns suggest a variety of categorization schemes in these speakers, while their imitation results consistently indicate a lack of robust categorization.
Poster VII-35 VP Focus and Narrow Focus in Korean
Sun-Ah Jun, UCLA
Hee-Sun Kim, Stanford
Paper File
  According to the Focus Projection theory, a focused word projects its focus to a larger syntactic constituent. When a Verb Phrase (VP) has two arguments (e.g., "gave a boy a book"), focus on the verb-final argument licenses focus on the VP. According to the Information Packaging theory of focus applied to Korean, focus on a theme argument licenses focus on the VP. However, production data on Korean focus support neither theory. Results show that in Korean a VP-initial argument is the most prominent in a sentence with VP focus regardless of the order or the type of the arguments, but is still not as prominent as the VP-initial word receiving narrow focus.
Poster VII-37 FORMANT STRUCTURES OF VOWELS PRODUCED BY STUTTERERS AT NORMAL AND FAST SPEECH RATES
Fabrice HIRSCH, Phonetics Institute of Strasbourg - Speech and Cognition Group
Florence Fauvet, Centre Hospitalier Universitaire de Strasbourg // Phonetics Institute of Strasbourg - Speech and Cognition Group
Véronique FERBACH-HECKER, Phonetics Institute of Strasbourg - Speech and Cognition Group
Marion BECHET, Phonetics Institute of Strasbourg - Speech and Cognition Group
Fayssal BOUAROUROU, Phonetics Institute of Strasbourg - Speech and Cognition Group
Jean STURM, Phonetics Institute of Strasbourg - Speech and Cognition Group
Paper File
  The aim of this study is to analyse the steady-state portion of the first two formants (F1 and F2) in the production of [CVp] sequences containing the vowels [i, a, u], pronounced at two speech rates (normal and fast) by groups of untreated and treated stutterers and control subjects. Comparing data between the three groups of speakers, a reduction of vowel space is observed for stutterers at a normal speaking rate. When speech rate increases, no reduction of vowel space is noticeable for untreated stutterers, in contrast to treated stutterers and controls.
Poster VII-39 DISCRIMINATION OF LEVEL TONES IN CANTONESE-LEARNING INFANTS
Ka Yan Margaret Lei, Language Acquisition Laboratory, The Chinese University of Hong Kong
Paper File
  This is an exploratory study on the perception of Cantonese tones in infants learning Hong Kong Cantonese as their native language. In the study, we examined whether infants aged 6 to 8 months possess the ability to discriminate level tones. Our findings revealed that infants were capable of discriminating at least some of the tonal contrasts in Cantonese. The results showed evidence for a possible relationship between the ease of tone discrimination and the degree of acoustic similarity between the tones. Among the three pairs of Cantonese level tones tested in our study, the pair with the greatest F0 difference, the high-level tone (T1) and the mid-low level tone (T6), was discriminated best compared with the other two tone pairs, which were acoustically closer in terms of F0 values.
Poster VII-41 Speech Perception and Transition of Sound Change
Ching-Pong AU, Laboratoire Dynamique du Langage, CNRS-Lyon2
Paper File
  A dynamic multi-agent model was built in order to simulate language acquisition and sound change in a speech community. The simulation results provide plausible solutions that resolve some controversial issues regarding the sound change implementation such as the discrepancy between the Neogrammarian hypothesis and the lexical diffusion hypothesis. In the simulations, the patterns described by the two seemingly contradictory hypotheses both exist in the implementation of sound changes depending on the consistency of perceptual responses of the speakers in the population.
Poster VII-43 Voice Onset Time and the Scottish Vowel Length Rule in Aberdeen English
Dominic J.L. Watt, University of Aberdeen
Jillian H. Yurkova, University of Aberdeen
Paper File
  Voice Onset Time (VOT) was measured in word-initial /p t k b d g/ in carrier words read from lists by 9 speakers of Aberdeen English (AE). Vowel durations for /i e E a O o u/ and /ai/ were also measured so as to assess the extent to which the Scottish Vowel Length Rule (SVLR) [2] operates in the Aberdeen vowel system.
Poster VII-45 EFFECTS OF LENGTH OF RESIDENCE AND SPEECH ACTIVITIES ON DEGREE OF FOREIGN ACCENT
Xinchun Wang, California State University, Fresno
Jianhong Chen, The University of Shanghai for Science and Technology
Paper File
  A group of native Mandarin-speaking professors teaching in a US university, with a mean length of residence (LOR) of 12 years in North America, was rated as accented as a group of native Mandarin-speaking professors teaching English in China. Different speech activities did not appear to affect degree of accent. However, long excerpts of filtered speech should be used with caution for accent rating.
Poster VII-47 Quantifying the interlanguage speech intelligibility benefit
Hongyan Wang, Dept. of English, Shenzhen University, PR China
Vincent J. van Heuven, Phonetics Laboratory, Leiden University Centre for Linguistics
Paper File Additional Files
  Generally, native listeners of a target language are better at understanding foreign-accented speech than any other type of listener, with one possible exception: if the listener speaks the same mother tongue as the speaker, e.g. when Chinese speakers and listeners communicate in English, the information transfer may be more successful than with a native English listener. We review literature data, and present results of our own in an attempt to come up with the optimal quantification of this so-called interlanguage speech intelligibility effect. We argue that the benefit is best quantified in relative terms, as the residual in a linear model that remains after the main effects of speaker and hearer language background have been included.
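  As a rough illustration of the residual-based quantification described in this abstract (not the authors' own procedure or data), the sketch below fits a main-effects-only model to a speaker-by-listener score table and reports what remains in each cell. The function name `additive_residuals` and all scores are hypothetical and invented for the example; it assumes a complete table with one mean score per cell.

```python
import numpy as np


def additive_residuals(scores):
    """Residuals of a two-way main-effects-only (additive) model.

    scores[i, j] is the mean intelligibility score for speaker group i
    as heard by listener group j. For a complete table with one value
    per cell, the least-squares additive fit is
    grand mean + row effect + column effect, so each residual equals
    x_ij - row_mean_i - col_mean_j + grand_mean.
    """
    scores = np.asarray(scores, dtype=float)
    grand = scores.mean()
    row_means = scores.mean(axis=1, keepdims=True)  # speaker main effect
    col_means = scores.mean(axis=0, keepdims=True)  # listener main effect
    return scores - row_means - col_means + grand


# Hypothetical percent-correct scores, invented for illustration only.
# Rows: speaker language background (Chinese, English).
# Columns: listener language background (Chinese, English).
scores = np.array([[60.0, 50.0],
                   [70.0, 90.0]])

residuals = additive_residuals(scores)
print(residuals)
```

  Under these assumptions, a positive residual in a cell where speaker and listener share a language background would be read as a relative benefit beyond what the two main effects predict.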
Poster VII-49 TEMPO-NORMALIZED MEASUREMENT AND TEST SET DEPENDENCY IN OBJECTIVE EVALUATION OF ENGLISH LEARNERS' TIMING CHARACTERISTICS
Shizuka NAKAMURA, GITI, Language and Speech Science Res. Labs, Waseda University, Japan
Hajime TSUBAKI, GITI, Language and Speech Science Res. Labs, Waseda University, Japan
Yusuke KONDO, School of Education, Language and Speech Science Res. Labs, Waseda University, Japan
Michiko NAKANO, School of Education, Language and Speech Science Res. Labs, Waseda University, Japan
Yoshinori SAGISAKA, GITI, Language and Speech Science Res. Labs, Waseda University, Japan
Paper File
  In this paper, we present experimental results on tempo-normalized measurements and sentence sets for objective evaluation of English speech timing by Japanese learners. Phone-independent and phone-dependent tempo normalizations were compared using raw duration differences between learners and native speakers. To observe the effect of test sentence differences, sentence length was adopted as a criterion. In the experiments, high correlations between subjective judgments and normalized duration differences showed a remarkable advantage of phone-dependent normalization. Large correlation differences between long sentences and short sentences indicated the need for careful choice of test materials. Subjective score estimation by linear regression performed better using long sentences and duration differences with phone-dependent normalization than the conventional approach using all sentences and duration differences without normalization.
Poster VII-51 The perception of Italian and Spanish lexical stress: a first cross-linguistic study
Iolanda Alfano, Università degli Studi di Salerno - Universitat Autònoma de Barcelona
Joaquim Llisterri, Universitat Autònoma de Barcelona
Renata Savy, Università degli Studi di Salerno
Paper File
  A preliminary experiment studying the perception of lexical stress in isolated Italian words by Spanish subjects was carried out to identify possible cross-linguistic differences between closely related languages. The results show that there is a combined effect of native language expectations and acoustic information present in the signal.
Poster VII-53 AUDITORY-PERCEPTUAL IDENTIFICATION OF VOICE QUALITY BY EXPERT AND NON-EXPERT LISTENERS
Olaf Köster, Bundeskriminalamt
Michael Jessen, Bundeskriminalamt
Freshta Khairi, University of Bonn
Hartwig Eckert, University of Flensburg
Paper File
  In a perception task, 13 types of voice quality were to be identified by two listener groups. Expert listeners with a professional background in forensic phonetics performed significantly better than the non-expert group. Furthermore, the non-experts produced more heterogeneous types of error. For prominent types of voice quality and stimuli with a strong scalar degree, low error rates were observed for the experts.
Poster VII-55 Influences of Pitch and Speech Rate on the Perception of Age from Voice
Ralf Winkler, Technical University Berlin
Paper File
  Listeners are able to rate a speaker's age with reasonable accuracy. Although several speech features are known to be characteristic of specific age groups, less is known about the perceptual relevance of those parameters. This paper describes the results of a perception study in which single-word stimuli were synthesized and rated for perceived age by 20 listeners. All combinations of pitch and speech rate were synthesized with male and female voices. Results show that speech rate had the largest impact on listeners' judgements. Although pitch variations alone did not show a large impact on listeners' judgements, significant differences between selected pitch levels at slow and fast speech exist. Our results contribute to the identification of the relevant features signaling a speaker's age. Results further support the assumption that a set of parameters almost always interacts in signaling a speaker's age.
Poster VII-57 IDENTIFYING AND EVALUATING APRAXIC SPEECH DEFICITS USING MAGNETOMETRY
Dani Byrd, USC Department of Linguistics; Haskins Laboratories
Katherine S. Harris, Haskins Laboratories, New Haven, CT
Paper File
  An understanding of the relationship of speech and language symptoms to lesions in the frontal region of the dominant hemisphere depends on a fuller description of the speech phenomena than can be provided by transcriptional or acoustic investigation alone. This paper provides examples of how articulatory movement tracking can aid in describing apraxic speech deficits.
Poster VII-59 A PHONETIC AND PHONOLOGICAL STUDY OF SO-CALLED ‘BUCCAL’ SPEECH PRODUCED BY TWO LONG-TERM TRACHEOSTOMISED CHILDREN
Harveen Khaila, City & Hackney Teaching PCT
Jill House, Dept of Phonetics & Linguistics, UCL
Lesley Cavalli, Speech and Language Therapy Dept, Great Ormond Street Hospital (GOSH)
Elizabeth Nash, Speech and Language Therapy Dept, Great Ormond Street Hospital (GOSH)
Paper File
  Analysis of the ‘buccal’ speech spontaneously developed by two long-term tracheostomised children reveals speaker-specific strategies for setting air in motion, for generating a source of sound to replace normal voice, and for articulating vowels and consonants. The implications for communicating phonological contrasts are discussed.
Poster VII-61 CORRELATES OF TEMPORAL HIGH-RESOLUTION FIRST FORMANT ANALYSIS AND GLOTTAL EXCITATION
Manfred Pützer, Institut für Phonetik, Universität des Saarlandes, Saarbrücken
Wolfgang Wokurek, Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart
Paper File
  This preliminary study visualizes the glottal excitation in a temporally highly resolved estimate of the first formant. Instantaneous estimates of the frequency and bandwidth of the first formant closely follow the electroglottographic contour. This is demonstrated for modal, breathy, and hoarse phonation of an [a:] produced by one male and one female speaker. The temporally highly resolved formant contours show glottal features such as the different durations of the open phase and fundamental frequency and/or amplitude perturbations of the vocal fold vibration. Keywords: linear prediction, electroglottography
Poster VII-63 The Phonetics of Emphasis
Klaus J. Kohler, Institute of Phonetics and Digital Speech Processing (IPDS), Christian-Albrechts-University, Kiel
Oliver Niebuhr, Institute of Phonetics and Digital Speech Processing (IPDS), Christian-Albrechts-University, Kiel
Paper File
  Research is reported in a framework linking phonetic exponents to communicative functions. From the heterogeneous field of ‘emphasis’, two areas are selected: ‘positive/negative expressive intensification’ of verbal meaning, e.g. it’s delicious! vs it stinks! German data are collected in controlled monologues and dialogues. On the hypothesis that ‘positive emphasis’ strengthens sonority, ‘negative emphasis’ weakens it, aspects of f0, acoustic energy, duration, voice quality are tested statistically.
Poster VII-65 AN INCREMENTAL ANALYSIS OF DIFFERENT FEATURE GROUPS IN SPEAKER INDEPENDENT EMOTION RECOGNITION
Marko Lugger, Chair of System Theory and Signal Processing, University of Stuttgart
Bin Yang, Chair of System Theory and Signal Processing, University of Stuttgart
Paper File
  This paper investigates the classification of different emotional states using speech features from different feature groups. We use both suprasegmental feature groups like pitch, energy, and duration and segmental feature groups like voice quality, zero crossing rate, and articulation. We want to exploit the selection of the most relevant features from these different feature groups to get a better understanding of the speaker independent emotion recognition. We study how these different feature groups overlap or complement each other. By using the sequential floating forward selection algorithm (SFFS), feature subsets maximizing the classification rate will be generated. For this purpose, we use a Bayesian classifier and a speaker independent cross validation. A detailed study is also done on the relevance of the feature groups for classifying different emotion dimensions known from the psychological emotion research.
Poster VII-67 AN ACOUSTIC DESCRIPTION OF HIGH VOWEL SYNCOPE IN LEZGIAN
Ioana Chitoran, Dartmouth College
Ayten Babaliyeva, Ecole Pratique des Hautes Etudes
Paper File Additional Files
  This paper reports on a preliminary acoustic description of high vowel syncope in one dialect of Lezgian, a NE Caucasian, Daghestanian language. Acoustic data from one speaker confirm the absence of a vowel in the syncope context, but traces of it remain visible (and audible) in the preceding stop release or fricative noise. This raises the question of possible vowel devoicing. It also suggests that a relevant account for the facts should be based on gestural overlap rather than deletion. In support of this hypothesis, two types of measurements are reported. First, vowel duration shows that even non-high vowels are considerably shortened when stress is shifted away from them, participating in a similar process as high vowels. Second, the duration of the inter-burst interval in resulting stop sequences varies depending on the stop place of articulation.
Poster VII-69 Language effects on the degree of visual influence in audiovisual speech perception
Yuchun Chen, Dept of Human Communication Sciences, UCL
Valerie Hazan, Dept of Phonetics and Linguistics, UCL
Paper File Additional Files
  This study investigated language factors in the use of visual information in speech perception in Mandarin-Chinese, Thai, Japanese and English, languages differing in their use of tone information. Adult participants were presented with the stimuli /ba/, /da/, /ga/ spoken by two English and two Mandarin-Chinese speakers. A syllable identification task was presented in auditory, visual and audiovisual (congruent and incongruent) conditions in clear and in noise. Chinese listeners used visual information in audiovisual speech processing to the same extent as English listeners, and the magnitude of the McGurk effect was the same across both groups in the noisy condition. Japanese and Thai participants showed a stronger McGurk effect in the clear condition, which might be caused by the foreign-language effect, as all speakers were non-native for them. The hypothesis that a lower reliance on visual cues is found for tone languages is not supported by these results.
