BAYESIAN FRAMEWORK FOR VOICING ALTERNATION & ASSIMILATION STUDIES ON LARGE CORPORA IN FRENCH

Martine Adda-Decker (1) & Pierre André Hallé (2)
(1) LIMSI/CNRS, bat. 508, F-91403 Orsay cedex; (2) LPP/CNRS, 19 rue des Bernardins, 75005 Paris

ID 1562

This work explores voicing alternation and assimilation on large corpora using a Bayesian framework. We introduce a voicing feature (VF) variable whose value is determined using statistical acoustic phoneme models (three-state Gaussian-mixture hidden Markov models). For all relevant consonants, i.e. oral plosives and fricatives, the surface-form voicing feature is determined by maximising the acoustic likelihood over the competing phoneme models. A voicing alternation (VA) measure counts the number of changes between underlying and surface-form voicing features. On a corpus of 70 h of French journalistic speech, an overall voicing alternation rate of 2.7% was measured, thus calibrating the method's accuracy. The VA rate remains below 2% word-internally and word-initially, and rises to 9% on lexical word endings. In assimilation contexts, rates increase markedly (>20%), highlighting regressive voicing assimilation. The results also show a weak tendency towards progressive devoicing.
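
As an illustration of the VF decision and the VA measure described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each consonant token already comes with two acoustic log-likelihoods, one from the voiceless and one from the voiced phoneme model of the same pair (obtaining these from the HMMs is outside the sketch). The names `Segment`, `surface_is_voiced`, `va_rate`, the phoneme symbols and the numeric scores are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass

# Assumed inventory of voiced French oral plosives and fricatives.
# The symbols are illustrative ("Z" stands for the voiced postalveolar fricative).
VOICED = {"b", "d", "g", "v", "z", "Z"}


@dataclass
class Segment:
    """One consonant token with the scores of its two competing models."""
    underlying: str          # phoneme given by the pronunciation dictionary
    loglik_voiceless: float  # acoustic log-likelihood of the voiceless model
    loglik_voiced: float     # acoustic log-likelihood of the voiced model


def surface_is_voiced(seg: Segment) -> bool:
    """Surface VF: pick the model with the higher acoustic likelihood.

    Ties are resolved as voiceless, which is an arbitrary choice of this sketch.
    """
    return seg.loglik_voiced > seg.loglik_voiceless


def va_rate(segments: list[Segment]) -> float:
    """Fraction of segments whose surface VF differs from the underlying VF."""
    if not segments:
        return 0.0
    changes = sum(
        1 for seg in segments
        if surface_is_voiced(seg) != (seg.underlying in VOICED)
    )
    return changes / len(segments)


# Invented scores for four tokens; only the third one alternates
# (underlying voiced /d/ scored higher by the voiceless model).
segments = [
    Segment("t", loglik_voiceless=-41.0, loglik_voiced=-47.5),
    Segment("z", loglik_voiceless=-60.2, loglik_voiced=-55.1),
    Segment("d", loglik_voiceless=-38.4, loglik_voiced=-39.9),
    Segment("f", loglik_voiceless=-52.0, loglik_voiced=-58.3),
]
print(f"VA rate: {va_rate(segments):.2%}")
```

Restricting `segments` to a given position (word-internal, word-initial, word-final) or to assimilation contexts before calling `va_rate` would give the kind of per-context rates reported above.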
