in terms of attenuation is demonstrated to be strongly dependent upon the forcing frequency and the location and nature of the error sensors. For example, the use of the accelerometers located close to the control actuators, while providing high local reduction, can lead to an increase of the global response. [Work supported by NASA Langley and DARPA/ONR.]

TUESDAY AFTERNOON, 30 APRIL 1991

INTERNATIONAL B, 1:00 TO 4:15 P.M.

Session 3SP

Speech Communication: Speech Processing

Astrid Schmidt-Nielsen, Chair

Code 5532, Naval Research Laboratory, Washington, DC 20375-5000

Contributed Papers

1:00

3SP1. Profiling vectors for speaker identification. Harry Hollien and Ming Jiang (Linguistics and Inst. for Adv. Study of the Commun. Processes, Univ. of Florida, Gainesville, FL 32611)

When speaker verification is the issue of interest, it is possible to focus on signal analysis irrespective of the speech-related features it contains. Such approaches are appropriate in this case because system distortions are minimal, noise is low, talkers are cooperative, and very sophisticated equipment is available. Not so for speaker identification. Here extensive channel and speaker distortions (including noise) can be expected; speech is noncontemporary and speakers are usually uncooperative. Hence, the signal is so distorted or masked that the usual processing techniques cannot be expected to be very useful. The approach to speaker identification demonstrated in this paper is threefold. First, it is assumed that the signal contains speech features that are robust (i.e., resistant to noise and distortion) and unique to the talker. These idiosyncrasies are based on the speaker's anatomy, physiology, and habitual communicative patterns. Second, it is postulated that, while there may be no single attribute within a person's speech/voice that would permit them to be differentiated from all other speakers under any set of conditions, the simultaneous use of a large series of feature analyses may permit identification. Finally, it has become possible to reduce bias among the vectors by the normalization of data. In turn, this approach leads to a very effective two-dimensional profile wherein the unknown speaker must first be identified and then comparisons made to known talkers. A system of this type has been structured and tested; it is based on four natural speech vectors, each containing 20-40 parameters. Data regarding this general approach and these vectors have been reported previously. This presentation will focus on the effects (on efficient speaker identification in the field) of normalizing the vector data and reducing it to a two-dimensional profile.
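As an illustration only, the idea of normalizing data so that no single vector biases the combined decision can be sketched as follows. The abstract does not state which normalization the system uses; the z-score scaling, the summed score, and the names `normalize_scores` and `rank_candidates` are all assumptions made for this sketch, and it does not attempt to reproduce the two-dimensional profile.

```python
from statistics import mean, pstdev


def normalize_scores(distances):
    """Z-score one vector's distances across all candidate talkers (assumed scheme)."""
    mu = mean(distances)
    sigma = pstdev(distances)
    if sigma == 0:
        # All candidates are equally distant on this vector: it carries no information.
        return [0.0 for _ in distances]
    return [(d - mu) / sigma for d in distances]


def rank_candidates(distances_by_vector, names):
    """Sum normalized distances over vectors; the smallest total ranks first."""
    normalized = [normalize_scores(row) for row in distances_by_vector]
    totals = [sum(column) for column in zip(*normalized)]
    return sorted(zip(names, totals), key=lambda pair: pair[1])


# Hypothetical distances from an unknown talker to three known talkers,
# one row per speech vector. The second vector has a much larger raw scale.
example = [[1.0, 2.0, 3.0], [300.0, 100.0, 200.0]]
ranking = rank_candidates(example, ["A", "B", "C"])
```

Without the normalization step, the second row would dominate a plain sum simply because its raw values are larger.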

1:15

3SP2. Vowel formant tracking for speaker identification. Ming Jiang and Min Shi (Inst. for Adv. Study of the Commun. Proc., Univ. of Florida, Gainesville, FL 32611)

The first two or three spectral peaks, or formants, are crucial in determining the vowel quality. In turn, accurate determination of vowel formant quality is important to an effective speaker identification task. A vowel formant tracking vector (VFT) was developed for the speaker identification (SAUSI) profile. Specifically, the speech spectrum is obtained frame-by-frame by using an LPC algorithm, with the first three formant frequencies calculated for each frame. The underlying assumption was that the vowels will exhibit a contiguous formant frequency transition from frame to frame and, hence, can be separated from consonants for the cited formant measurements. In order to carry out this task, the frequency range 0-5000 Hz is divided into 34 semitone bins and three histograms are obtained for the first three vowel formants. In turn, these histograms provide an estimation of the general quality of the vowels spoken by each speaker being evaluated. The result is that the interspeaker differences are large enough to permit identification of the target speaker while the intraspeaker differences are fairly small even for text-independent speech. The algorithm utilized will be presented, as will data demonstrating that this VFT vector is robust enough to effectively perform the speaker identification task.
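The histogram step can be sketched roughly as below. This is a minimal illustration, not the authors' algorithm: it assumes the per-frame formant frequencies (F1, F2, F3) are already available from some LPC analysis, and it uses 34 logarithmically spaced bins. Because a logarithmic (semitone-like) scale cannot start at 0 Hz, the lower edge `low_hz=100.0` is a hypothetical choice, as are the function names.

```python
import math


def log_bin(freq_hz, low_hz=100.0, high_hz=5000.0, n_bins=34):
    """Return the log-spaced bin index for freq_hz, or None if it is out of range."""
    if not (low_hz <= freq_hz < high_hz):
        return None
    position = math.log(freq_hz / low_hz) / math.log(high_hz / low_hz)
    return int(position * n_bins)


def formant_histograms(frames, low_hz=100.0, high_hz=5000.0, n_bins=34):
    """Build one normalized histogram per formant from (F1, F2, F3) frames."""
    counts = [[0] * n_bins for _ in range(3)]
    for frame in frames:
        for k, freq in enumerate(frame):
            index = log_bin(freq, low_hz, high_hz, n_bins)
            if index is not None:
                counts[k][index] += 1
    histograms = []
    for row in counts:
        total = sum(row)
        # Leave an all-zero row untouched to avoid dividing by zero.
        histograms.append([c / total for c in row] if total else list(row))
    return histograms


# Hypothetical formant tracks (Hz) for three vowel frames.
frames = [(500.0, 1500.0, 2500.0), (500.0, 1500.0, 2500.0), (520.0, 1400.0, 2600.0)]
hists = formant_histograms(frames)
```

Two speakers could then be compared by any distance between their corresponding histograms; the abstract does not say which measure is used.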

1:30

3SP3. SAMREC0: A C30-based reference connected-word recognizer for the evaluation of speech databases. F. Capman and G. Chollet (C.N.R.S. URA 820, Telecom Paris, Dept. Signal, 46 rue Barrault, 75634 Paris Cedex 13, France)

One of the objectives of the ESPRIT-SAM project is the elaboration of speech databases for the evaluation of recognizers. In this framework, a reference system [G. Chollet and C. Cagnoulet, “On the evaluation of speech databases using a reference system,” ICASSP, 1982], based on a dynamic programming algorithm, was modified to accept connected words [G. Chollet and C. Montacie, “Evaluating speech recognizers and databases,” NATO-ASI, 1988]. This software, which is called SAMREC0 by the SAM speech input assessment group, is now implemented using a TI TMS320C30-based PC board, so that it can be used efficiently on the SAM PC-AT workstation. Some results will be presented on the evaluation of the first SAM database EUROMO. This database was recorded in quiet conditions and very few classification errors are observed. Work is under development to simulate noisy conditions using the same database, in order that the limits of the reference or other systems could be measured.
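For readers unfamiliar with dynamic-programming recognizers, a generic dynamic time warping (DTW) template matcher is sketched below. It is not the SAMREC0 implementation: it handles isolated words only, and the Euclidean local distance, the three-way step pattern, and the names `dtw_distance` and `recognize` are assumptions for illustration.

```python
def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""

    def local(u, v):
        # Euclidean distance between two feature vectors (assumed local cost).
        return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5

    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Best predecessor: stretch a, stretch b, or advance both.
            step = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = local(a[i - 1], b[j - 1]) + step
    return cost[n][m]


def recognize(utterance, templates):
    """Return the word whose template has the lowest DTW cost to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Hypothetical one-dimensional feature sequences.
templates = {
    "one": [(0.0,), (1.0,), (2.0,)],
    "two": [(5.0,), (5.0,), (4.0,)],
}
utterance = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
best = recognize(utterance, templates)
```

A connected-word recognizer extends this idea by also searching over word boundaries, which this sketch does not attempt.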

1:45

3SP4. Feature detection using a connectionist network. Gary Bradshaw (Dept. of Psych., Univ. of Colorado, Boulder, CO 80309) and Alan Bell (Univ. of Colorado, Boulder, CO 80309)

1891


J. Acoust. Soc. Am., Vol. 89, No. 4, Pt 2, April 1991


121st Meeting: Acoustical Society of America





