Multilingual Speech Database 2002

Contains 7 languages, all spoken by native speakers.

Benefits / Features


The best product for speech recognition systems in multiple languages!

This database contains recordings in 7 languages, including English, French, and Japanese.
Each sentence is spoken by multiple speakers. (The number of speakers varies by language.)


Specifications / Details



All speakers are native speakers of their respective languages. The speakers used their native accents, which accounts for the differences in accent among the recordings. All vocalized noises that the speakers produced during the recording sessions from swallowing and/or breathing have been preserved.


The recordings were made in a soundproof room using a low-noise precision condenser microphone. The active speech level was equalized to 26 dB below the overload point (-26 dB relative to overload), based on the algorithm in ITU-T Recommendation P.56.
Because the frequency components of the speech samples are preserved up to 8 kHz (the recording system, including the microphone, has a flat response up to 12 kHz), the database can be used to evaluate not only conventional 3.4 kHz telephone systems but also 7 kHz wideband telephone systems using ISDN (ITU-T Rec. G.722).
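As a rough illustration of the level figure above, the following Python sketch measures the plain RMS level of 16-bit samples in dB relative to the overload point and computes the gain that would bring that level to -26 dB. This is not the ITU-T P.56 algorithm: P.56 measures the active speech level and excludes pauses, whereas this sketch averages over every sample. The function names and the choice of 32768 as the overload amplitude are assumptions made for illustration.

```python
import math

FULL_SCALE = 32768.0   # assumed overload amplitude of a 16-bit signed sample
TARGET_DB = -26.0      # target level relative to the overload point


def rms_level_db(samples):
    """Plain RMS level in dB relative to the overload point.

    Averages over all samples, pauses included, so it is only a
    stand-in for the P.56 active speech level, not an implementation of it.
    """
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (FULL_SCALE ** 2))


def gain_to_target(samples, target_db=TARGET_DB):
    """Linear gain that would move the RMS level to target_db."""
    level = rms_level_db(samples)
    if level == float("-inf"):
        raise ValueError("signal is silent; no finite gain reaches the target")
    return 10.0 ** ((target_db - level) / 20.0)
```

Because real speech contains pauses, the plain RMS level of a recording will generally read lower than its P.56 active speech level, so a file equalized to -26 dB by P.56 would not be expected to measure exactly -26 dB here.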


In accordance with ITU-T Rec. P.800, short non-technical sentences were compiled from newspapers and magazines. Some of the sentences were slightly modified to shorten them. We cannot guarantee the linguistic accuracy that may be expected in language research and learning.


All speech signals are recorded on CD-ROM disks as PC binary files (little-endian byte order).
Sampling rate: 16 kHz. Amplitude resolution: 16 bits.
(Remarks: The disks CANNOT be played on standard audio CD players. All data can be read by PCs or workstations with a built-in CD drive.)
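The file format described above can be read with a few lines of Python. The sketch below is a minimal example, assuming the files are headerless, single-channel, and hold signed 16-bit samples; the listing states only the byte order, sampling rate, and resolution, so those three points, along with the function names, are assumptions.

```python
import array
import sys

SAMPLE_RATE_HZ = 16_000  # sampling rate stated in the specification
BYTES_PER_SAMPLE = 2     # 16-bit amplitude resolution


def read_raw_pcm(path):
    """Read a headerless file of signed 16-bit little-endian samples."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) % BYTES_PER_SAMPLE:
        raise ValueError("file length is not a whole number of 16-bit samples")
    samples = array.array("h")  # 'h' = signed 16-bit integer, native byte order
    samples.frombytes(data)
    if sys.byteorder == "big":
        samples.byteswap()      # data is little-endian; swap on big-endian hosts
    return samples


def duration_seconds(samples):
    """Duration implied by the sample count at 16 kHz."""
    return len(samples) / SAMPLE_RATE_HZ
```

Under these assumptions, a four-second file holds 4 × 16,000 = 64,000 samples, or 128,000 bytes.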

Language            Speakers                 Sentences
American English*1  Male/Female No.1 ~ 5     300 sentences
                    Male/Female No.6 ~ 10    170 sentences
British English*2   Male/Female No.1 ~ 5     200 sentences
French*2            Male/Female No.1 ~ 5     200 sentences
German*2            Male/Female No.1 ~ 5     200 sentences
Japanese*2          Male/Female No.1 ~ 5     200 sentences
Spanish             Male/Female No.1 ~ 5     150 sentences
Chinese Mandarin    Male/Female No.1 ~ 5     180 sentences
Cantonese           Male/Female No.1 ~ 5     100 sentences

With a few exceptions, each file is approximately four seconds long.

*1: 48 sentences, each pronounced by 10 male and 10 female speakers.
*2: In the English, French, German, and Japanese samples, all speakers pronounce the same set of sentences.


Multilingual Speech Database 2002: Please contact us.

