



1. Introduction
 The purpose of this research is to clarify the human vocal mechanism from an engineering viewpoint by reproducing vocal movement with a talking robot, and to build a dynamic model of that mechanism. A mechanical talking robot lets us experiment on human speech quantitatively. Such a model could lead to cellular phones that compress data by transmitting vocal movement instead of the voice itself. It could also lead to medical training devices for people with speech impairments and to learning devices for foreign languages.



2. History of Waseda Talker Series

2-1. WT-1 (Waseda Talker No.1) (in 2000)

We developed the talking robot WT-1 (Waseda Talker No.1) to reproduce human vocal movement. Like a human, WT-1 has vocal organs (1-DOF (degrees of freedom) lungs and 1-DOF vocal cords) and articulators (a 6-DOF tongue, 4-DOF lips, 1-DOF teeth, a nasal cavity and a 1-DOF soft palate). The total DOF is 15. Its dimensions are about 1.2 to 1.3 times those of an adult male. WT-1 could speak the Japanese vowels (/a/, /i/, /u/, /e/, /o/).



2-2. WT-1R (Waseda Talker No.1 Refined) (in 2001)

We developed another talking robot, WT-1R, which improved on WT-1 in order to produce consonant sounds. Like a human, WT-1R has vocal organs (1-DOF lungs and 1-DOF vocal cords) and articulators (a 6-DOF tongue, 4-DOF lips, 1-DOF teeth, a nasal cavity and a 1-DOF soft palate). The total DOF is 15. WT-1R could speak the Japanese vowels (/a/, /i/, /u/, /e/, /o/) and some consonant sounds (/s/, /h/, /m/, /p/), as well as the word "Waseda".

2-3. WT-2 (Waseda Talker No.2) (in 2002)

We developed a new talking robot, WT-2 (Waseda Talker No.2), to produce natural, human-like voices. WT-2 has 1-DOF lungs, 3-DOF vocal cords, a 5-DOF tongue, 4-DOF lips, 1-DOF teeth, a nasal cavity and a 1-DOF soft palate; the total DOF is 15. The vocal tract is about 175[mm] long, almost the same as an adult male's. Compared with the previous robots (WT-1 and WT-1R), WT-2 could speak the Japanese vowels more clearly and produce all Japanese consonant sounds.

2-4. WT-3 (Waseda Talker No.3) (in 2003)
We developed a new, advanced talking robot, WT-3 (Waseda Talker No.3), which improved on WT-2. WT-3 consisted of vocal organs (1-DOF lungs and 3-DOF vocal cords) and articulators (a 7-DOF tongue, 5-DOF lips, 1-DOF teeth, a nasal cavity and a 1-DOF soft palate), and could reproduce human-like articulatory motion; the total DOF was 18. WT-3 could produce vowels more clearly, and could produce stops, fricatives and nasal sounds by means of new flexible mechanisms that reproduced the human vocal tract area, along with its other mechanisms.


2-5. WT-4 (Waseda Talker No.4) (in 2004)

We developed a new anthropomorphic talking robot, WT-4 (Waseda Talker No.4), which improved on WT-3. WT-4 had a human-like body to make communication with humans easier, and had a total of 19 DOF. We constructed an autonomous control method that lets WT-4 mimic continuous human speech sounds by auditory feedback. In this method, the trajectory of each robot parameter was controlled so that the acoustic parameters generated by the robot were close to those of human speech sounds. The acoustic parameters were pitch, sound power, formant frequencies (the resonant frequencies of the vocal tract, which appear as peaks in the output spectrum), and the timing of the switch between voiced and voiceless sounds.
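The auditory feedback idea described for WT-4 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the control method actually used on the robot: `toy_robot` is a hypothetical linear stand-in for the real hardware, the scale values and the choice of the first two formants are illustrative, the optimizer is plain random-perturbation hill climbing, and the voiced/voiceless timing parameter is left out.

```python
import random

# Acoustic parameters named in the text (pitch, sound power, formants).
# The scale values are illustrative assumptions that make the error terms comparable.
SCALES = {"pitch_hz": 100.0, "power_db": 10.0, "f1_hz": 500.0, "f2_hz": 1500.0}


def acoustic_error(robot, human):
    """Sum of squared, scale-normalised differences between two parameter sets."""
    return sum(((robot[k] - human[k]) / SCALES[k]) ** 2 for k in SCALES)


def toy_robot(params):
    """Hypothetical stand-in for the robot: maps four control values in [0, 1]
    to acoustic parameters. On real hardware these would be measured from sound."""
    lung, cord, jaw, tongue = params
    return {
        "pitch_hz": 80.0 + 120.0 * cord,
        "power_db": 50.0 + 30.0 * lung,
        "f1_hz": 300.0 + 500.0 * jaw,
        "f2_hz": 900.0 + 1400.0 * tongue,
    }


def fit_frame(target, start, steps=2000, step_size=0.05, seed=0):
    """Random-perturbation hill climbing: keep a change only if the error drops."""
    rng = random.Random(seed)
    best = list(start)
    best_err = acoustic_error(toy_robot(best), target)
    for _ in range(steps):
        # Perturb every control value slightly and clamp it to [0, 1].
        trial = [min(1.0, max(0.0, p + rng.uniform(-step_size, step_size)))
                 for p in best]
        err = acoustic_error(toy_robot(trial), target)
        if err < best_err:
            best, best_err = trial, err
    return best, best_err


# Hypothetical target values for one frame of human speech.
target = {"pitch_hz": 120.0, "power_db": 65.0, "f1_hz": 700.0, "f2_hz": 1200.0}
start = [0.5, 0.5, 0.5, 0.5]
start_err = acoustic_error(toy_robot(start), target)
params, final_err = fit_frame(target, start)
print("error before:", start_err, "after:", final_err)
```

Repeating this fit for each successive frame of a recorded utterance would give one trajectory per robot parameter, which is the quantity the text says was controlled.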

2-6. WT-5 (Waseda Talker No.5) (in 2005)

We developed a new anthropomorphic talking robot, WT-5 (Waseda Talker No.5). WT-5 consisted of vocal organs (1-DOF lungs and 3-DOF vocal cords) and articulators (a 7-DOF tongue, 5-DOF lips, 1-DOF teeth, a nasal cavity and a 1-DOF soft palate), and could reproduce human-like articulatory motion; the total DOF was 18. We developed mechanical lips and vocal cords similar in size and biomechanical structure to a human's. We constructed an autonomous control method that lets WT-5 mimic continuous human speech sounds, including consonant sounds, by sensory and auditory feedback. In this method, we used tactile and intraoral pressure information to optimize the production of consonant sounds. We also developed efficient optimization methods for auditory feedback using speech recognition software.

2-7. WT-6 (Waseda Talker No.6) (in 2006)

We produced a new anthropomorphic talking robot, WT-6 (Waseda Talker No.6). WT-6 consisted of vocal organs (1-DOF lungs and 5-DOF vocal cords) and articulators (a 5-DOF tongue, 4-DOF lips, a 1-DOF jaw, a nasal cavity and a 1-DOF soft palate), and could reproduce human-like articulatory motion; the total DOF was 17. The vocal tract was about 180[mm] long, almost the same as an adult male's. WT-6 had an independent jaw opening/closing mechanism and a three-dimensional tongue and vocal cavity made of the thermoplastic rubber Septon (R) by Kuraray. The vocal cord model was also improved by adding a new pitch control mechanism.

2-8. WT-7 (Waseda Talker No.7) (in 2007)

We developed a new anthropomorphic talking robot, WT-7 (Waseda Talker No.7), that mimics human biological structure. WT-7 consisted of vocal organs (4-DOF vocal cords and 1-DOF lungs) and articulators (a 7-DOF tongue, a nasal cavity, a 1-DOF soft palate, 5-DOF lips and a 1-DOF jaw), and could reproduce human-like articulatory motion; the total DOF was 19. The vocal cords, tongue and face were made of the thermoplastic rubber Septon (R) by Kuraray, a highly flexible and stretchable material. The vocal cord model was also improved by adding a new pitch control mechanism (80[Hz]). By adopting a new linkage mechanism, the tongue could reproduce the shape of the oral cavity more accurately than the previous one.


2-9. WT-7R (Waseda Talker No.7 Refined) (in 2008)

We developed another talking robot, WT-7R, which improved on WT-7 to produce clearer vowels. WT-7R consisted of vocal organs (5-DOF vocal cords and 1-DOF lungs) and articulators (a 7-DOF tongue, a nasal cavity, a 1-DOF soft palate, 4-DOF lips and a 1-DOF jaw), and could reproduce human-like articulatory motion; the total DOF was 19. We constructed the tongue mechanism with higher density. Its deformation of 7[mm] was sufficient for reproducing the vocal tract shape. In addition, the space covered by the elastic tongue was filled with ethylene glycol to improve vocal tract resonance. As a result, the robot could produce clearer vowels; in particular, the bandwidth of the vowel /o/ narrowed by 50[Hz].

2-10. WT-7RII (Waseda Talker No.7 Refined II) (in 2009)

We developed WT-7RII (Waseda Talker No.7 Refined II), which improved on WT-7R with a three-dimensional mechanism for producing bilabial plosives. WT-7RII consisted of vocal organs (4-DOF vocal cords and 1-DOF lungs) and articulators (a 7-DOF tongue, a nasal cavity, a 1-DOF soft palate, 5-DOF lips and a 1-DOF jaw), and could reproduce human-like articulatory motion; the total DOF was 19. The lip mechanism was built to mimic the human lips, pushing and pulling circularly with a 5-link mechanism. Each link was connected to the Septon lip with a vise mechanism. The lips could reproduce total closure as well as the opening areas of the five vowels (from 140 to 840[mm^2]). As a result, WT-7RII could pronounce the bilabial plosive consonant /p/.

    


3. Mechanism
The airflow from the lungs generates sound by vibrating the vocal cords. The articulators (tongue, lips, teeth, nasal cavity and soft palate) then shape that sound.
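As a rough illustration of how the vocal tract shapes the sound, a common textbook approximation treats a neutral vocal tract as a uniform tube closed at the vocal cords and open at the lips. Its resonances then fall at F_n = (2n - 1) c / (4 L). The sketch below applies that formula to the vocal tract lengths reported for WT-2 (about 175[mm]) and WT-6 (about 180[mm]). The speed of sound of 350 m/s and the uniform-tube shape are assumptions made for this example; they do not describe the robots' actual acoustics.

```python
def tube_formants(length_m, count=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Quarter-wavelength formula: F_n = (2n - 1) * c / (4 * L).
    speed_of_sound is an assumed value for warm, moist air.
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]


# Vocal tract lengths quoted in the text for WT-2 and WT-6.
for name, length_mm in (("WT-2", 175.0), ("WT-6", 180.0)):
    formants = tube_formants(length_mm / 1000.0)
    print(name, [round(f) for f in formants])
# WT-2 [500, 1500, 2500]
# WT-6 [486, 1458, 2431]
```

Moving the articulators changes the tube's cross-section along its length, which shifts these resonances away from the uniform-tube values and produces the different vowels.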

WT-7RII (Section)
Vibration of WT-5's Vocal Cords (High-Speed Camera/1000[fps])



4. Demonstration of Talking Robot WT-7RII, WT-7R and WT-7
   Click the following pictures to watch the talking robot movies.

   papipupepo (WT-7RII)
   MPEG 1.02 MB
   aiueo (WT-7R)
   MPEG 1.59 MB
   aiueo (WT-7)
   MPEG 1.35 MB



 
5. Demonstration of Talking Robot WT-5 and WT-4
   Click the following pictures to watch the talking robot movies.
   Human Vocal Mimicry
   aiueo (WT-5)
   MPEG 1.49 MB
   sasisuseso (WT-5)
   MPEG 1.40 MB
   papipupepo (WT-5)
   MPEG 1.19 MB
   Human Vocal Mimicry
   "hassei" (WT-4)
   MPEG 0.92 MB


Acknowledgement
  This research has been supported by a Grant-in-Aid for Scientific Research (KAKENHI). We would like to thank all our co-researchers, Kuraray Co., Ltd., ATR, Okino Industries, Ltd., Chukoh Chemical Industries, Ltd. and SolidWorks Japan K.K. for helping us develop the robot's hardware.


Related Links
Japan Science and Technology Agency (JST)
Core Research for Evolutional Science and Technology (CREST)
"Creating the Brain" Research Project: "Task Planning Mechanism of Speech Motor Control"
SolidWorks Japan K.K.
ATR
Okino Industries, Ltd.
Chukoh Chemical Industries, Ltd.
Kuraray Co., Ltd. (Septon)
Honda Lab (Waseda University)