Last updated 2003.12.15 Mon.

Emotion Expression Humanoid Robot WE-4R
( Waseda Eye No.4 Refined )

1. Objective
We have been developing Emotion Expression Humanoid Robots since 1995 in order to develop new mechanisms and functions for a humanoid robot that can communicate naturally with humans by expressing human-like emotion. In 2003, 9-DOF Emotion Expression Humanoid Arms were developed to improve the emotional expression. The arms were integrated with WE-4 (Waseda Eye No.4) to develop the Emotion Expression Humanoid Robot WE-4R (Waseda Eye No.4 Refined), which can express its emotions using its facial expressions, torso, and arms.

2. Hardware Overview
Fig. 1 and Fig. 2 present the hardware overview of the Emotion Expression Humanoid Robot WE-4R. It has 47-DOF (Arms: 18, Waist: 2, Neck: 4, Eyeballs: 3, Eyelids: 6, Eyebrows: 8, Lips: 4, Jaw: 1, Lungs: 1) and many sensors that serve as sense organs (visual, auditory, cutaneous and olfactory sensation) for extrinsic stimuli. Each part is described below.

Fig. 1 WE-4R (Whole View) Fig. 2 WE-4R (Head Part)

2.1 Eyeballs and Eyelids
The eyeballs have 1-DOF for the pitch axis and 2-DOF for the yaw axis. The maximum angular velocity of the eyeballs is 600[deg/s], similar to a human's. The eyelids have 6-DOF. WE-4 can rotate its upper eyelids, so it can also use the corners of its eyes for expression. The maximum angular velocity of opening and closing the eyelids is 900[deg/s], similar to a human's. Furthermore, the robot can blink within 0.3[s], which is as fast as a human.
For miniaturization of the head part, we newly developed an Eye Unit that integrates the eyeball and eyelid parts. Moreover, in the Eye Unit of WE-4, the eyeball pitch motion is mechanically synchronized with the opening and closing motion of the upper eyelids. Therefore, coordinated eyeball-eyelid motion is achieved in hardware.

2.2 Neck
WE-4's neck has 4-DOF: the upper pitch, lower pitch, roll and yaw axes. WE-4 can stretch and pull its neck using the upper and lower pitch DOF like a human. The maximum angular velocity of each axis is 160[deg/s], similar to a human's.

2.3 Arms and Hands
WE-4R has 9-DOF Emotion Expression Humanoid Arms. Each arm consists of a base shoulder part (pitch and yaw axis), a shoulder part (pitch, yaw and roll axis), an elbow part (pitch axis), and a wrist part (pitch, yaw and roll axis). By using the 2-DOF of the base shoulder, the robot can move the whole shoulder up and down and back and forth. This enables WE-4R to make such movements as squaring its shoulders when angry or shrugging its shoulders when sad. Therefore, WE-4R can express its emotions effectively by using its arms.
As for the hands, they can hold 100[g] using the electromagnets attached to the palms.

2.4 Trunk
Recently we added a waist with 2-DOF, the pitch and yaw axes, to WE-4. With the waist, WE-4 pursues targets using not only coordinated head-eye motion but also coordinated waist-head-eye motion with V.O.R. (vestibulo-ocular reflex). In addition, WE-4 can produce emotional expression with not only its face but also the upper half of its body.
Additionally, we set a lung in the chest of WE-4. Besides drawing in air for olfactory sensation, it displays a breathing motion that adds to the emotional expression.

2.5 Facial Expression Mechanisms
WE-4 expresses its facial expression using its eyebrows, lips, jaw, facial color and voice. The eyebrows consist of flexible sponges, and each eyebrow has 4-DOF.
We used spindle-shaped springs for WE-4's lips. The lips change their shape by being pulled from 4 directions, and WE-4's jaw, which has 1-DOF, opens and closes the lips.
For facial color, we used red and blue EL (Electro Luminescence) sheets. We applied them on the cheeks. WE-4 can express red and pale facial colors.
For the voice system, we used a small speaker that was set in the jaw. The robot voice is a synthetic voice made by LaLaVoice 2001 (TOSHIBA Corporation).

2.6 Sensors
(1) Visual Sensation
WE-4 has two color CCD cameras in its eyes. The images from its eyes are captured on a PC by an image capture board. WE-4 calculates the center of gravity and area of the targets. WE-4 can recognize any color as a target, and it can recognize four targets at the same time. If there are multiple target colors in the robot's view, WE-4 follows the largest target.
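The target measurement described above can be illustrated with a minimal Python sketch. It is not the robot's actual vision code: the per-channel color tolerance `tol`, the function names, and the synthetic frame are all assumptions made for illustration. The sketch computes the area and center of gravity of the pixels matching each registered target color and selects the largest target.

```python
def target_stats(image, color, tol=30):
    """Return (area, centroid) of the pixels within `tol` of `color` per channel.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    The tolerance-based color match is an assumption for illustration.
    """
    area = sum_x = sum_y = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if all(abs(p - c) <= tol for p, c in zip(pixel, color)):
                area += 1
                sum_x += x
                sum_y += y
    if area == 0:
        return 0, None
    # Center of gravity = mean pixel position of the matching region.
    return area, (sum_x / area, sum_y / area)


def pick_largest(image, target_colors, tol=30):
    """Among the registered target colors, return the name and stats of the largest."""
    stats = {name: target_stats(image, rgb, tol) for name, rgb in target_colors.items()}
    name = max(stats, key=lambda n: stats[n][0])
    return name, stats[name]


def paint(frame, x0, y0, x1, y1, color):
    """Fill the rectangle [x0, x1) x [y0, y1) with `color` (test helper)."""
    for y in range(y0, y1):
        for x in range(x0, x1):
            frame[y][x] = color


# Synthetic 60x40 frame with a small red patch and a larger blue patch.
frame = [[(0, 0, 0)] * 60 for _ in range(40)]
paint(frame, 10, 5, 20, 15, (255, 0, 0))   # 100 red pixels
paint(frame, 30, 20, 50, 30, (0, 0, 255))  # 200 blue pixels

colors = {"red": (255, 0, 0), "blue": (0, 0, 255)}
name, (area, centroid) = pick_largest(frame, colors)
print(name, area, centroid)
```

In this sketch the blue patch is selected because its area is larger, which mirrors the rule that the robot follows the largest target when several target colors are in view.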

(2) Auditory Sensation
We used two small condenser microphones for the auditory sensation. WE-4 can localize the sound direction from the loudness and phase differences between the right and left microphones.
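As a rough illustration of two-microphone localization, the Python sketch below estimates the arrival-time difference between the channels by cross-correlation (for a narrow-band sound this corresponds to the phase difference) and the loudness difference as an RMS ratio in dB. The page does not describe the robot's actual algorithm, so the far-field geometry, the 0.15 m microphone spacing, the sample rate, and all function names are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C


def best_lag(left, right, max_lag):
    """Lag in samples that maximizes the cross-correlation.

    A positive lag means the right channel receives the sound later.
    """
    n = len(left)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def rms(signal):
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def localize(left, right, sample_rate, mic_distance):
    """Return (angle in degrees, level difference in dB).

    Positive angle and positive level both point toward the left microphone.
    Assumes a far-field source, so sin(angle) = c * delay / mic_distance.
    """
    max_lag = math.ceil(mic_distance / SPEED_OF_SOUND * sample_rate)
    lag = best_lag(left, right, max_lag)
    delay = lag / sample_rate
    sine = max(-1.0, min(1.0, SPEED_OF_SOUND * delay / mic_distance))
    angle_deg = math.degrees(math.asin(sine))
    level_db = 20.0 * math.log10(rms(left) / rms(right))
    return angle_deg, level_db


# Synthetic test: a 1 kHz tone burst that reaches the right microphone
# 10 samples later and 20% weaker than the left one.
RATE = 48000
N = 400
left = [math.exp(-0.5 * ((i - 150) / 20.0) ** 2)
        * math.sin(2 * math.pi * 1000 * i / RATE) for i in range(N)]
DELAY = 10
right = [0.0] * DELAY + [0.8 * v for v in left[:N - DELAY]]

angle, level = localize(left, right, RATE, 0.15)
print(round(angle, 1), round(level, 2))
```

Both cues agree here: the sound arrives earlier and louder on the left, so the estimated direction is to the left of center.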

(3) Cutaneous Sensation
WE-4 has tactile and temperature sensations among the human cutaneous sensations. We used FSRs (Force Sensing Resistors) for the tactile sensation. An FSR is a thin and light device that can detect even very weak forces. We devised a method for recognizing not only the magnitude of the force but also the manner of touching ("Push", "Stroke" or "Hit") by using a two-layer structure of FSRs. For the temperature sensation, WE-4 has a thermistor. FSRs are also installed on the palms to detect whether they have been touched.

(4) Olfactory Sensation
We used four semiconductor gas sensors for the olfactory sensation and set them in WE-4's nose. WE-4 can recognize the smells of alcohol, ammonia and cigarette smoke.

2.7 System Configuration
Fig. 3 shows the total system configuration of WE-4R. We used three computers (PC/AT compatible) connected to each other by Ethernet. PC1 captures the visual images from CCD cameras and then calculates the center of gravity and brightness of the target, and sends them to PC3. PC2 obtains and analyzes the sounds from microphones using a soundboard and sends them to PC3. PC3 obtains and analyzes the outputs from the olfactory and cutaneous sensations using A/D boards. Then, PC3 determines the mental state according to the stimuli. In addition, PC3 controls all DC motors, the facial color and the voice system according to the visual and mental information.

Fig. 3 System Configuration

3. Facial Expressions
We use Ekman's Six Basic Facial Expressions in the robot's facial control, and have defined seven facial patterns: the "Happiness", "Anger", "Disgust", "Fear", "Sadness", "Surprise", and "Neutral" emotional expressions. The strength of each emotional expression can be varied in fifty grades by proportional interpolation of the difference in position from the "Neutral" emotional expression. The speed of the arm movement is changed according to the emotion of the robot. Therefore, the emotion of the robot can be expressed by both the posture and the speed of the arms. WE-4R has the emotional expression patterns shown in Fig. 4.
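The fifty-grade proportional interpolation can be sketched in Python as follows. The joint names and position values are hypothetical and chosen only to make the arithmetic visible; the page does not give the robot's real actuator targets.

```python
GRADES = 50  # number of strength grades between "Neutral" and the full expression


def interpolate_pose(neutral, full, grade):
    """Pose at strength `grade` (0..50) between the neutral and full expression.

    Each joint moves by grade/50 of its difference from the neutral pose.
    """
    if not 0 <= grade <= GRADES:
        raise ValueError("grade must be between 0 and 50")
    ratio = grade / GRADES
    return {joint: neutral[joint] + ratio * (full[joint] - neutral[joint])
            for joint in neutral}


# Hypothetical joint positions, for illustration only.
neutral = {"brow_inner": 0.0, "lip_corner": 0.0, "jaw": 2.0}
happiness = {"brow_inner": 5.0, "lip_corner": 10.0, "jaw": 12.0}

half_happy = interpolate_pose(neutral, happiness, 25)
print(half_happy)
```

Grade 0 reproduces the "Neutral" pattern and grade 50 reproduces the full expression, so a single grade value sets the strength of any of the defined patterns.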

(a) Happiness (b) Fear (c) Surprise (d) Sadness (e) Anger (f) Disgust (g) Neutral
Fig. 4 Seven Basic Facial Expressions

4. Mental Modeling
4.1 Approach
The Mental Dynamics, the mental transition caused by the internal and external environment of the robot, is extremely important in emotional expression. Therefore, in constructing the mental model, we considered the human brain to be a three-layered model consisting of reflex, emotion and intelligence, and we are approaching the mental model from the reflex. Moreover, we divided the emotion into "Learning System", "Mood" and "Dynamic Response" according to the working duration.
Furthermore, in order to realize bilateral interaction between human and robot, we based our research on A. H. Maslow's Hierarchy of Needs and introduced the Need Model consisting of the "Appetite", the "Need for Security", and the "Need for Exploration". Consequently, the robot can behave according to its needs.

Fig. 5 Brain Dynamics

4.2 Information Flow
WE-4R changes its mental state according to external and internal stimuli, and expresses its emotion using facial expressions, facial color and body movement. We introduced the information flow shown in Fig. 6 into the robot. There are two major flows: one is caused by the external environment, and the other is caused by the robot's internal state. Furthermore, we introduced the Robot Personality because each human has a different personality. The Robot Personality consists of the Sensing Personality and the Expression Personality. The need and the emotion form a two-layered structure, and the need is in a lower layer than the emotion because we considered the need to be nearer to instinct than the emotion. Furthermore, the need and the emotion affect each other through the Sensing Personality.

Fig. 6 Information Flow of the Mental Modeling

4.3 Personality and Learning System
The Robot Personality consists of the Sensing Personality and the Expression Personality. The former determines how a stimulus affects the mental state, and the latter determines how the robot expresses its emotion. These personalities can be assigned easily, so a wide variety of Robot Personalities can be obtained. Moreover, we introduced the "Learning System" so that the robot can learn from its experiences and dynamically construct its personality based on them.

4.4 Emotion Vector and Mood Vector
We adopted the 3D mental space shown in Fig. 7, which consists of a pleasantness axis, an activation axis and a certainty axis. The vector E, named the "Emotion Vector", expresses the mental state of WE-4. Furthermore, we newly introduced the "Mood Vector" M, which consists of a pleasantness axis and an activation axis.
The pleasantness component of the Mood Vector changes with the current mental state. To describe the activation component of the Mood Vector, we introduced an internal clock, which is a kind of autonomic nervous system.

4.5 Equations of Emotion
The Emotion Vector E is described by the Equations of Emotion when the robot senses stimuli. We considered that the mental dynamics, the transition of a human mental state, might be expressed by equations similar to the equation of motion. Therefore, we expanded the Equations of Emotion into a second-order differential equation modeled on the equation of motion. As a result, the robot can express the transient mental state after it senses stimuli from the environment, and we can obtain complex and varied mental trajectories.
Finally, we mapped out seven different emotions in the 3D mental space as in Fig. 8. WE-4 determines its emotion from the region through which the Emotion Vector passes.
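To make the analogy with the equation of motion concrete, the Python sketch below integrates a second-order equation of the assumed form m*E'' + gamma*E' + k*E = F on each axis of the mental space (pleasantness, activation, certainty). The scalar coefficients `m`, `gamma`, `k`, the constant stimulus `F`, and the time step are illustrative assumptions and not the robot's actual parameters; this page does not give the exact equations.

```python
def step(E, V, F, m, gamma, k, dt):
    """Advance the emotion vector E and its rate of change V by one time step.

    Assumed per-axis model: m*E'' + gamma*E' + k*E = F, integrated with
    the semi-implicit Euler method.
    """
    new_E, new_V = [], []
    for e, v, f in zip(E, V, F):
        accel = (f - gamma * v - k * e) / m
        v_next = v + accel * dt
        new_V.append(v_next)
        new_E.append(e + v_next * dt)
    return new_E, new_V


# Axes: (pleasantness, activation, certainty). All values are hypothetical.
E = [0.0, 0.0, 0.0]   # emotion vector starts at the origin
V = [0.0, 0.0, 0.0]   # and at rest
F = [1.0, 0.5, 0.0]   # constant stimulus applied from t = 0

peak = 0.0
for _ in range(3000):  # 30 s of simulated time at dt = 0.01 s
    E, V = step(E, V, F, m=1.0, gamma=1.0, k=1.0, dt=0.01)
    peak = max(peak, E[0])

print(peak, E)
```

With these underdamped coefficients the pleasantness component overshoots before settling at F/k, which is the kind of transient response a first-order model could not produce. Changing the coefficients per axis would give differently shaped trajectories through the mental space.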

Fig. 7 Mental Space

Fig. 8 Emotional Mapping

4.6 Need Model
Bilateral interaction is important for natural communication between human and robot. We considered that active behavior by the robot is necessary to realize bilateral interaction. Therefore, we introduced the Need Model into the robot mental model. The need state of the robot is described by the matrix N, named the "Need Matrix", which is updated by a first-order difference equation. Though the robot's need consists of the "Appetite", the "Need for Security" and the "Need for Exploration" in this study, the Need Matrix is expandable depending on the number of need factors.
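A first-order difference equation for the need state can be sketched in Python as below, assuming the simple form N[t+1] = N[t] + dN[t]. The page does not specify the exact update, so the form, the dictionary representation (standing in for the Need Matrix), and the numeric increments are assumptions for illustration.

```python
def update_need(need, delta):
    """One step of the assumed difference equation N[t+1] = N[t] + dN[t].

    `need` maps each need factor to its current level; factors missing
    from `delta` are left unchanged.
    """
    return {name: level + delta.get(name, 0.0) for name, level in need.items()}


# The three need factors used in this study, starting from zero.
need = {"appetite": 0.0, "security": 0.0, "exploration": 0.0}

# Hypothetical per-step increments produced by stimuli.
deltas = [
    {"appetite": 0.25},
    {"appetite": 0.25, "exploration": 0.5},
    {"security": 1.0, "exploration": -0.25},
]
for delta in deltas:
    need = update_need(need, delta)

print(need)
```

Because the update iterates over whatever factors are present, adding another need factor only requires adding a key, which reflects the expandability noted above.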

(1) Appetite
The appetite is based on the total consumed energy, which is the sum of the basal metabolism energy and the output energy. We considered that the basal metabolism energy is determined by the robot's emotional state, and that the output energy is determined by internal or external stimuli such as the total electric current.

(2) Need for Security
The need for security is a type of defense behavior. The defense reflex of withdrawal from strong stimuli is a similar reaction. However, the need for security generates defense behavior in response to long-term stimuli. When the robot senses dangerous stimuli from the environment for a long period, it can withdraw from them or express a defense behavior even if the stimuli are too weak to cause the defense reflex. We realized the Need for Security by learning the position and strength of the stimuli when the robot felt stimuli from the environment.

(3) Need for Exploration
When humans and animals encounter a new situation or a new object, they exhibit exploratory behavior out of curiosity because the need for exploration is high. We realized the need for exploration by learning the relation between the visual information and the target property.

(4) Behavior by Need
The robot can actively generate and express behavior based on its need in order to satisfy that need, and it continues to exhibit the same behavior until the need is satisfied as a result of the active behavior. We also considered the need to be one of the internal stimuli to the robot. By assigning the Sensing Personality for the need, the need affects the mental state.