System-Learning
Accelerating the Rat's Learning Speed
          
In 2002, we succeeded in training rats to push the levers on the robot to obtain food. It is very hard for animals to learn a new behavior that is not part of their natural repertoire. In our previous experiment, therefore, an experimenter taught the behavior by pushing the levers and delivering a food pellet in front of the rats, and we considered that the rats might then learn the robot's functions. This learning process is very interesting, so we next tried to train the rats to push the levers to obtain food through autonomous demonstration by the robot itself. To this end, we developed a new operation generation algorithm that enables the robot to demonstrate its functions autonomously. We called this algorithm the "Accelerating the Rat's Learning Speed Algorithm".
Accelerating the Rat's Learning Speed Algorithm
          
It is difficult to train animals to perform complex behavior that would rarely occur spontaneously. However, such behavior can be trained by dividing the learning process into small steps and increasing their difficulty step by step. This is called "shaping", a method of operant conditioning proposed by Skinner. The "Accelerating the Rat's Learning Speed Algorithm" is an operational pattern generation algorithm for the robot based on the concept of shaping. We therefore divided the learning process into three steps. We then defined the local target of each step and constructed the robot's operational patterns to increase the chances that the rats would spontaneously perform the local target behavior.
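
The three-step structure can be pictured as a small state machine that advances whenever the local target of the current step has been met. The following Python fragment is only a sketch under stated assumptions: the names `Level`, `Thresholds`, `Progress` and `next_level`, and all numeric threshold values, are hypothetical and are not taken from the original system, which only states that each step ends when "a threshold value" is exceeded.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    MOTIVATION = 1   # Level 1: reinforce active movement
    APPROACH = 2     # Level 2: reinforce approaching the robot
    LEVER_PUSH = 3   # Level 3: reinforce pushing the levers
    DONE = 4         # training finished


@dataclass
class Thresholds:
    # Hypothetical values; the source does not give the actual thresholds.
    distance_m: float = 10.0
    approaches: int = 20
    lever_pushes: int = 10


@dataclass
class Progress:
    distance_m: float = 0.0   # total movement distance of the rat
    approaches: int = 0       # detected and reinforced approaches
    lever_pushes: int = 0     # reinforced lever pushes


def next_level(level: Level, progress: Progress, th: Thresholds) -> Level:
    """Advance one level once the current local target exceeds its threshold."""
    if level is Level.MOTIVATION and progress.distance_m > th.distance_m:
        return Level.APPROACH
    if level is Level.APPROACH and progress.approaches > th.approaches:
        return Level.LEVER_PUSH
    if level is Level.LEVER_PUSH and progress.lever_pushes > th.lever_pushes:
        return Level.DONE
    return level
```

Keeping the transition rule separate from the robot's motion commands means each level's operational pattern can be changed without touching the criterion that ends it.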
Level 1: Reinforcement of the rat's motivation
          
           
            Fig. 1 Providing a food
 
Because of their natural sense of caution, rats rarely move in a situation they have never experienced before.

The local target behavior of step 1 is active movement, the simplest kind of behavior. In this step, the food feeder routinely releases a food pellet. In our previous experiment, the rats that had obtained food in the experimental field moved more actively than those that had not. We therefore believe that these constant feedings are effective in reinforcing the rat’s motivation to move. When the total movement distance of the rat exceeds a threshold value, this step is finished and the next step is started.
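
As an illustration of this step, the sketch below releases a pellet at a fixed interval regardless of the rat's behavior and accumulates the distance travelled until the threshold is exceeded. It assumes the rat's position is available once per frame as an (x, y) pair; the function name `run_level1`, the frame-based feeding interval and the units are hypothetical and not taken from the original system.

```python
import math


def run_level1(track, feed_interval, distance_threshold):
    """Simulate Level 1 on a recorded track of (x, y) positions.

    Returns (frame_index, pellets_released) at the frame where the total
    movement distance first exceeds the threshold, or None if it never does.
    """
    total = 0.0
    pellets = 0
    prev = None
    for frame, pos in enumerate(track):
        if frame % feed_interval == 0:
            pellets += 1          # routine feeding, independent of behavior
        if prev is not None:
            total += math.dist(prev, pos)   # straight-line step length
        prev = pos
        if total > distance_threshold:
            return frame, pellets           # Level 1 finished
    return None
```

For example, `run_level1([(0, 0), (3, 4), (3, 4), (6, 8)], feed_interval=2, distance_threshold=9.0)` returns `(3, 2)`: the rat has moved 10 units by frame 3 and two pellets have been released.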
Level 2: Conditioning the rat to approach WM-6
          
           
            Fig. 2 Detection of the rat's approach
  
The local target behavior of step 2 is approaching the robot. To attract the rat’s interest in WM-6, the robot routinely moves to the front of the food feeder, and the food feeder releases a food pellet at the moment the robot arrives there. We believe the rat learns the relationship between the robot and the feedings through these routine movements and feedings. The rat is then expected to become interested in WM-6 and hence approach it.
             
              When the rat’s approach to WM-6 is detected, the robot
              moves to the front of the food feeder. To reinforce the
              approach of the rat to the robot, the food feeder then
              releases a pellet. After that the robot returns to the
              home position. 
             
The rat’s approach to the robot is detected through image processing. When the number of detected and reinforced approaches exceeds a threshold value, this step is finished and the next step is started.
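
One simple way to turn tracked positions into approach events is to count each entry of the rat into a circular area around the robot, which matches the "approach detection area" with a radius described for Level 3. The sketch below makes several assumptions that are not in the source: positions come from the image processing as (x, y) pairs, the robot is treated as stationary at `robot_pos`, and the function name `count_approaches` is hypothetical.

```python
import math


def count_approaches(rat_track, robot_pos, radius):
    """Count entries of the rat into the circular detection area.

    An approach is counted only on the transition from outside to inside,
    so a rat that stays near the robot is not counted on every frame.
    """
    count = 0
    inside = False
    for pos in rat_track:
        now_inside = math.dist(pos, robot_pos) <= radius
        if now_inside and not inside:
            count += 1   # each entry would trigger one reinforcement
        inside = now_inside
    return count
```

Counting transitions rather than frames keeps one visit from being reinforced many times, which would otherwise let the threshold be reached without repeated approaches.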
Level 3: Conditioning the rat to push the levers
          
           
            Fig. 3 Narrowing the
The local target behavior of step 3, the final step, is pushing the levers on WM-6. At the beginning of this step, when the rat approaches the robot, the robot moves to the front of the food feeder and the food feeder then releases a pellet. After enough reinforcements, rd (the radius of the approach detection area) is reduced every time the rat approaches the robot. We believe that the rat will then approach the robot at close quarters, and it is expected that the rat will occasionally push the levers on the robot.
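
The narrowing of rd can be sketched as a schedule: the radius stays constant for an initial number of reinforced approaches and then shrinks after every further approach. The source does not say how much rd is reduced each time or whether it has a lower limit, so the multiplicative factor `shrink`, the floor `rd_min`, the `warmup` count and the function name `radius_schedule` below are all hypothetical.

```python
def radius_schedule(rd_start, n_approaches, warmup, shrink=0.9, rd_min=0.05):
    """Return the detection radius in effect at each successive approach.

    The radius is held at rd_start until `warmup` approaches have been
    reinforced, then multiplied by `shrink` after every approach, never
    dropping below rd_min.
    """
    radii = []
    rd = rd_start
    for i in range(n_approaches):
        radii.append(rd)                   # radius used to detect approach i
        if i + 1 >= warmup:                # enough reinforcements so far
            rd = max(rd_min, rd * shrink)  # narrow the detection area
    return radii
```

With `radius_schedule(1.0, 5, warmup=2, shrink=0.5, rd_min=0.2)` the radii are `[1.0, 1.0, 0.5, 0.25, 0.2]`, so the rat has to come progressively closer to the robot to earn a pellet.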
             
When the rat pushes the levers on WM-6, the robot moves to the front of the food feeder and the food feeder releases a pellet. In this way, it is possible to train the rat to push the levers on the robot. When the number of lever pushes exceeds a threshold value, this step is finished. At that point, we consider that the rat has learned to push the levers to obtain food.
Evaluation Experiment
          
We conducted an experiment to evaluate the Accelerating the Rat's Learning Speed Algorithm. The experiment consisted of five trials using five rats. The rats were male albino rats without any experimental experience, housed individually in breeding cages. The algorithm was applied to three of them (the experimental group: Rats 1 to 3) and not to the other two (the control group: Rats 4 and 5).
             
We released the rat into the experimental field and started a trial. When the rat had learned to push the levers to obtain food, we finished the trial.
Results
In this experiment, the rats in the experimental group learned to push the levers on WM-6 to obtain food. The learning processes of these three rats are shown in Figs. 4 to 6. Each graph shows the rat's total movement distance and the cumulative numbers of approaches and lever pushes.
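
A cumulative record of this kind can be derived from a list of event timestamps by counting, for each sample time, how many events have occurred so far. The sketch below is an illustration only: the function name `cumulative_record` and the assumption that approaches and lever pushes are logged as timestamps are hypothetical, and no data from the experiment is reproduced here.

```python
from bisect import bisect_right


def cumulative_record(event_times, sample_times):
    """Number of events that occurred at or before each sample time."""
    ordered = sorted(event_times)
    # bisect_right gives the count of events with time <= t
    return [bisect_right(ordered, t) for t in sample_times]
```

For instance, `cumulative_record([3.0, 1.0, 7.5], [0, 1, 5, 10])` gives `[0, 1, 2, 3]`. A steeper slope in such a record corresponds to a higher rate of the recorded behavior.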
             
On the other hand, the rats in the control group did not learn to push the levers on WM-6.
             
Thus, using the Accelerating the Rat's Learning Speed Algorithm, we succeeded in training rats to perform a new behavior, "pushing the levers on the robot", through autonomous demonstration by the robot.
 
Fig. 4 Cumulative Record of Rat 1

Fig. 5 Cumulative Record of Rat 2

Fig. 6 Cumulative Record of Rat 3
 
 