Accelerating the Rat's Learning Speed

 In 2002, we succeeded in training rats to push the levers on the robot to obtain food. Learning a new behavior that is not part of an animal's natural repertoire is very hard. In that earlier experiment, therefore, an experimenter taught the behavior by pushing the levers and delivering a food pellet in front of the rats, and we considered that the rats might thereby learn the robot's functions. This learning process is very interesting.
 We therefore tried to train rats to perform this behavior, pushing the levers to obtain food, through autonomous demonstration by the robot rather than by an experimenter. To that end, we developed a new operation generation algorithm that enables the robot to demonstrate its functions autonomously. We called it the "Accelerating the Rat's Learning Speed Algorithm".

Accelerating the Rat's Learning Speed Algorithm

 It is difficult to train animals to perform complex behavior that would rarely occur spontaneously. However, such behavior can be trained by dividing the learning process into small steps and increasing their difficulty step by step. This is called "shaping", a method of operant conditioning proposed by Skinner. The "Accelerating the Rat's Learning Speed Algorithm" is an operational pattern generation algorithm for the robot based on the concept of shaping.
 We therefore divided the learning process into three steps (Levels 1 to 3). For each step we defined a local target behavior and constructed the robot's operational pattern so as to increase the chance that the rat spontaneously performs that behavior.

Level 1: Reinforcement of the rat's motivation

Fig. 1 Providing food

  Rats rarely move around in a situation they have never experienced before, because of their natural caution.
  The local target behavior of step 1 is active movement, the simplest kind of behavior. In this step, the food feeder regularly releases a food pellet. In our previous experiment, rats that had obtained food in the experimental field moved more actively than those that had not. We therefore believe that this regular feeding is effective in reinforcing the rat’s motivation to move. When the rat’s total movement distance exceeds a threshold value, this step ends and the next step begins.
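
  The Level 1 exit condition can be sketched as follows. This is only a minimal illustration, assuming the rat's position is available as (x, y) coordinates sampled over time; the function names, the units, and the threshold value are hypothetical and are not taken from the experiment.

```python
import math


def total_movement_distance(positions):
    """Sum the straight-line distances between consecutive tracked positions."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def level1_finished(positions, distance_threshold):
    """Level 1 ends once the cumulative distance exceeds the threshold."""
    return total_movement_distance(positions) > distance_threshold


# Hypothetical track: the rat moves 3 units, then 4 units.
track = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
print(total_movement_distance(track))                   # 7.0
print(level1_finished(track, distance_threshold=5.0))   # True
```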

Level 2: Conditioning the rat to approach WM-6

Fig. 2 Detection of the rat's approach

   The local target behavior of step 2 is approaching the robot. To attract the rat’s interest in WM-6, the robot routinely moves to the front of the food feeder, and the food feeder releases a food pellet at the moment the robot arrives there. We believe the rat learns the relationship between the robot and feeding through these routine movements and feedings. The rat is then expected to become interested in WM-6 and hence to approach it.
  When the rat’s approach to WM-6 is detected, the robot moves to the front of the food feeder. To reinforce the rat’s approach to the robot, the food feeder then releases a pellet. After that, the robot returns to its home position.
  The rat’s approach to the robot is detected through image processing. When the number of detected and reinforced approaches exceeds a threshold value, this step ends and the next step begins.
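
  One way to count approaches from tracked positions is sketched below. The text states only that approaches are detected by image processing; treating an approach as the rat entering a circular detection area around the robot, keeping the robot position fixed, and the function name itself are all assumptions made for illustration.

```python
import math


def count_approaches(rat_positions, robot_position, radius):
    """Count how many times the rat enters the detection circle around the robot."""
    approaches = 0
    inside = False
    for position in rat_positions:
        now_inside = math.dist(position, robot_position) <= radius
        # Count only the transition from outside to inside, so that a rat
        # lingering near the robot is not counted on every frame.
        if now_inside and not inside:
            approaches += 1
        inside = now_inside
    return approaches


# Hypothetical track: the rat enters the area, stays, leaves, and enters again.
rat_track = [(5.0, 0.0), (0.5, 0.0), (0.2, 0.0), (3.0, 0.0), (0.0, 0.9)]
print(count_approaches(rat_track, robot_position=(0.0, 0.0), radius=1.0))  # 2
```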

Level 3: Conditioning the rat to push the levers

Fig. 3 Narrowing the

 The local target behavior of step 3, the final step, is pushing the levers on WM-6. At the beginning of this step, when the rat approaches the robot, the robot moves to the front of the food feeder and the food feeder then releases a pellet. After enough reinforcements, the radius of the approach detection area, rd, is reduced every time the rat approaches the robot. We believe that the rat will then come closer and closer to the robot, and it is expected that the rat will occasionally push the levers on the robot.
  When the rat pushes the levers on WM-6, the robot moves to the front of the food feeder and the food feeder releases a pellet. In this way, it is possible to train the rat to push the levers on the robot. When the number of lever pushes exceeds a threshold value, this step ends. At that point we consider that the rat has learned to push the levers to obtain food.
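
  The three levels can be summarized as a small state machine, sketched below under stated assumptions. The class name, all parameter values, the multiplicative reduction of rd, its lower bound, and the choice to reward lever pushes only in Level 3 are hypothetical; the text specifies only the thresholds and that rd is reduced on each approach after enough reinforcements.

```python
from dataclasses import dataclass


@dataclass
class ShapingController:
    """Three-level shaping schedule; all parameter values are hypothetical."""
    distance_threshold: float   # Level 1 exit: total movement distance
    approach_threshold: int     # Level 2 exit: reinforced approaches
    push_threshold: int         # Level 3 exit: lever pushes
    radius: float               # rd, radius of the approach detection area
    warmup: int = 2             # Level 3 approaches rewarded before rd shrinks
    shrink: float = 0.8         # assumed multiplicative reduction of rd
    min_radius: float = 0.05    # assumed lower bound for rd
    level: int = 1
    distance: float = 0.0
    approaches: int = 0
    level3_approaches: int = 0
    pushes: int = 0
    finished: bool = False

    def on_movement(self, step_length: float) -> None:
        """Accumulate movement; Level 1 ends when the threshold is exceeded."""
        self.distance += step_length
        if self.level == 1 and self.distance > self.distance_threshold:
            self.level = 2

    def on_approach(self) -> bool:
        """Handle a detected approach; return True if a pellet is released."""
        if self.finished or self.level == 1:
            return False
        if self.level == 2:
            self.approaches += 1
            if self.approaches > self.approach_threshold:
                self.level = 3
            return True
        # Level 3: reward the approach, then narrow rd after the warm-up.
        self.level3_approaches += 1
        if self.level3_approaches > self.warmup:
            self.radius = max(self.min_radius, self.radius * self.shrink)
        return True

    def on_lever_push(self) -> bool:
        """Handle a lever push; return True if a pellet is released."""
        if self.finished or self.level != 3:
            return False
        self.pushes += 1
        if self.pushes > self.push_threshold:
            self.finished = True
        return True


# Hypothetical session with small thresholds.
ctrl = ShapingController(distance_threshold=10.0, approach_threshold=1,
                         push_threshold=1, radius=1.0)
ctrl.on_movement(11.0)          # exceeds the distance threshold: Level 2
ctrl.on_approach()
ctrl.on_approach()              # second approach exceeds the threshold: Level 3
for _ in range(3):
    ctrl.on_approach()          # the third Level 3 approach shrinks rd once
ctrl.on_lever_push()
ctrl.on_lever_push()            # second push exceeds the threshold: finished
print(ctrl.level, round(ctrl.radius, 2), ctrl.finished)  # 3 0.8 True
```

  Keeping the level transitions in one object makes the thresholds explicit and easy to tune per animal, which is the part of the schedule the text leaves open.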

Evaluation Experiment

 We conducted an experiment to evaluate the Accelerating the Rat’s Learning Speed Algorithm. The experiment consisted of five trials using five rats. The rats were male albino rats with no prior experimental experience, housed individually in breeding cages. The algorithm was applied to three of them (the experimental group: Rats 1 to 3) and not to the other two (the control group: Rats 4 and 5).
  In each trial, we released the rat into the experimental field and started the trial. We ended the trial when the rat had learned to push the levers to obtain food.

Results

 In this experiment, the rats in the experimental group learned to push the levers on WM-6 to obtain food. The learning processes of these three rats are shown in Figs. 4 to 6. Each graph shows the rat's total movement distance and the cumulative numbers of approaches and lever pushes.
  In contrast, the rats in the control group did not learn to push the levers on WM-6.
  Therefore, using the Accelerating the Rat’s Learning Speed Algorithm, we succeeded in teaching rats a new behavior, pushing the levers on the robot, through the robot's autonomous demonstration.

Fig. 4 Cumulative Record of Rat 1

Fig. 5 Cumulative Record of Rat 2

Fig. 6 Cumulative Record of Rat 3