🚩 Sequence of activities: 4 levels of machine autonomy

Equipment:

Thymio robots
1 PC per robot
Arena(s)

Strengths:

Global overview of robotics and AI
Clarifies the different concepts (AI ≠ programming; supervised learning ≠ reinforcement learning)

Duration:

3 hours

Age:

From age 10

Configurations:

The four activities use configurations that already exist in the software.

This educational sequence of four experiments gives a clear understanding of the different ways of controlling a machine, from remote control (the machine has no autonomy at all) to reinforcement learning (the machine learns "on its own"). We also show that, regardless of the level, humans remain in control of the machine, because only they can set a goal for it.

We will also clearly explain the difference between what is now called "Artificial Intelligence," namely systems that rely on machine learning (levels 3 and 4), and deterministic programming (level 2).
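The contrast between level 2 and level 3 can be illustrated with a minimal Python sketch. It is not taken from the software: the single "blocked" sensor value, the action names, and the majority-vote learner are all assumptions chosen to keep the example short. In the first function a human writes the rule; in the second, the rule is derived from examples.

```python
# Hypothetical sensor reading: 1 means "blocked", 0 means "unblocked".
# The action names are illustrative, not identifiers from the software.

def programmed_policy(blocked: int) -> str:
    """Level 2: a human writes the decision rule explicitly."""
    return "turn" if blocked else "forward"


def learn_policy(examples: list[tuple[int, str]]) -> dict[int, str]:
    """Level 3: derive the rule from labelled examples by majority vote."""
    counts: dict[int, dict[str, int]] = {}
    for blocked, action in examples:
        counts.setdefault(blocked, {})
        counts[blocked][action] = counts[blocked].get(action, 0) + 1
    # For each sensor value, keep the action that was demonstrated most often.
    return {s: max(c, key=c.get) for s, c in counts.items()}


# Demonstrations given by the user; the last one is a deliberate mistake.
demonstrations = [(0, "forward"), (0, "forward"), (1, "turn"), (1, "turn"), (0, "turn")]
learned = learn_policy(demonstrations)
```

Both approaches end up with the same behavior here, but only in the first does the human state the rule directly. In the second, the human supplies examples and the rule follows from them, so one mistaken example is outvoted rather than copied.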

The supporting slides provide additional instructions for handling and illustrate the four levels with summary diagrams.

These four steps are:

1. Remote control: the user loads the "Remote Control" configuration and then chooses the actions that the robot will perform. They also familiarize themselves with the software. The robot has no autonomy, as it is the human who chooses its actions at all times.

2. Programming: the user loads the "Manual editing - Obstacle avoidance" configuration and decides on the robot's decision rules, depending on whether its sensors report that it is blocked or unblocked. This is done by selecting the connections in what turns out to be a mini artificial neural network. The user sets the program and then presses the "Autonomous" button to test it. The "Reward" and "Level" displays show values that increase as the robot moves forward without getting stuck, so the activity can be presented as a game in which the goal is to maximize the level that the robot reaches. (NB: the "level" is the average of the "rewards" obtained in one minute; this concept of reward will be revisited in level 4, "Reinforcement Learning".) By doing this activity, the user has "programmed" their Thymio robot: the robot can then operate autonomously, but will always execute what has been programmed. The activity can be repeated with other configurations in the "Manual Editing" category, which increase in difficulty. The need for a robot to have sensors in order to be autonomous can also be emphasized.

3. Supervised Learning: users load the "Train your Thymio" configuration and train their robots to navigate a circuit. The main difference from the previous "programming" level is that users no longer directly set the calculation rules for analyzing sensor data; instead, they simply provide examples, from which the AI finds ("learns") on its own the rules it needs to use. See the Thymio the Fearful activity for more details on Supervised Learning.

4. Reinforcement Learning: users load the "Reinforcement Learning - Obstacle Avoidance" configuration and simply press the "Autonomous" button. The robots begin to learn, no longer from examples provided by humans, but from their own experiences, which they accumulate by conducting their own "explorations," i.e., through trial and error. The reward is displayed (along with the level, which is an average of the reward): it is this reward that the AI seeks to maximize. The reward is not part of the AI; it is computed by an auxiliary program that analyzes the robot's movements and blockages. Humans are therefore not absent from the loop, as they are the ones who decided how the reward is computed in order to train the AI to behave as they intended. More details on this level can also be found in the Arena activity.
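For readers who want to see how level 4 differs in code, here is a minimal sketch of learning by trial and error. It assumes tabular Q-learning with occasional random exploration, which is one common reinforcement learning method and not necessarily the one the software uses. The two-state toy world, the reward values, and the 60-step window standing in for "one minute" are all hypothetical. Note that the reward function sits outside the learner and is written by a human.

```python
import random
from collections import deque

ACTIONS = ["forward", "turn"]
CLEAR, BLOCKED = 0, 1


def reward_function(state: int, action: str) -> float:
    """Human-chosen reward, separate from the learner (values are assumptions):
    moving forward freely is good, pushing into an obstacle is bad."""
    if action == "forward":
        return 1.0 if state == CLEAR else -1.0
    return 0.0


def step(state: int, action: str, rng: random.Random) -> int:
    """Toy world: turning always frees the robot; driving forward from a clear
    position meets an obstacle 20% of the time; pushing into an obstacle
    leaves the robot blocked."""
    if action == "turn":
        return CLEAR
    if state == BLOCKED:
        return BLOCKED
    return BLOCKED if rng.random() < 0.2 else CLEAR


def train(steps: int = 5000, alpha: float = 0.1, gamma: float = 0.9,
          epsilon: float = 0.1, seed: int = 0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (CLEAR, BLOCKED) for a in ACTIONS}
    recent = deque(maxlen=60)  # stand-in for "rewards obtained in one minute"
    state = CLEAR
    for _ in range(steps):
        # Trial and error: explore occasionally, otherwise pick the best-known action.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = reward_function(state, action)
        next_state = step(state, action, rng)
        # Q-learning update: move the estimate toward reward + discounted future value.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        recent.append(reward)
        state = next_state
    policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in (CLEAR, BLOCKED)}
    level = sum(recent) / len(recent)  # average of the most recent rewards
    return policy, level


policy, level = train()
```

Nobody tells the learner to turn when blocked: it receives only the reward. Changing `reward_function` changes what the learner ends up doing, which is the sense in which humans stay in the loop.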
