Introduction to the project

Artificial intelligence is being used more and more, and it keeps getting more powerful. So I had the idea of using it to improve the security, the comfort and the speed with which a company detects its employees. Nowadays, the arrival of employees in the morning looks more or less like this:

My project is simple: make it look more like this:

I started from the example of Amazon Go, which lets you do your shopping without having to physically go through a checkout. I then told myself that it should be possible to detect the presence of an employee, and their behavior (for their safety), without having to go through a physical barrier or a human check. The idea is to detect the faces of the employees, and their gestures, in order to identify them, with an AI matching that data against a database.

Face and gesture recognition

Face recognition

The first part of the project is to recognize the faces of employees. An ml5.js tool already exists that can recognize a subject in a video stream. This example uses MobileNet and ml5.js, and it is available here. The code below shows how an object can be detected by asking the model for a prediction, which is drawn from the database of images it was trained on.
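As a minimal sketch of what can be done with those predictions, the snippet below picks the most confident label out of the classifier's results. It assumes the results arrive as an array of { label, confidence } objects, which is the shape recent ml5.js releases use (older ones may name these fields differently). The pickTopLabel function and its minConfidence threshold are illustrative names, not part of ml5.js.

```javascript
// Sketch: choose a label from image classifier results.
// Assumption: results look like [{ label, confidence }, ...].
// pickTopLabel and minConfidence are hypothetical names, not part of ml5.js.
function pickTopLabel(results, minConfidence = 0.6) {
  if (!Array.isArray(results) || results.length === 0) {
    return null;
  }
  // Do not rely on the input order: look for the highest confidence ourselves.
  const best = results.reduce((a, b) => (b.confidence > a.confidence ? b : a));
  // Below the threshold we prefer "unknown" (null) to a wrong guess.
  return best.confidence >= minConfidence ? best.label : null;
}

// In the browser, with ml5.js loaded and a <video> element (assumed setup;
// the exact callback signature depends on the ml5.js version):
//   const classifier = ml5.imageClassifier('MobileNet', video, () => {
//     classifier.classify((error, results) => {
//       if (!error) console.log(pickTopLabel(results));
//     });
//   });
```

Returning null under the threshold matters for this use case: it is better to send an unrecognized person to a human check than to label them as the wrong employee.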

To be useful in a business, we have to move beyond the pretrained MobileNet, which only classifies common images, and train our AI to recognize the employees of the company. Let's take the example of a company of 3 people. As TowardsDataScience.com puts it: "To train your artificial intelligence model, you need a collection of images called a dataset. A dataset contains hundreds to thousands of sample images of objects you want your artificial intelligence model to recognize."

Say there are 3 employees: Lila, Marc and Rob. We then have to do the following:

In practice, there are 900 pictures of each employee, which serve as the raw data for training, and 200 pictures which are used to test the AI's recognition. Then, if you want to know how to code it, there is a complete Python tutorial here which will allow you to train your own artificial intelligence, so you can connect it to the video recognition tool.
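To make the split concrete, here is a rough sketch that separates each employee's pictures into a training set and a test set. It assumes 1100 pictures per employee (900 to train, 200 to test); the text does not say whether the 200 test pictures are per employee, so that is my reading. The splitDataset function and the file names are placeholders for illustration.

```javascript
// Sketch: split one employee's pictures into a training set and a test set.
// splitDataset and the file names below are hypothetical.
function splitDataset(files, trainCount) {
  if (trainCount > files.length) {
    throw new RangeError('trainCount is larger than the number of files');
  }
  return {
    train: files.slice(0, trainCount), // pictures used to train the model
    test: files.slice(trainCount),     // pictures kept aside to test it
  };
}

const employees = ['Lila', 'Marc', 'Rob'];
const dataset = {};
for (const name of employees) {
  // Assumption: 1100 pictures per employee, 900 to train and 200 to test.
  const files = Array.from({ length: 1100 }, (_, i) => `${name}/img_${i}.jpg`);
  dataset[name] = splitDataset(files, 900);
}
```

With pictures taken in a quick burst, neighboring shots are nearly identical, so it is probably worth shuffling the files (or shooting the test pictures separately) before splitting.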

Even if 900 pictures seems like a lot, with a simple iPhone you can take more than 100 pictures in about 10 seconds, as you can see below:

Gesture recognition

To go further, it would be interesting to improve the accuracy of the detection by adding an analysis of each employee's gestures to their dataset. With TensorFlow.js and PoseNet, it is possible to analyze the gestures of an individual through 17 keypoints, as you can see below (and see here):

The first step is to load both the TensorFlow.js and PoseNet libraries, following the code below:

Then there are two choices. Either you detect only one person at a time:

And here is the related algorithm for detecting one person only:
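As a hedged sketch of the single-person case, the function below estimates one pose and keeps only the keypoints the model is reasonably sure about. It assumes the 2.x version of the PoseNet package, where `net` is the model returned by `posenet.load()` and `estimateSinglePose` takes a config object. The names detectOnePerson and minScore are illustrative.

```javascript
// Sketch: estimate a single pose and keep only the confident keypoints.
// Assumptions: `net` comes from posenet.load() (PoseNet 2.x), and `image`
// is an <img>, <video> or <canvas> element.
// detectOnePerson and minScore are hypothetical names.
async function detectOnePerson(net, image, minScore = 0.5) {
  const pose = await net.estimateSinglePose(image, { flipHorizontal: false });
  // A pose looks like:
  //   { score, keypoints: [{ part, score, position: { x, y } }, ...] }
  const keypoints = pose.keypoints.filter((k) => k.score >= minScore);
  return { score: pose.score, keypoints };
}

// Usage in the browser (assumed setup):
//   const net = await posenet.load();
//   const person = await detectOnePerson(net, document.getElementById('video'));
```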

Or you detect several people at the same time:

And here is the related algorithm for several people:
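For the multi-person case, a comparable sketch could look like this. It again assumes PoseNet 2.x, where `estimateMultiplePoses` accepts `flipHorizontal`, `maxDetections`, `scoreThreshold` and `nmsRadius`. The values chosen here, and the names detectSeveralPeople and maxPeople, are illustrative assumptions rather than recommended settings.

```javascript
// Sketch: estimate several poses in one frame, most confident first.
// Assumption: `net` comes from posenet.load() (PoseNet 2.x).
// detectSeveralPeople and maxPeople are hypothetical names.
async function detectSeveralPeople(net, image, maxPeople = 5) {
  const poses = await net.estimateMultiplePoses(image, {
    flipHorizontal: false,
    maxDetections: maxPeople, // upper bound on the number of people returned
    scoreThreshold: 0.5,      // ignore detections the model is unsure about
    nmsRadius: 20,            // minimum distance in pixels between two detections
  });
  // Copy before sorting so the array returned by the model is not mutated.
  return poses.slice().sort((a, b) => b.score - a.score);
}
```

Multi-person estimation is the one that fits an office entrance, where several employees can arrive in the same frame.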

The aim is to collect this data and connect it to our dataset, to make the detection of employees more accurate. Of course, it will be essential to train the AI on the employees' gestures, by filming them for example. As far as I am concerned, I do not have the skills (yet) to do so. But any suggestion will be appreciated in our suggestion box.
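I have not built this part, but one possible first step, sketched below under my own assumptions, is to turn each PoseNet pose into a fixed-length list of numbers that could be stored next to an employee's pictures. It uses the 17 part names PoseNet reports and scales the coordinates by the frame size. The poseToFeatures function is a hypothetical helper, and whether such features actually help identification is untested here.

```javascript
// The 17 keypoint names reported by PoseNet, in a fixed order.
const POSE_PARTS = [
  'nose', 'leftEye', 'rightEye', 'leftEar', 'rightEar',
  'leftShoulder', 'rightShoulder', 'leftElbow', 'rightElbow',
  'leftWrist', 'rightWrist', 'leftHip', 'rightHip',
  'leftKnee', 'rightKnee', 'leftAnkle', 'rightAnkle',
];

// Sketch: flatten a pose into 34 numbers (x, y for each part), scaled to
// the 0..1 range by the frame size. Parts missing from the pose become 0, 0.
// poseToFeatures is a hypothetical helper, not part of PoseNet.
function poseToFeatures(pose, frameWidth, frameHeight) {
  const byPart = new Map(pose.keypoints.map((k) => [k.part, k]));
  const features = [];
  for (const part of POSE_PARTS) {
    const k = byPart.get(part);
    features.push(k ? k.position.x / frameWidth : 0);
    features.push(k ? k.position.y / frameHeight : 0);
  }
  return features;
}
```

Scaling by the frame size keeps the numbers comparable between cameras with different resolutions.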

To go further

The applications of this kind of AI are numerous: security, airport safety, control... In the video below, you will see that facial (and gesture) recognition opens up a lot of opportunities, but there are still issues:

And as the following video shows, it looks like the police and the Chinese state have already developed a similar technology to monitor their citizens. So, do we have to be afraid, or is it just the future?