Face recognition technology is becoming ever more widely used. The best-known applications are authentication and searching for specific people (criminals, for example), but we also encounter it in daily life. When you upload photos, Facebook finds your friends in them and offers to tag them. Google's photo service likewise finds people in your photos and can show a collection of pictures of a specific person.

The development of biometric authentication systems is a logical consequence of the development of technology and society. Authentication by login and password no longer meets modern reliability requirements and is inconvenient for the user, who has to remember or store the login and password.

The face recognition process can be divided into the following steps:

  1. Detecting the face in the image (Haar cascade classifier, histogram of oriented gradients, convolutional neural network, etc.);
  2. Locating key points on the face (landmarks);
  3. Extracting unique facial features and recording them in a form that is simple for the system to work with (for example, building a 128-dimensional vector, or embedding);
  4. Comparing the result with the database.
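
To make the last step concrete, here is a minimal sketch of comparing an embedding with a database using Euclidean distance. The function names, the toy vectors, and the 0.6 threshold are assumptions for illustration (0.6 is the value commonly suggested for Dlib's face recognition model); a real system would obtain the embeddings from a trained model and tune the threshold on its own data.

```python
import math


def euclidean_distance(a, b):
    """Distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(probe, database, threshold=0.6):
    """Return the name of the closest enrolled embedding,
    or None if nothing is within `threshold` (assumed value)."""
    best_name, best_dist = None, float("inf")
    for name, embedding in database.items():
        dist = euclidean_distance(probe, embedding)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None


# Toy 128-dimensional vectors standing in for real embeddings.
database = {"alice": [0.0] * 128, "bob": [0.1] * 128}
close_to_alice = [0.01] * 128
stranger = [1.0] * 128

match = identify(close_to_alice, database)
no_match = identify(stranger, database)
```

Here `match` is the name of the nearest enrolled person, while `no_match` is None because the stranger's vector is far from every entry.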

The most popular Python libraries for performing these steps are OpenCV and Dlib.

For the accuracy of facial recognition systems, you can refer to data from the National Institute of Standards and Technology (NIST), part of the US Department of Commerce. The systems are tested by qualified specialists on closed datasets. Two types of errors are distinguished when evaluating face recognition algorithms:

  • Type I errors (FAR, false acceptance rate, or FMR, false match rate) - the probability that the system will let through a person who is not in the database ("incorrectly accepted");
  • Type II errors (FRR, false rejection rate, or FNMR, false non-match rate) - the probability that the system will deny access to a person who is in the database ("wrongly denied").
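
As a rough illustration of how the two rates are measured, the sketch below counts errors over a set of comparison scores. It assumes a distance-based matcher that accepts a pair when the distance is at or below a threshold; the function name and the sample distances are made up for the example.

```python
def error_rates(genuine_distances, impostor_distances, threshold):
    """Return (FMR, FNMR) for a matcher that accepts distance <= threshold.

    genuine_distances: comparisons between two images of the same person.
    impostor_distances: comparisons between images of different people.
    """
    false_matches = sum(d <= threshold for d in impostor_distances)
    false_non_matches = sum(d > threshold for d in genuine_distances)
    return (false_matches / len(impostor_distances),
            false_non_matches / len(genuine_distances))


# Made-up distances, for illustration only.
genuine = [0.30, 0.35, 0.42, 0.50, 0.65]
impostor = [0.55, 0.70, 0.80, 0.90, 1.10]

fmr, fnmr = error_rates(genuine, impostor, threshold=0.6)
```

With these sample values one impostor pair out of five is accepted and one genuine pair out of five is rejected, so both rates come out at 0.2.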

At the time of writing, the lowest type II error rate (FNMR) in the “Wild photo” category (shots from almost any side, with the face rotated up to 90 degrees away from the camera and varied lighting) is about 3% at a type I error rate (FMR) of 0.001%. Under normal shooting conditions (“Visa-Border” in NIST reports, comparable to shooting at an airport: varied lighting, slightly different angles, people of different nationalities), which are closest to the conditions during authentication, the figures are 0.42% and 0.0001% respectively.
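
To get a feel for what these percentages mean in practice, the short calculation below converts the “Visa-Border” figures into expected error counts. The attempt counts are arbitrary assumptions chosen for readability.

```python
def expected_errors(rate_percent, attempts):
    """Expected number of errors for a rate given in percent."""
    return rate_percent / 100 * attempts


# FNMR of 0.42%: roughly 4.2 wrongly denied users per 1,000 genuine attempts.
wrongly_denied = expected_errors(0.42, 1_000)

# FMR of 0.0001%: roughly 1 wrongly accepted person per 1,000,000 impostor attempts.
wrongly_accepted = expected_errors(0.0001, 1_000_000)
```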

The algorithms can be tuned to reduce errors of one type at the cost of more errors of the other, depending on the business model: which is more critical, granting access to an unauthorized person or denying access to an authorized one.
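
This trade-off can be sketched as a threshold sweep. The example assumes a distance-based matcher that accepts a pair when the distance is at or below the threshold; the helper names, the candidate thresholds, and the sample distances are all hypothetical.

```python
def rates_at(genuine, impostor, threshold):
    """Return (FMR, FNMR) when pairs with distance <= threshold are accepted."""
    fmr = sum(d <= threshold for d in impostor) / len(impostor)
    fnmr = sum(d > threshold for d in genuine) / len(genuine)
    return fmr, fnmr


def loosest_threshold_within(genuine, impostor, thresholds, max_fmr):
    """Pick the largest threshold whose FMR stays within `max_fmr`.

    A larger threshold rejects fewer genuine users, so this keeps FNMR as
    low as possible without exceeding the allowed false match rate.
    Returns None if no candidate threshold satisfies the limit.
    """
    allowed = [t for t in thresholds
               if rates_at(genuine, impostor, t)[0] <= max_fmr]
    return max(allowed) if allowed else None


# Made-up distances, for illustration only.
genuine = [0.30, 0.35, 0.42, 0.50, 0.65]
impostor = [0.55, 0.70, 0.80, 0.90, 1.10]
candidates = [0.4, 0.5, 0.6, 0.7, 0.8]

# Security-critical setting: no false matches tolerated on this sample.
strict = loosest_threshold_within(genuine, impostor, candidates, max_fmr=0.0)
# Convenience-oriented setting: up to 40% false matches tolerated.
lenient = loosest_threshold_within(genuine, impostor, candidates, max_fmr=0.4)
```

On this sample the strict setting still rejects one genuine user in five, while the lenient setting rejects none but accepts two impostor pairs in five.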

This raises the question of how reliable the authentication method is. There are well-known cases where an algorithm was tricked with a user’s photo or a recorded video (spoofing). Technology in this area is developing rapidly, and there are now many methods for determining that a live person is in front of the camera (liveness detection): analyzing the finest details of the structure of the face, tracking micro-movements of muscles, using an additional infrared camera, building a 3D model of the face, etc.

Much attention is paid to authentication using video. A video is a sequence of frames (images), so the same recognition algorithm described at the beginning of the article is used. The additional verification methods can be classified as follows:

  1. Without user involvement: the camera (possibly together with other cameras and sensors) reads all the necessary information;
  2. With user involvement: to confirm authentication, the user needs to follow certain instructions (smile, turn their head).

The second method is more reliable, since different instructions can be given at different times, which makes it almost impossible for an attacker to prepare a matching video in advance. It also needs no additional equipment (extra cameras and sensors) and therefore no installation or configuration of such equipment, which makes the system simpler and more affordable without degrading its performance. To implement this authentication method, you need software that tracks the user's face throughout the process (to rule out photo substitution).
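
The idea of giving different instructions at different times could look roughly like the sketch below. The list of instructions, the function names, and the assumption that some separate component reports which actions were observed in the video are all hypothetical; only the "smile" and "turn head" instructions come from the text above.

```python
import secrets

# Hypothetical pool of instructions the system can ask for.
CHALLENGES = ("smile", "turn head left", "turn head right", "blink twice")


def issue_challenge(count=2):
    """Pick `count` distinct instructions in an unpredictable order."""
    if not 0 < count <= len(CHALLENGES):
        raise ValueError("count must be between 1 and the pool size")
    pool = list(CHALLENGES)
    chosen = []
    for _ in range(count):
        # secrets gives cryptographically strong randomness, so an attacker
        # cannot predict the sequence and record a video ahead of time.
        chosen.append(pool.pop(secrets.randbelow(len(pool))))
    return chosen


def verify_response(challenge, observed_actions):
    """Accept only if the observed actions match the challenge in order."""
    return list(observed_actions) == list(challenge)


challenge = issue_challenge(3)
```

The `observed_actions` argument stands in for the output of an action recognizer running on the tracked face, which is outside the scope of this sketch.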

For this purpose, you can use the correlation_tracker() class from the Dlib library. The program splits the video into separate frames for analysis. Once a face is found in one of the frames, its location is passed to the tracker, which follows the position of the object through the subsequent frames. Note that face recognition is still carried out on every frame. Given the frame rate of the video, this leaves an attacker practically no opportunity to tamper with the data.
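
A minimal sketch of such a tracking loop is shown below. The loop itself is library-independent; the `dlib_backends` and `video_frames` helpers, the `min_quality` cut-off of 7.0, and the video file name are assumptions made for the example, not values recommended by Dlib. Per-frame recognition is left out for brevity.

```python
def track_face(frames, detect, make_tracker, min_quality=7.0):
    """Follow one face through `frames`; return one box per frame (None = no face).

    detect(frame) returns a list of boxes; make_tracker(frame, box) returns an
    object with update(frame) -> quality score and position() -> current box.
    """
    tracker = None
    boxes = []
    for frame in frames:
        if tracker is None:
            found = detect(frame)
            if found:
                # Hand the first detected face over to the tracker.
                tracker = make_tracker(frame, found[0])
                boxes.append(found[0])
            else:
                boxes.append(None)
            continue
        quality = tracker.update(frame)
        if quality < min_quality:
            tracker = None  # track lost; detect again on the next frame
            boxes.append(None)
        else:
            boxes.append(tracker.position())
    return boxes


def dlib_backends():
    """Build detect/make_tracker on top of Dlib (imported lazily)."""
    import dlib
    detector = dlib.get_frontal_face_detector()

    class Tracker:
        def __init__(self, frame, box):
            self._tracker = dlib.correlation_tracker()
            self._tracker.start_track(frame, box)

        def update(self, frame):
            return self._tracker.update(frame)  # peak-to-side-lobe ratio

        def position(self):
            return self._tracker.get_position()

    return (lambda frame: list(detector(frame, 1))), Tracker


def video_frames(path):
    """Yield RGB frames from a video file using OpenCV (imported lazily)."""
    import cv2
    capture = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            yield cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    finally:
        capture.release()


# Example wiring (hypothetical file name):
# detect, Tracker = dlib_backends()
# boxes = track_face(video_frames("login.mp4"), detect, Tracker)
```

Passing the detector and tracker in as arguments keeps the loop easy to test with stand-ins. A gap in the returned boxes (a None between two tracked positions) is a signal that the face left the frame, which an authentication flow could treat as a reason to restart the check.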

Biometric authentication is gaining momentum. Face recognition is rightly considered one of the most promising areas: for all its convenience to the user (no need to remember logins and passwords), it provides a very high degree of protection because additional checks can be introduced. As the NIST reports show, face recognition is becoming more accurate and in certain cases is superior in this area.

Authentication using a user's photo has drawbacks because only a limited set of data can be obtained from a static image. Video is therefore one of the most effective additional methods of confirming that the person in the video is who they claim to be.