Facial recognition technology is becoming more widely used. Most people know it as a means of authentication or as a way to find specific people (criminals, for example), but we also encounter it in everyday life: when you upload photos, Facebook finds your friends in them and suggests tagging them, and Google Photos groups the people in your pictures and can show a collection for a specific person.
The development of biometric authentication systems is a logical consequence of the development of technology and society. Login-and-password authentication no longer meets modern reliability requirements and is inconvenient for the user, who has to remember or store credentials.
The process can be divided into the following steps:
- Face detection in an image (Haar classifier, histogram of oriented gradients, convolutional neural network, etc.);
- Determination of key points on the face (landmarks);
- Extraction of unique facial features and encoding them in a form the system can easily work with (for example, a 128-dimensional vector, or embedding);
- Comparison of the resulting representation with the database.
The most popular Python libraries for performing these steps are OpenCV and Dlib.
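Below is a minimal sketch of these four steps using Dlib. The model file names refer to the standard pretrained landmark and face recognition models distributed at dlib.net, and the image paths are placeholders, so treat it as an outline rather than a finished implementation.

```python
import dlib
import numpy as np

# Pretrained models from dlib.net, assumed to be downloaded next to the script
detector = dlib.get_frontal_face_detector()                 # HOG-based face detector
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
encoder = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

def face_embedding(rgb_image):
    """Return a 128-dimensional embedding for the first face in an RGB image, or None."""
    faces = detector(rgb_image, 1)                           # step 1: face detection
    if not faces:
        return None
    landmarks = predictor(rgb_image, faces[0])               # step 2: key points (landmarks)
    descriptor = encoder.compute_face_descriptor(rgb_image, landmarks)  # step 3: embedding
    return np.array(descriptor)

def is_same_person(embedding, known_embedding, threshold=0.6):
    """Step 4: compare with a stored embedding; 0.6 is the Euclidean distance
    threshold commonly used with this Dlib model."""
    return np.linalg.norm(embedding - known_embedding) < threshold

# Usage with placeholder file names: compare a login photo with an enrolled one
probe = face_embedding(dlib.load_rgb_image("login_attempt.jpg"))
enrolled = face_embedding(dlib.load_rgb_image("enrolled_user.jpg"))
if probe is None or enrolled is None:
    print("no face found")
else:
    print(is_same_person(probe, enrolled))
```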
For data on the accuracy of facial recognition systems, we can refer to the National Institute of Standards and Technology (NIST), part of the US Department of Commerce, where testing is carried out by qualified specialists on closed data sets. When evaluating face recognition algorithms, two types of errors are distinguished:
- Type I errors (FAR, false acceptance rate, or FMR, false match rate): the probability that the system admits a person who is not in the database (“wrongly admitted”);
- Type II errors (FRR, false rejection rate, or FNMR, false non-match rate): the probability that the system denies access to a person who is in the database (“wrongly rejected”).
At the time of writing, the lowest type II error rate (FNMR) in the “Wild Photos” category (shooting from almost any side, face rotation angles of up to 90 degrees relative to the camera, varied lighting) is about 3% at a type I error rate (FMR) of 0.001%. Under normal shooting conditions (“Visa-Border” in NIST reports: similar to shooting at an airport, with varied lighting, slight variations in angle, and people of different nationalities, the closest to real authentication conditions), the figures are 0.42% and 0.0001% respectively.
Algorithms can be tuned to reduce one type of error at the cost of increasing the other, depending on the business model (which is more critical: giving access to an unauthorized person or denying access to an authorized one).
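To illustrate that trade-off, here is a small sketch that computes both error rates for several decision thresholds. The distance distributions are synthetic stand-ins; real values would come from labeled genuine and impostor pairs of embeddings.

```python
import numpy as np

def error_rates(genuine_distances, impostor_distances, threshold):
    """FNMR: share of genuine pairs wrongly rejected; FMR: share of impostor pairs wrongly admitted."""
    fnmr = float(np.mean(genuine_distances >= threshold))
    fmr = float(np.mean(impostor_distances < threshold))
    return fnmr, fmr

# Synthetic embedding distances purely for illustration
rng = np.random.default_rng(0)
genuine = rng.normal(0.35, 0.10, 100_000)    # same-person pairs
impostor = rng.normal(0.80, 0.10, 100_000)   # different-person pairs

for t in (0.45, 0.55, 0.65):
    fnmr, fmr = error_rates(genuine, impostor, t)
    print(f"threshold={t:.2f}  FNMR={fnmr:.4%}  FMR={fmr:.4%}")
# Raising the threshold rejects fewer legitimate users (lower FNMR) but admits
# more impostors (higher FMR); the business model dictates where to sit.
```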
There is a question about the reliability of this authentication method: everyone knows of cases where an algorithm was deceived by a photograph or a recorded video of the user (spoofing). Technology is developing rapidly in this direction, and there are now many methods for determining that a living person is in front of the camera (liveness detection): analyzing fine details of the structure of the human face, tracking muscle micro-movements, using an additional infrared camera, building a 3D model of the face, and so on.
A lot of attention is being paid to authentication using video. A video is a sequence of frames (images), so the same recognition algorithm described at the beginning of the article applies. The additional verification methods can be classified as follows:
- Without user participation: the camera (possibly together with other cameras and sensors) reads all the necessary information on its own;
- With user participation: to confirm authentication, the user must follow certain instructions (smile, turn their head), as in the sketch after this list.
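As a rough illustration of the second option, the sketch below issues a random “turn your head” instruction and checks it with Dlib's 68-point landmarks. The yaw heuristic, the 0.08 threshold, and the model path are assumptions made for illustration, not a production liveness check.

```python
import random
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def head_turn_direction(rgb_frame):
    """Rough yaw estimate: compare the nose tip with the midpoint of the jaw line.
    Directions are in image coordinates; mapping to the user's own left/right
    depends on whether the camera preview is mirrored."""
    faces = detector(rgb_frame, 1)
    if not faces:
        return None
    pts = predictor(rgb_frame, faces[0])
    nose_x = pts.part(30).x                            # nose tip landmark
    jaw_mid_x = (pts.part(0).x + pts.part(16).x) / 2   # outermost jaw landmarks
    face_width = pts.part(16).x - pts.part(0).x
    offset = (nose_x - jaw_mid_x) / face_width         # normalized horizontal offset
    if offset > 0.08:
        return "right"                                 # nose shifted toward the image's right
    if offset < -0.08:
        return "left"
    return "center"

# Challenge-response: pick an instruction the attacker could not have recorded in advance
instruction = random.choice(["left", "right"])
print(f"Please turn your head to the {instruction}")
# ...then, for frames captured after the prompt:
# passed = head_turn_direction(frame) == instruction
```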
The second method is more reliable because different instructions can be given at different times, making it almost impossible for an attacker to prepare a suitable video in advance. It also requires no additional equipment (extra cameras and sensors), and no installation and configuration of such equipment, which makes the system simpler and more affordable without compromising its effectiveness. To implement this authentication method, you need software that can track the user's face throughout the entire process (to rule out photo substitution).
For this purpose, you can use the correlation_tracker() class from the Dlib library. The program splits the video into separate frames for analysis; if a frame contains a person's face, its location is handed to the tracker, which follows the position of that face across subsequent frames. Note that face recognition is still carried out on every frame, so given the frame rate of the video, the opportunity for data manipulation is reduced practically to zero.
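A sketch of that flow, using OpenCV to read frames and Dlib's correlation_tracker to follow the face; the video file name is a placeholder (a webcam index would also work), and the per-frame recognition step is only indicated in a comment.

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
tracker = dlib.correlation_tracker()
tracking = False

cap = cv2.VideoCapture("auth_session.mp4")   # placeholder path; 0 would use the default webcam
while True:
    ok, frame_bgr = cap.read()
    if not ok:
        break
    frame = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)    # Dlib expects RGB

    if not tracking:
        faces = detector(frame, 1)
        if faces:
            tracker.start_track(frame, faces[0])          # lock onto the detected face
            tracking = True
    else:
        tracker.update(frame)                             # follow the same face on this frame
        box = tracker.get_position()                      # current bounding box (drectangle)
        # box.left(), box.top(), box.right(), box.bottom() give the coordinates;
        # recognition (landmarks + embedding, as above) can still be run on this
        # region for every frame, so a swapped photo would break the track.

cap.release()
```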
Biometric authentication is gaining momentum, and facial recognition is rightly considered one of its most promising areas: alongside the convenience for the user (no need to remember logins and passwords), it provides a very high degree of protection thanks to the possibility of adding extra checks. As the NIST reports show, facial recognition is becoming more accurate and in some scenarios already achieves very low error rates.
Authentication against a single photo of the user is limited by the amount of data that can be obtained from a static image. Video is therefore one of the most effective additional ways of confirming that the person on camera is who they claim to be.