CN110738607A - Method, device and equipment for shooting driving license based on artificial intelligence and storage medium - Google Patents

Method, device and equipment for shooting driving license based on artificial intelligence and storage medium

Info

Publication number
CN110738607A
Authority
CN
China
Prior art keywords
face
picture
image
driving license
shooting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910846126.3A
Other languages
Chinese (zh)
Inventor
Xiao Zhiyong (肖志勇)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN201910846126.3A
Publication of CN110738607A

Classifications

    • G06T5/80
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • G06V40/172 - Classification, e.g. identification

Abstract

The application relates to the field of artificial intelligence, and provides a method, a device, equipment and a storage medium for shooting a driving license based on artificial intelligence. The method comprises: collecting a picture according to first configuration parameters corresponding to a driving-license shooting mode; locating the position of a face in the picture; analyzing the face in the picture and judging whether the face meets the conditions of the first configuration parameters; cutting the face image according to the size set in the first configuration parameters to obtain a face image of a preset size; adjusting the background color of the face image according to the first configuration parameters; compressing the face image to a target driving-license image of a preset resolution according to the pixels set in the first configuration parameters; jumping to a payment page and prompting the user to pay for the shooting service on the payment page; and, after detecting that the user has finished paying, generating and storing an electronic receipt of the driving-license photo.

Description

Method, device and equipment for shooting driving license based on artificial intelligence and storage medium
Technical Field
The application relates to the field of artificial intelligence, and in particular to a method, a device, equipment and a storage medium for shooting a driving license based on artificial intelligence.
Background
At present, driving-license photos are shot manually at a photo studio; after the studio produces the receipt, the receipt and the photo are printed, and the driving license is then handled in person at a window of the traffic-police handling department on a working day. Therefore, on the one hand, obtaining the license photo costs time for shooting the photo, obtaining the receipt and printing the photo at the studio; on the other hand, the driver has to take leave and carry the receipt and the printed photo to the traffic-police handling department in person on a working day.
Disclosure of Invention
The application provides a method, a device, equipment and a storage medium for shooting a driving license based on artificial intelligence, which can solve the problems of complex flow and low efficiency in shooting driving-license photos in the prior art.
In a first aspect, the present application provides a method of shooting a driving license based on artificial intelligence, the method comprising:
detecting that a user logs in to a traffic-police system at a terminal, and, after receiving an instruction that the user selects a driving-license shooting mode at the terminal, starting a shooting device of the terminal according to the instruction, wherein the driving-license shooting mode corresponds to a group of configuration parameters of image characteristics;
detecting an instruction of the user to start the driving-license shooting mode, and acquiring a picture shot by the shooting device according to first configuration parameters corresponding to the driving-license shooting mode;
according to a face recognition technology, recognizing a face in the picture shot by the camera, and displaying a positioning frame in the picture to locate the position of the face in the picture, wherein the positioning frame is used for locating a detection object matched with a preset shooting object in the picture;
analyzing the face in the picture, and judging whether the face in the picture meets the conditions of the first configuration parameters;
cutting the face image according to the size set in the first configuration parameters to obtain a face image of a preset size, adjusting the background color of the face image according to the background color set in the first configuration parameters so that the face image of the preset size meets the background-color requirement, and compressing the face image of the preset size according to the pixels set in the first configuration parameters to obtain a target driving-license image of a preset resolution;
after the target driving-license image is generated, jumping to a payment page, and prompting the user to pay for the shooting service on the payment page;
and generating and storing an electronic receipt of the driving-license photo after detecting that the user has completed payment.
In one possible design, after the recognizing the face in the picture captured by the camera and before the analyzing the face in the picture, the method further comprises:
carrying out edge processing on the image in the picture shot by the camera by utilizing a canny edge detection algorithm to obtain an edge binary image;
counting the number of edge points and non-edge points in the edge binary image, and calculating the ratio of the number of the edge points to the sum of the number of the edge points and the number of the non-edge points;
if the calculated ratio is larger than a set threshold value, determining that an image in a picture shot by the camera is a clear human face image;
processing the face image, and positioning face key points to obtain key point data;
and extracting statistical characteristics according to the key points of the face, dividing the positioned face image into a plurality of characteristic regions, and detecting an eye region, an eyebrow region, a nose region, a mouth region and an ear region from each characteristic region according to a preset image detection algorithm.
In one possible design, after the recognizing the face in the picture captured by the camera and before the analyzing the face in the picture, the method further comprises:
after the position of the face in the picture is located, adopting a face detection frame of a Multi-task description operator to detect face characteristic points of the face in the picture, comparing the face in the picture with a preset face image, and if the face in the picture is determined to be incomplete through comparison, detecting the coordinates of each pixel point on the face outline in the picture and the missing part of the face;
calculating the coordinate difference between the coordinates of the missing part of the face and the coordinates of each pixel point on the face contour in the picture;
determining a moving direction according to the coordinate difference, wherein the moving direction comprises a direction in which the terminal is to be moved or a direction in which the face is to be moved;
sending first prompt information indicating the moving direction, wherein the first prompt information is used for instructing the user to move the face in the picture until the face in the picture is completely displayed, or for instructing the user to move relative to the terminal until the face in the picture is completely displayed;
and detecting the face in the picture in real time, matching the face detected in real time with a preset face image, and sending out second prompt information after detecting that the face in the picture is completely displayed.
In one possible design, the detecting face feature points of the face in the picture by the face detection framework using the Multi-task description operator comprises:
classifying the faces in the picture by adopting the following formula:
$$L_i^{det} = -\left(y_i^{det}\log(p_i) + (1 - y_i^{det})\log(1 - p_i)\right), \qquad y_i^{det} \in \{0, 1\}$$
wherein $L_i^{det}$ is the cross-entropy loss function for face classification, $p_i$ is the probability that the detected region is a face, and $y_i^{det}$ is the real face/background label;
the bounding-box regression is implemented using the following formula:
$$L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2, \qquad y_i^{box} \in \mathbb{R}^4$$
wherein $L_i^{box}$ is the regression loss calculated by the Euclidean distance, $\hat{y}_i^{box}$ is the result predicted by the network, and $y_i^{box}$ are the actual real background-box coordinates;
the coordinates of each pixel point on the face contour in the picture and of the missing part of the face are detected using the following formula:
$$L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2, \qquad y_i^{landmark} \in \mathbb{R}^{10}$$
wherein $L_i^{landmark}$ is the regression loss calculated as the Euclidean distance between the predicted pixel-point coordinates and the actual pixel-point coordinates, and this Euclidean distance is minimized; $\hat{y}_i^{landmark}$ is the prediction of the network, and $y_i^{landmark}$ are the actual real landmark coordinates.
In one possible design, after analyzing the face in the picture, the method further comprises:
detecting the face in the picture by a face recognition technology, and recognizing and positioning the key features on the face, wherein the key features of the face comprise feature points corresponding to the eyes, cheekbones, nose, mouth, chin and the like;
calculating the intervals among the characteristic points and the proportional relation among the intervals;
judging whether the face angle meets the face righting condition or not according to the space between the feature points and the proportional relation between the spaces;
if the spacings among the feature points and the proportional relations among the spacings are determined to be in the normal range, the face is considered to meet the face-righting condition; if the spacings among some or all of the feature points, or the proportional relations among the spacings, are determined not to be in the normal range, it is determined that the face angle does not meet the face-righting condition.
In one possible design, after analyzing the face in the picture, the method further comprises:
judging whether wearing articles and/or facial texture features exist on the face, wherein the wearing articles comprise at least one item among hats, glasses and clothing, and the facial texture features refer to makeup traces;
if it is judged that the face contains a wearing article, judging, according to the first configuration parameters, whether the wearing article meets the configuration conditions of wearing articles defined in the first configuration parameters;
if it is judged that a makeup trace exists on the face, judging, according to the first configuration parameters, whether the makeup trace meets the configuration conditions of facial texture features defined in the first configuration parameters;
In one possible design, if the driving-license shooting mode is a driving photo and the wearing article is recognized to include a light-colored jacket, the configuration condition of the wearing article is considered not to be met, and after the configuration condition is considered not to be met, one of the following is performed:
sending a fifth prompt for prompting the user to change into dark-colored clothes, starting timing from the sending of the fifth prompt information, detecting the picture in real time, identifying the color of the jacket in the picture by an image recognition technology, and judging whether the color of the jacket accords with the first configuration parameters; if the timed duration exceeds a preset duration and the color of the jacket in the picture still does not accord with the first configuration parameters, ending the shooting operation;
or displaying a virtual clothes icon in the area where the picture is located; sending a sixth prompt, wherein the sixth prompt is used for prompting the user to put on virtual clothes; detecting an input of the user for the virtual clothes icon, and overlaying the virtual clothes selected by the input onto the picture so that the virtual clothes are worn at a specified position in the picture; and, when it is detected that the virtual clothes are worn at the designated position, determining that the face image in the current picture meets the configuration condition of the wearing article.
In a second aspect, the present application provides a device for self-service driving-license shooting, which has functions corresponding to the method for shooting a driving license based on artificial intelligence provided in the first aspect above.
The device for self-service driving-license shooting comprises:
the detection module is used for detecting that a user logs in the traffic police system at the terminal;
the transceiver module is used for receiving an instruction that the user selects a driving-license shooting mode at the terminal, wherein the driving-license shooting mode corresponds to a group of configuration parameters of image characteristics;
the processing module is used for: starting the shooting device of the terminal according to the instruction received by the transceiver module; detecting, through the detection module, an instruction of the user to start the driving-license shooting mode; collecting a picture shot by the shooting device according to first configuration parameters corresponding to the driving-license shooting mode; identifying a face in the picture shot by the camera according to a face recognition technology, and displaying a positioning frame in the picture to locate the position of the face in the picture, wherein the positioning frame is used for locating a detection object matched with a preset shooting object in the picture; analyzing the face in the picture, and judging whether the face in the picture meets the conditions of the first configuration parameters; cutting the face image according to the size set in the first configuration parameters to obtain a face image of a preset size; adjusting the background color of the face image according to the background color set in the first configuration parameters so that the face image of the preset size meets the background-color requirement; compressing the face image of the preset size according to the pixels set in the first configuration parameters to obtain a target driving-license image of a preset resolution; jumping to a payment page and prompting the user to pay for the shooting service on the payment page; and generating and storing an electronic receipt of the driving-license photo after detecting that the user has completed payment.
In one possible design, after recognizing the face in the picture captured by the camera and before analyzing the face in the picture, the processing module is further configured to:
carrying out edge processing on the image in the picture shot by the camera by utilizing a canny edge detection algorithm to obtain an edge binary image;
counting the number of edge points and non-edge points in the edge binary image, and calculating the ratio of the number of the edge points to the sum of the number of the edge points and the number of the non-edge points;
if the calculated ratio is larger than a set threshold value, determining that an image in a picture shot by the camera is a clear human face image;
processing the face image, and positioning face key points to obtain key point data;
and extracting statistical characteristics according to the key points of the face, dividing the positioned face image into a plurality of characteristic regions, and detecting an eye region, an eyebrow region, a nose region, a mouth region and an ear region from each characteristic region according to a preset image detection algorithm.
In one possible design, after recognizing the face in the picture captured by the camera and before analyzing the face in the picture, the processing module is further configured to:
after the detection module locates the position of the face in the picture, adopting a face detection frame of a Multi-task description operator to detect face characteristic points of the face in the picture, comparing the face in the picture with a preset face image, and if the face in the picture is determined to be incomplete through comparison, detecting the coordinates of each pixel point on the face outline in the picture and the face missing part;
calculating the coordinate difference between the coordinates of the missing part of the face and the coordinates of each pixel point on the face contour in the picture;
determining a moving direction according to the coordinate difference, wherein the moving direction comprises a direction in which the terminal is to be moved or a direction in which the face is to be moved;
sending, through the transceiver module, first prompt information indicating the moving direction, wherein the first prompt information is used for instructing the user to move the face in the picture until the face in the picture is completely displayed, or for instructing the user to move relative to the terminal until the face in the picture is completely displayed;
and detecting the face in the picture in real time through the detection module, matching the face detected in real time with a preset face image, and sending second prompt information through the transceiving module after the detection module detects that the face in the picture is completely displayed.
In one possible design, the detection module is specifically configured to:
classifying the faces in the picture by adopting the following formula:
$$L_i^{det} = -\left(y_i^{det}\log(p_i) + (1 - y_i^{det})\log(1 - p_i)\right), \qquad y_i^{det} \in \{0, 1\}$$
wherein $L_i^{det}$ is the cross-entropy loss function for face classification, $p_i$ is the probability that the detected region is a face, and $y_i^{det}$ is the real face/background label;
the bounding-box regression is implemented using the following formula:
$$L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2, \qquad y_i^{box} \in \mathbb{R}^4$$
wherein $L_i^{box}$ is the regression loss calculated by the Euclidean distance, $\hat{y}_i^{box}$ is the result predicted by the network, and $y_i^{box}$ are the actual real background-box coordinates;
the coordinates of each pixel point on the face contour in the picture and of the missing part of the face are detected using the following formula:
$$L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2, \qquad y_i^{landmark} \in \mathbb{R}^{10}$$
wherein $L_i^{landmark}$ is the regression loss calculated as the Euclidean distance between the predicted pixel-point coordinates and the actual pixel-point coordinates, and this Euclidean distance is minimized; $\hat{y}_i^{landmark}$ is the prediction of the network, and $y_i^{landmark}$ are the actual real landmark coordinates.
In one possible design, the processing module is specifically configured to:
detect the face in the picture by a face recognition technology, and recognize and position the key features on the face, wherein the key features of the face comprise feature points corresponding to the eyes, cheekbones, nose, mouth, chin and the like;
calculating the intervals among the characteristic points and the proportional relation among the intervals;
judging whether the face angle meets the face righting condition or not according to the space between the feature points and the proportional relation between the spaces;
if the spacings among the feature points and the proportional relations among the spacings are determined to be in the normal range, the face is considered to meet the face-righting condition; if the spacings among some or all of the feature points, or the proportional relations among the spacings, are determined not to be in the normal range, it is determined that the face angle does not meet the face-righting condition.
In one possible design, the processing module is specifically configured to:
judge whether wearing articles and/or facial texture features exist on the face, wherein the wearing articles comprise at least one item among hats, glasses and clothing, and the facial texture features refer to makeup traces;
if it is judged that the face contains a wearing article, judge, according to the first configuration parameters, whether the wearing article meets the configuration conditions of wearing articles defined in the first configuration parameters;
if it is judged that a makeup trace exists on the face, judge, according to the first configuration parameters, whether the makeup trace meets the configuration conditions of facial texture features defined in the first configuration parameters;
In one possible design, if the driving-license shooting mode is a driving photo and the wearing article is recognized to include a light-colored jacket, the configuration condition of the wearing article is considered not to be met, and after the configuration condition is considered not to be met, one of the following is performed:
sending a fifth prompt through the transceiver module, wherein the fifth prompt is used for prompting the user to change into dark-colored clothes, starting timing from the sending of the fifth prompt information, detecting the picture in real time, identifying the color of the jacket in the picture by an image recognition technology, and judging whether the color of the jacket accords with the first configuration parameters; if the timed duration exceeds a preset duration and the detection module still does not detect that the color of the jacket in the picture accords with the first configuration parameters, ending the shooting operation;
or displaying a virtual clothes icon in the area where the picture is located; sending a sixth prompt through the transceiver module, wherein the sixth prompt is used for prompting the user to put on virtual clothes; detecting, through the detection module, an input of the user for the virtual clothes icon, and overlaying the virtual clothes selected by the input onto the picture so that the virtual clothes are worn at a specified position in the picture; and, when it is detected that the virtual clothes are worn at the designated position, determining that the face image in the current picture meets the configuration condition of the wearing article.
A further aspect of the application provides a computer device comprising at least one processor, a memory and a transceiver that are connected to one another, wherein the memory is configured to store program code and the processor is configured to invoke the program code in the memory to perform the method of the first aspect above.
Yet another aspect of the present application provides a computer storage medium comprising instructions that, when executed on a computer, cause the computer to perform the method of the first aspect above.
Compared with the prior art, in the embodiments of the application, after the shooting device is started, the corresponding first configuration parameters are called according to the selected driving-license shooting mode, the face in the picture is checked and adjusted against the first configuration parameters, and the face image shot by the camera is processed to obtain the driving-license photo, so that the user can complete the driving-license photo by self-service.
Drawings
FIG. 1 is a schematic flow chart of a method for shooting a driving license based on artificial intelligence in an embodiment of the present application;
FIG. 2 is a schematic diagram of an interface displaying the first prompt information on a screen in an embodiment of the present application;
FIG. 3 is a schematic diagram of an interface displaying the second prompt information on a screen in an embodiment of the present application;
FIG. 4 is a schematic diagram of a structure of the device for self-service driving-license shooting in an embodiment of the present application;
fig. 5 is a schematic diagram of a structure of a computer device in an embodiment of the present application.
The objectives, features and advantages of the present application will be further explained below in connection with the embodiments and with reference to the accompanying drawings.
Detailed Description
It should be understood that the terms "first", "second" and the like in the description and claims of this application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The application provides a method, a device, equipment and a storage medium for shooting a driving license based on artificial intelligence, which can be used for shooting the driving-license photo by self-service based on artificial intelligence.
In order to solve the technical problems, the application mainly provides the following technical scheme:
an AI image recognition and image processing technology is introduced into a traffic police system, so that an interface for self-service handling of personal driving license replenishment and exchange services on line is provided for a user. Specifically, a user opens a camera of the terminal, a traffic police system collects pictures shot by the camera in real time, the situations of facial expression, makeup, hat wearing, glasses wearing, face angle, dressing color and the like are automatically recognized in the shooting process, whether the shot pictures meet requirements of driver's license pictures or not is automatically prompted, no special requirements are provided for external scenes such as a shooting site and a shooting background, the driver's license pictures are automatically generated after the shooting is finished, the shooting effect of the pictures is immediately checked, the pictures can be shot again under the unsatisfactory situation, the pictures can be stored in local equipment after the shooting is finished, and the pictures are backed up and simultaneously supported to be stored in a private cloud space. The operation on electronic terminals such as mobile phones and computers is supported.
Referring to fig. 1, a method for shooting a driving license based on artificial intelligence in an embodiment of the present application is described below, the method comprising the following steps:
101. and after detecting that a user logs in the traffic police system at the terminal and receiving an instruction that the user selects a driving license shooting mode at the terminal, starting a shooting device of the terminal according to the instruction.
After the shooting device of the terminal is started, a driving-license shooting mode is displayed in a function menu. The driving-license shooting mode is a shooting mode dedicated to shooting driving-license photos and corresponds to a set of configuration parameters of image features, and the configuration parameters are configured in advance in the driving-license shooting-mode system of the terminal.
The configuration parameters at least include the following types of parameters: size, background color, pixels, exposure, configuration conditions of wearing articles, configuration conditions of facial texture features, and configuration conditions of facial expression. The size indicates the size of the driving-license photo, for example 32 mm × 22 mm; the background color indicates the background color of the photo, for example blue for an identification photo; the pixels indicate the resolution of the photo; and the wearing-article conditions indicate special requirements of the photo with respect to worn items, such as no hat, no loose hair covering the face, no glasses, no makeup, no hiding of the eyes and no light-colored clothes, or a requirement that the five sense organs be fully exposed, e.g. the eyes and ears visible.
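For illustration only, one group of such configuration parameters could be organized as a simple structure; the field names and default values below are hypothetical and merely mirror the example values in this section, not the patent's actual data format.

```python
from dataclasses import dataclass

@dataclass
class ShootingConfig:
    """Hypothetical container for one group of driving-license shooting parameters."""
    size_mm: tuple = (32, 22)               # photo size in millimetres (height, width)
    background_color: str = "white"         # required background colour
    head_width_px: tuple = (165, 189)       # allowed head width range, in pixels
    head_length_px: tuple = (224, 260)      # allowed head length range, in pixels
    hairline_margin_px: tuple = (10, 20)    # hairline distance from the photo's upper edge
    forbidden_wearing: tuple = ("hat", "glasses", "light-colored clothes")
    allow_makeup: bool = False              # facial-texture (makeup) configuration condition

# one configuration group per driving-license shooting mode
first_config = ShootingConfig()
```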
102. Detect an instruction of the user to start the driving-license shooting mode, and acquire the picture shot by the shooting device according to the first configuration parameters corresponding to the driving-license shooting mode.
For the characteristics of the first configuration parameters, reference may be made to the explanation of the configuration parameters in step 101; the first configuration parameters may be a subset or the complete set of those configuration parameters, and the present application does not limit the parameter items included in the first configuration parameters. Similar points are not repeated below.
In some embodiments, the picture is a dynamically shot video, and no photo is taken from the picture until a shooting instruction from the user is detected.
103. According to the face recognition technology, recognize the face in the picture shot by the camera, and display a positioning frame in the picture to locate the position of the face in the picture.
The positioning frame is used for positioning a detection object matched with a preset shooting object in the picture. For example, the driving license shooting mode is used for shooting a human face, and if it is detected that the detection object in the picture is a dog face, the positioning frame is not displayed; and if the detected object in the picture is detected to be a human face, displaying the positioning frame.
In some embodiments, after the detection object in the picture is detected to be a face, edge processing can be performed on the image in the picture shot by the camera using a canny edge detection algorithm to obtain an edge binary image. The numbers of edge points and non-edge points in the edge binary image are counted, and the ratio of the number of edge points to the sum of the numbers of edge points and non-edge points is calculated. If the calculated ratio is greater than a set threshold, the image in the picture shot by the camera is determined to be a clear face image. A series of data processing operations is then performed on the face image, and key-point positioning is performed to obtain key-point data; according to statistical features extracted from the face key points, the positioned face image is divided into a plurality of feature regions, and an eye region, an eyebrow region, a nose region, a mouth region and an ear region are detected from the feature regions according to a preset image detection algorithm, wherein the image detection algorithm includes R-CNN, Fast R-CNN and the like.
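A minimal sketch of this sharpness check using OpenCV's Canny detector; the Canny thresholds and the edge-ratio threshold are assumptions, since the text only refers to "a set threshold".

```python
import cv2

def is_sharp_face_image(gray_image, ratio_threshold=0.02):
    """Return True when the ratio of edge points in the Canny edge binary image
    to all points (edge + non-edge) exceeds the set threshold."""
    edges = cv2.Canny(gray_image, 100, 200)   # edge binary image; 255 marks edge points
    edge_points = int((edges > 0).sum())      # count of edge points
    total_points = edges.size                 # edge points + non-edge points
    return edge_points / total_points > ratio_threshold
```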
104. Analyze the face in the picture, and judge whether the face in the picture meets the conditions of the first configuration parameters.
For example, the first configuration parameters include a size, a background color, pixels, a configuration condition of wearing articles and a configuration condition of facial texture features. Then the size of the driving-license photo to be shot this time is 32 mm × 22 mm; the background color of the photo is white; the pixels of the photo require a head width of 165 to 189 pixels and a head length of 224 to 260 pixels, with the hairline 10 to 20 pixels from the edge of the photo; and the shot photo must not include a hat, loose hair covering the face, glasses, makeup, hidden eyes, light-colored clothes, and so on. If, according to the analysis and judgment, at least one item among the size, background color, pixels and wearing articles in the picture does not accord with the corresponding parameter value, step 105 can be performed for corresponding adjustment.
In some embodiments, considering that a qualified face may not be captured directly, the face in the picture can also be filtered and prompted before the driving-license photo is formally captured or generated, as follows.
(1) After the position of the face in the picture is located, adopting a face detection frame of a Multi-task description operator to detect face characteristic points of the face in the picture, comparing the face in the picture with a preset face image, and if the face in the picture is determined to be incomplete through comparison, detecting the coordinates of each pixel point on the face outline in the picture and the face missing part.
In some embodiments, the Multi-task description operator comprises a face classifier, bounding-box regression and coordinate localization.
The face classifier classifies the faces in the picture using the following formula:
$$L_i^{det} = -\left(y_i^{det}\log(p_i) + (1 - y_i^{det})\log(1 - p_i)\right), \qquad y_i^{det} \in \{0, 1\}$$
wherein $L_i^{det}$ is the cross-entropy loss function for face classification, $p_i$ is the probability that the detected region is a face, and $y_i^{det}$ is the real face/background label;
the bounding-box regression is implemented using the following formula:
$$L_i^{box} = \left\| \hat{y}_i^{box} - y_i^{box} \right\|_2^2, \qquad y_i^{box} \in \mathbb{R}^4$$
wherein $L_i^{box}$ is the regression loss calculated by the Euclidean distance, $\hat{y}_i^{box}$ is the result predicted by the network, and $y_i^{box}$ are the actual real background-box coordinates;
the coordinates of each pixel point on the face contour in the picture and of the missing part of the face are detected using the following formula:
$$L_i^{landmark} = \left\| \hat{y}_i^{landmark} - y_i^{landmark} \right\|_2^2, \qquad y_i^{landmark} \in \mathbb{R}^{10}$$
wherein $L_i^{landmark}$ is the regression loss calculated as the Euclidean distance between the predicted pixel-point coordinates and the actual pixel-point coordinates, and this Euclidean distance is minimized; $\hat{y}_i^{landmark}$ is the prediction of the network, and $y_i^{landmark}$ are the actual real landmark coordinates.
Since there are 5 landmark points in total and each point has 2 coordinates, $y_i^{landmark}$ belongs to a ten-tuple, i.e. $y_i^{landmark} \in \mathbb{R}^{10}$.
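As a rough numerical illustration of the three losses above (a sketch only, for a single sample; the variable names follow the notation of this section rather than the patent's original images):

```python
import numpy as np

def det_loss(p, y_det):
    """Cross-entropy face/background classification loss; y_det is 0 or 1."""
    return -(y_det * np.log(p) + (1 - y_det) * np.log(1 - p))

def box_loss(y_box_pred, y_box_true):
    """Squared Euclidean distance between predicted and real box coordinates (4 values)."""
    return float(np.sum((np.asarray(y_box_pred) - np.asarray(y_box_true)) ** 2))

def landmark_loss(y_lmk_pred, y_lmk_true):
    """Squared Euclidean distance over 5 landmarks x 2 coordinates (a ten-tuple)."""
    return float(np.sum((np.asarray(y_lmk_pred) - np.asarray(y_lmk_true)) ** 2))
```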
(2) Calculate the coordinate difference between the coordinates of the missing part of the face and the coordinates of each pixel point on the face contour in the picture.
(3) Determine the moving direction according to the coordinate difference (a sketch of this logic is given after step (5) below).
The moving direction comprises a direction in which the terminal is to be moved or a direction in which the face is to be moved.
(4) Issue first prompt information indicating the moving direction.
The first prompt information is used for instructing the user to move the face in the picture until the face in the picture is completely displayed, or for instructing the user to move relative to the terminal until the face in the picture is completely displayed.
In some embodiments, the first prompt information can be a notification message, a voice prompt or a horizontal arrow. The notification message can float on the screen in the form of an icon, a pop-up box or a bullet-screen comment; the voice prompt can be "please raise your head", "please move left", "please move right", "please open your eyes", and so on.
(5) And detecting the face in the picture in real time, matching the face detected in real time with a preset face image, and sending out second prompt information after detecting that the face in the picture is completely displayed.
The preset face image comprises the five sense organs, hair, neck and clothes. The size, background color, pixels, wearing-article configuration conditions and facial texture features of the preset face image all accord with the values set by the first configuration parameters.
Similarly, the second prompt information may be a notification message, a voice prompt or an alert icon. The notification message may float on the screen in the form of an icon, a pop-up box or a bullet-screen comment; the voice prompt may be "please stop moving", "OK", and so on; the alert icon may be a static or dynamic symbol such as "!" or "√".
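A sketch of the movement-direction logic of steps (2) to (4) above. The coordinate convention (origin at the top-left, x to the right, y downwards) and the prompt strings are illustrative assumptions:

```python
def first_prompt(missing_cx, missing_cy, frame_w, frame_h):
    """Map the coordinate difference between the missing face part and the picture
    boundary to a moving direction for the first prompt information."""
    if missing_cx < 0:
        return "please move right"   # face spills over the left edge of the picture
    if missing_cx > frame_w:
        return "please move left"    # face spills over the right edge
    if missing_cy < 0:
        return "please move down"    # face spills over the top edge
    if missing_cy > frame_h:
        return "please move up"      # face spills over the bottom edge
    return "face fully displayed"    # no missing part: trigger the second prompt
```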
In some embodiments, the exposure level of the picture in the frame is detected, and if the exposure level does not meet the requirement of the driving-license photo (e.g. under-exposure or over-exposure), third prompt information is sent or the exposure level of the frame is adaptively adjusted until the exposure level of the picture meets the condition of the driving-license photo.
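The patent does not specify an exposure metric, so one plausible implementation (an assumption) is to test the mean luminance of the frame against under- and over-exposure bounds:

```python
import cv2

def exposure_ok(bgr_frame, low=60, high=190):
    """Return True when mean luminance lies inside the assumed acceptable band;
    otherwise the third prompt is sent or the exposure is adaptively adjusted."""
    gray = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2GRAY)
    return low <= float(gray.mean()) <= high
```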
105. Cut the face image according to the size set in the first configuration parameters to obtain a face image of the preset size; adjust the background color of the face image according to the background color set in the first configuration parameters so that the face image of the preset size meets the background-color requirement; and compress the face image of the preset size according to the pixels set in the first configuration parameters to obtain a target driving-license image of the preset resolution.
In some embodiments, standardized processing such as background-color adjustment and image compression can be performed on the face image according to the first configuration parameters to obtain the driving-license photo, so that the photo finally conforms to the standard format of the driving-license photo and the user's expectations.
For example, if the first configuration parameters include a size of 32 mm × 22 mm, a white background color, pixels requiring a head width of 165 to 189 pixels and a head length of 224 to 260 pixels with the hairline 10 to 20 pixels from the upper edge of the photo, together with the wearing-article and facial-texture conditions, then the face image in the picture is cut to 32 mm × 22 mm, the background color is adjusted to white, the pixels are adjusted to a head width of 170 pixels and a head length of 250 pixels, the hairline is adjusted to 15 pixels from the upper edge of the face image, and the target driving-license image is finally obtained.
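A minimal OpenCV sketch of step 105's cut / recolor / compress pipeline. The crop box, the naive corner-based background replacement and the output resolution are simplifying assumptions, not the patent's actual processing:

```python
import cv2
import numpy as np

def standardize_photo(bgr, face_box, out_size=(295, 413), bg_color=(255, 255, 255)):
    """Cut to the face box, repaint near-background pixels, and resize.
    face_box = (x, y, w, h); out_size = (width, height) in pixels (assumed values)."""
    x, y, w, h = face_box
    photo = bgr[y:y + h, x:x + w].copy()              # cutting to the set size
    corner = photo[0, 0].astype(np.int16)             # assume the corner pixel is background
    dist = np.abs(photo.astype(np.int16) - corner).sum(axis=2)
    photo[dist < 60] = bg_color                       # adjust the background colour
    return cv2.resize(photo, out_size, interpolation=cv2.INTER_AREA)  # compress to target pixels
```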
106. And after the target driving license image is generated, jumping to a payment page, prompting the user to pay for the shooting service on the payment page, and generating and storing an electronic receipt of the driving license photo after the user finishes payment.
The electronic receipt is an image file in TIF format. The generated electronic receipt indicates that the shot driving-license photo has been checked and found qualified on the nationally designated verification website. Meanwhile, the driving-license photo is also stored in a database of the traffic system, and when the driving license is handled, the photo can be retrieved by sending the image number on the electronic receipt to the database of the traffic-police system.
Compared with the prior art, in the embodiments of the application, after the shooting device is started, the corresponding first configuration parameters are called according to the selected driving-license shooting mode, the face in the picture is checked and adjusted against the first configuration parameters, and the face image shot by the camera is processed to obtain the driving-license photo, so that the user can complete the driving-license photo by self-service.
Optionally, in some embodiments, in order to further increase the success rate and the qualification rate of generating driving-license photos, it may also be determined whether the face in the picture meets the face-righting (frontal-face) condition, as follows.
(1) Detect the face in the picture by the face recognition technology, and recognize and position the key features on the face. The key features of the face comprise feature points corresponding to the eyes, cheekbones, nose, mouth, chin and the like.
If the feature points cannot all be identified, the face is considered not to meet the face-righting condition.
(2) And calculating the intervals among the characteristic points and the proportional relation among the intervals.
(3) And judging whether the face angle meets the face righting condition or not according to the space between the characteristic points and the proportional relation between the spaces.
For example, if only one eye is recognized, the face is considered to be a side face and not to meet the face-righting condition. For another example, if two eyes are recognized but their spacings are not in the normal range, at least one eye is closed, or the face is not facing the camera (such as looking down or looking up), the face is likewise considered not to meet the face-righting condition.
For example, 2D data or 3D data of the face in the picture is identified, and if the face deviates too much for a full-angle (frontal) face to be displayed, fourth prompt information is sent, where the fourth prompt information is used to prompt the user to adjust the angle of the face in the picture, for example "turn right" or "turn left".
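A sketch of the face-righting judgment from feature-point spacings. The choice of landmarks and the tolerance standing in for the "normal range" are illustrative assumptions:

```python
import numpy as np

def is_frontal(left_eye, right_eye, nose_tip, tolerance=0.25):
    """Consider the face frontal when the nose tip sits roughly midway between the
    eyes, i.e. the left/right eye-to-nose spacing ratio is close to 1."""
    left_eye, right_eye, nose_tip = map(np.asarray, (left_eye, right_eye, nose_tip))
    d_left = float(np.linalg.norm(nose_tip - left_eye))
    d_right = float(np.linalg.norm(nose_tip - right_eye))
    if min(d_left, d_right) == 0.0:
        return False                                   # degenerate detection: treat as side face
    return abs(d_left / d_right - 1.0) <= tolerance    # outside the range: not frontal
```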
In some embodiments, considering that each shooting mode sets different additional conditions for the driving-license photo, the additional conditions can also be used to determine whether the face in the picture meets the conditions of the driving-license photo in the current driving-license shooting mode, as follows.
(1) Judge whether wearing articles and/or facial texture features exist on the face, wherein the wearing articles comprise hats, glasses and clothing. The facial texture features are makeup traces such as drawn eyebrows, lipstick, eye shadow, false eyelashes or cosmetic contact lenses.
For example, taking the detection of wearing articles as an example, a feature extraction algorithm such as the Histogram of Oriented Gradients (HOG) is used to detect feature objects in the area around the face or on the face, as introduced below:
firstly, the face image in the picture is grayed, and then the grayed face image is normalized by a Gamma correction method so as to adjust the contrast of the face image, reduce the influence caused by local shadow and illumination changes, and suppress noise interference;
the gradient (including magnitude and direction) of each pixel in the face image is computed to capture contour information.
Dividing the face image into small cells, such as 6 × 6 pixels/cell;
forming a block from every several cells (e.g. 3 × 3 cells/block), and connecting the feature descriptors of all the cells in one block in series to obtain the HOG feature descriptor of the block;
connecting the HOG feature descriptors of all blocks in the face image in series to obtain the HOG feature descriptor of the face image (i.e. the final feature vector used for classification);
and analyzing, according to the HOG feature descriptor, whether key feature points of the face are missing in the face image.
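The steps above follow the classic HOG recipe. A short sketch with scikit-image, using the 6 × 6 pixels/cell and 3 × 3 cells/block figures from the text (the gamma value is an assumption):

```python
from skimage import color, exposure
from skimage.feature import hog

def face_hog_descriptor(rgb_face):
    """Gray the face image, gamma-normalize it, and return its HOG feature vector."""
    gray = color.rgb2gray(rgb_face)                  # graying step
    gray = exposure.adjust_gamma(gray, gamma=0.5)    # Gamma correction step
    return hog(gray,
               orientations=9,
               pixels_per_cell=(6, 6),               # 6 x 6 pixels per cell
               cells_per_block=(3, 3),               # 3 x 3 cells per block
               block_norm="L2-Hys")                  # block descriptors concatenated in series
```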
For another example, taking the detection of facial texture features as an example, a feature extraction algorithm is used to detect feature objects on the face, as introduced below:
detecting local texture features of each face key point in the face image;
the local texture features of the facial key points are respectively reduced in dimension to obtain low-dimensional texture features, and the similarity between each low-dimensional texture feature and the corresponding facial-feature texture feature in the identity-card picture is calculated based on sparse texture reconstruction. If the similarity between a low-dimensional texture feature and the corresponding facial-feature texture feature in the identity-card picture is higher than a first similarity, it is determined that the current facial texture features of the real-time user meet the configuration conditions of the shooting parameters; if the similarity is lower than a second similarity, it is determined that the current facial texture features of the real-time user do not meet the configuration conditions of the shooting parameters.
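A sketch of the two-threshold similarity test. Cosine similarity over already-reduced features stands in for the sparse-reconstruction-based measure, and the first/second similarity thresholds are assumed values:

```python
import numpy as np

def texture_condition(live_feat, id_card_feat, first_sim=0.8, second_sim=0.5):
    """Compare low-dimensional texture features against the ID-card reference.
    Returns True (meets the condition), False (does not), or None (undecided,
    i.e. the similarity falls between the two thresholds)."""
    a = np.asarray(live_feat, dtype=float)
    b = np.asarray(id_card_feat, dtype=float)
    sim = float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))  # cosine similarity
    if sim >= first_sim:
        return True
    if sim < second_sim:
        return False
    return None
```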
(2) If the face contains a wearing article, judge, according to the first configuration parameters, whether the wearing article meets the configuration conditions of wearing articles defined in the first configuration parameters; if a makeup trace exists on the face, judge, according to the first configuration parameters, whether the makeup trace meets the configuration conditions of facial texture features defined in the first configuration parameters.
For example, if the driving license shooting mode is an identity card, and the human face is identified to contain glasses, the configuration condition of the wearing object is considered not to be met; if the driving license shooting mode is a driving license photo, and the fact that the face of the person contains the glasses is recognized, the person is considered to be in accordance with the configuration conditions of the wearing object. Because the driving license photo requires that the myopia person must wear the glasses, the configuration parameters corresponding to the driving license photo are different from the configuration parameters corresponding to the identity card photo.
For another example, if the driving-license shooting mode is an identity-card photo and the face is identified to contain makeup traces, the configuration condition of the facial texture features is considered not to be met; if the driving-license shooting mode is a social-security-card photo and the face is identified to contain only makeup traces below a certain makeup level, the face is determined to meet the configuration condition of the facial texture features. Different photo types thus correspond to different configuration parameters: the configuration parameters corresponding to the identity-card photo are different from those corresponding to the social-security-card photo.
Optionally, in some embodiments, if the driving-license photographing mode is a driving photo and it is recognized that the wearing article includes a light-colored jacket, the configuration condition of the wearing article is considered not to be met; after the configuration condition is considered not to be met, one of the following processes may further be performed:
(1) Send a fifth prompt to prompt the user to change into dark-colored clothes;
start timing from the sending of the fifth prompt information, and detect the picture in real time;
after the timing is started, identify the color of the jacket in the picture using an image recognition technology, and judge whether the color of the jacket accords with the first configuration parameters;
and if the timed duration exceeds a preset duration and the color of the jacket in the picture still does not accord with the first configuration parameters, end the shooting operation (a sketch of this timed check is given after alternative (2) below).
(2) Displaying a virtual clothes icon in the area where the picture is located;
sending a sixth prompt to prompt the user to change the virtual clothes;
detecting an input of the user for the virtual clothes icon, and overlaying the virtual clothes selected by the input onto the picture so that the virtual clothes are worn at a specified position in the picture (for example, fitted below the user's face image);
and when the virtual clothes are detected to be worn to the designated position, determining that the face image in the current picture conforms to the configuration condition of the wearing article.
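A sketch of the timed jacket-color check in alternative (1). Treating the region below the face as the jacket and testing its mean luminance is a simplifying assumption, as are the timeout and the darkness threshold:

```python
import time
import cv2

def wait_for_dark_jacket(capture, face_bottom_y, timeout_s=30.0, luma_max=90.0):
    """Poll frames until the jacket region (below the face) is dark enough, or end
    the shooting operation once the timer exceeds the preset duration."""
    start = time.monotonic()
    while time.monotonic() - start < timeout_s:     # timing starts with the fifth prompt
        ok, frame = capture.read()                  # e.g. a cv2.VideoCapture frame
        if not ok:
            continue
        jacket = frame[face_bottom_y:, :]           # assumed jacket region under the face
        luma = cv2.cvtColor(jacket, cv2.COLOR_BGR2GRAY).mean()
        if luma <= luma_max:                        # dark-colored coat detected
            return True
    return False                                    # timed out: end the shooting operation
```

And a sketch of alternative (2), overlaying the selected virtual clothes at the designated position by alpha blending; the BGRA sprite format and the anchor coordinates are assumptions:

```python
import numpy as np

def overlay_virtual_clothes(frame_bgr, clothes_bgra, x, y):
    """Alpha-blend a BGRA clothes sprite onto the frame with its top-left at (x, y)."""
    h, w = clothes_bgra.shape[:2]
    roi = frame_bgr[y:y + h, x:x + w].astype(np.float32)
    alpha = clothes_bgra[:, :, 3:4].astype(np.float32) / 255.0   # per-pixel opacity
    sprite = clothes_bgra[:, :, :3].astype(np.float32)
    frame_bgr[y:y + h, x:x + w] = (alpha * sprite + (1.0 - alpha) * roi).astype(np.uint8)
    return frame_bgr
```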
Technical features mentioned in any embodiment or implementation corresponding to fig. 1 to 3 above are also applicable to the embodiments corresponding to fig. 4 and 5 of the present application, and similar parts will not be described again.
The method for shooting the driving license based on artificial intelligence in the present application is explained above, and a device for executing the method is described below.
The device 40 in the embodiment of the present application can implement the steps of the method for shooting a driving license based on artificial intelligence performed in the embodiment corresponding to fig. 1. The functions implemented by the device 40 can be implemented by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules corresponding to the above functions, and the modules can be software and/or hardware. The device 40 can include a detection module 401, a transceiver module 402 and a processing module 403; for the functions implemented by the detection module 401, the transceiver module 402 and the processing module 403, reference can be made to the operations performed in the embodiment corresponding to fig. 1, which are not repeated here. For example, the processing module 403 can be used to control the transceiving operation of the transceiver module 402 and the detection operation of the detection module 401.
In some embodiments, the detection module 401 can be used to detect that a user logs in to the traffic-police system at a terminal;
the transceiver module 402 may be configured to receive an instruction that the user selects a driving-license shooting mode at the terminal, where the driving-license shooting mode corresponds to a group of configuration parameters of image features;
the processing module 403 may be configured to: start the camera of the terminal according to the instruction detected by the transceiver module 402; detect, through the detection module 401, an instruction of the user to start the driving-license shooting mode; acquire a picture shot by the camera according to first configuration parameters corresponding to the driving-license shooting mode; recognize a face in the picture shot by the camera according to a face recognition technology, and display a positioning frame in the picture to locate the position of the face in the picture, wherein the positioning frame is used for locating a detection object matched with a preset shooting object in the picture; analyze the face in the picture, and judge whether the face in the picture meets the conditions of the first configuration parameters; cut the face image according to the size set in the first configuration parameters to obtain a face image of a preset size; adjust the background color of the face image according to the background color set in the first configuration parameters so that the face image of the preset size meets the background-color requirement; compress the face image of the preset size according to the pixels set in the first configuration parameters to obtain a target driving-license image of a preset resolution; jump to a payment page and prompt the user to pay for the shooting service on the payment page; and generate and store an electronic receipt of the driving-license photo after detecting that the user has completed payment.
In the embodiment of the application, after the shooting device is started, the processing module 403 calls the corresponding first configuration parameters according to the selected driving-license shooting mode, checks and adjusts the face in the picture against the first configuration parameters, and processes the face image shot by the camera to obtain the driving-license photo, so that the user can complete the driving-license photo by self-service.
In some embodiments, after recognizing the face in the picture captured by the camera and before analyzing the face in the picture, the processing module 403 is further configured to:
carrying out edge processing on the image in the picture shot by the camera by utilizing a canny edge detection algorithm to obtain an edge binary image;
counting the number of edge points and non-edge points in the edge binary image, and calculating the ratio of the number of the edge points to the sum of the number of the edge points and the number of the non-edge points;
if the calculated ratio is larger than a set threshold value, determining that an image in a picture shot by the camera is a clear human face image;
processing the face image, and positioning face key points to obtain key point data;
and extracting statistical characteristics according to the key points of the face, dividing the positioned face image into a plurality of characteristic regions, and detecting an eye region, an eyebrow region, a nose region, a mouth region and an ear region from each characteristic region according to a preset image detection algorithm.
In some embodiments, after recognizing the face in the picture captured by the camera and before analyzing the face in the picture, the processing module 403 is further configured to:
after the detection module 401 locates the position of the face in the picture, a face detection framework of a Multi-task description operator is adopted to detect the face characteristic points of the face in the picture, the face in the picture is compared with a preset face image, and if the face in the picture is determined to be incomplete through comparison, the coordinates of each pixel point on the face contour in the picture and the face missing part are detected;
calculating the coordinate difference between the coordinates of each pixel point on the face contour in the picture and the picture;
determining a moving direction according to the coordinate difference between the coordinates of the missing part of the human face and the coordinates of each pixel point on the contour of the human face in the picture and the picture, wherein the moving direction comprises a direction in which a terminal is to move or a direction in which the human face is to move;
sending first prompt information indicating the moving direction through the transceiver module 402, wherein the first prompt information is used for instructing the user to move the face in the picture until the face in the picture is completely displayed, or for instructing the user to move relative to the terminal until the face in the picture is completely displayed;
detecting the face in the picture in real time through the detection module 401, matching the face detected in real time with the preset face image, and sending second prompt information through the transceiver module 402 when the detection module detects that the face in the picture is completely displayed.
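A hedged sketch of the move-direction logic above: the face bounding box is compared against the picture borders, and the prompt direction is derived from whichever border clips the face. The margin value and the prompt wording are assumptions.

def move_direction(face_box, frame_w, frame_h, margin=5):
    # Returns the direction the face (or terminal) should move, or None
    # when the face is fully displayed in the picture.
    x, y, w, h = face_box
    if x <= margin:
        return "move right"  # face clipped at the left border
    if x + w >= frame_w - margin:
        return "move left"   # face clipped at the right border
    if y <= margin:
        return "move down"   # face clipped at the top border
    if y + h >= frame_h - margin:
        return "move up"     # face clipped at the bottom border
    return None              # fully displayed: send the second prompt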
In some embodiments, the detection module 401 is specifically configured to:
classifying the faces in the picture by adopting the following formula:
$$L_i^{det} = -\left(y_i^{det}\log(p_i) + \left(1 - y_i^{det}\right)\log\left(1 - p_i\right)\right)$$

wherein $L_i^{det}$ is the cross-entropy loss function for face classification, $p_i$ is the probability that the sample is a face, and $y_i^{det} \in \{0, 1\}$ is the real label of the background;

the bounding box regression is implemented using the following formula:

$$L_i^{box} = \left\lVert \hat{y}_i^{box} - y_i^{box} \right\rVert_2^2$$

wherein $L_i^{box}$ is the regression loss calculated by the Euclidean distance, $\hat{y}_i^{box}$ is the result predicted by the network, and $y_i^{box}$ is the actual real bounding box coordinates;

the coordinates of each pixel point on the face contour in the picture and the missing part of the face are detected using the following formula:

$$L_i^{landmark} = \left\lVert \hat{y}_i^{landmark} - y_i^{landmark} \right\rVert_2^2$$

wherein $L_i^{landmark}$ is the regression loss calculated as the Euclidean distance between the predicted pixel coordinates and the actual pixel coordinates, which is minimized; $\hat{y}_i^{landmark}$ is the result predicted by the network, and $y_i^{landmark}$ is the actual real landmark coordinates.
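For concreteness, a small NumPy sketch of the three MTCNN-style losses as reconstructed above; array shapes, the function names, and the numerical guard are illustrative assumptions.

import numpy as np

def det_loss(p, y_det):
    # Cross-entropy face/background classification loss L_i^det.
    eps = 1e-12  # numerical guard (assumption)
    return -(y_det * np.log(p + eps) + (1 - y_det) * np.log(1 - p + eps))

def box_loss(y_hat_box, y_box):
    # Squared Euclidean regression loss L_i^box for bounding-box coordinates.
    return float(np.sum((np.asarray(y_hat_box) - np.asarray(y_box)) ** 2))

def landmark_loss(y_hat_lm, y_lm):
    # Squared Euclidean regression loss L_i^landmark for landmark coordinates.
    return float(np.sum((np.asarray(y_hat_lm) - np.asarray(y_lm)) ** 2))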
Further, in some embodiments, the processing module 403 is specifically configured to:
detecting the face in the picture through a face recognition technology, and recognizing and locating the key features on the face, wherein the key features comprise feature points corresponding to the eyes, cheekbones, nose, mouth, chin and the like;
calculating the distances between the feature points and the proportional relations among the distances;

judging whether the face angle meets the face-righting condition according to the distances between the feature points and the proportional relations among the distances;

if the distances between the feature points and the proportional relations among the distances are determined to be within the normal range, the face is considered to meet the face-righting condition; if the distances between some or all of the feature points, or the proportional relations among the distances, are determined not to be within the normal range, it is determined that the face angle does not meet the face-righting condition.
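A minimal sketch of the face-righting test under stated assumptions: given eye and nose-tip landmarks, the left and right eye-to-nose distances are compared, and a ratio near 1.0 suggests a frontal pose. The landmark choice and the 0.8 tolerance stand in for the unspecified "normal range".

import numpy as np

def is_frontal(left_eye, right_eye, nose_tip, tolerance=0.8):
    # Compare left and right eye-to-nose distances; a ratio near 1.0
    # suggests a frontal (face-righting) pose.
    d_left = np.linalg.norm(np.asarray(left_eye, float) - np.asarray(nose_tip, float))
    d_right = np.linalg.norm(np.asarray(right_eye, float) - np.asarray(nose_tip, float))
    ratio = min(d_left, d_right) / max(d_left, d_right)
    return ratio >= tolerance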
Further, in some embodiments, the processing module 403 is specifically configured to:
judging whether a wearing article and/or facial texture features exist on the human face, wherein the wearing article comprises at least one of a hat, glasses and clothing, and the facial texture features refer to makeup traces;

if it is judged that the human face contains a wearing article, judging, according to the first configuration parameters, whether the wearing article meets the configuration condition of wearing articles defined in the first configuration parameters;

if it is judged that a makeup trace exists on the face, judging, according to the first configuration parameters, whether the makeup trace meets the configuration condition of facial texture features defined in the first configuration parameters.
In some embodiments, if the shooting mode is the driving license shooting mode and the wearing article is recognized to include a light-colored jacket, the configuration condition of the wearing article is considered not to be met, and after the configuration condition of the wearing article is considered not to be met, one of the following operations is performed:
sending a fifth prompt through the transceiver module 402, wherein the fifth prompt is used for prompting the user to change into dark-colored clothes; starting timing from the sending of the fifth prompt, detecting the picture in real time, identifying the color of the jacket in the picture by an image recognition technology, and judging whether the color of the jacket meets the first configuration parameters; and if the timed duration exceeds a preset duration and the detection module still detects that the color of the jacket in the picture does not meet the first configuration parameters, ending the shooting operation;
or displaying a virtual clothes icon in the area where the picture is located; sending a sixth prompt through the transceiver module 402, the sixth prompt being used for prompting the user to change into the virtual clothes; detecting, through the detection module 401, the user's input on the virtual clothes icon, and overlaying the virtual clothes selected by the input onto the picture so that the virtual clothes are worn at a specified position in the picture; and when the virtual clothes are detected to be worn at the specified position, determining that the face image in the current picture meets the configuration condition of the wearing article.
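As a sketch of the jacket-color check described above, the region below the detected face can be sampled and judged "light-colored" when its mean brightness exceeds a threshold. The torso-region layout and the 160 brightness threshold are assumptions, not values from the patent.

import cv2
import numpy as np

def jacket_is_light(img, face_box, threshold=160):
    # Sample the torso region below the detected face and call the jacket
    # "light-colored" when its mean brightness exceeds the threshold.
    x, y, w, h = face_box
    y0 = min(y + h, img.shape[0] - 1)
    y1 = min(y0 + h, img.shape[0])
    x0 = max(x - w // 2, 0)
    x1 = min(x + w + w // 2, img.shape[1])
    torso = img[y0:y1, x0:x1]
    if torso.size == 0:
        return False  # no visible torso region to judge
    gray = cv2.cvtColor(torso, cv2.COLOR_BGR2GRAY)
    return float(gray.mean()) > threshold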
The apparatus 40 for self-service driving license shooting in the embodiment of the present application has been described above from the perspective of modular functional entities; a computer device is described below from the perspective of hardware. As shown in fig. 5, the computer device includes a processor, a memory, a transceiver (which may also be an input-output unit, not labeled in fig. 5), and a computer program stored in the memory and executable on the processor.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the computer device and connects the various parts of the whole computer device through various interfaces and lines.
The memory may include a program storage area and a data storage area, wherein the program storage area may store an operating system and application programs required for at least one function (such as a sound playing function and an image playing function), and the data storage area may store data created according to the use of the device (such as audio data and video data). In addition, the memory may include a high-speed random access memory, and may further include a non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The transceiver may also be replaced by a receiver and a transmitter, which may be the same or different physical entities. When they are the same physical entity, they may be collectively referred to as a transceiver. The transceiver may be an input-output unit.
The memory may be integrated in the processor or may be provided separately from the processor.
Based on such understanding, the technical solution of the present application, or the part contributing to the prior art, may be embodied in the form of a software product, which is stored in a storage medium (such as a ROM/RAM) and includes several instructions for causing a terminal (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the drawings, but the present application is not limited to the above-mentioned embodiments, which are illustrative rather than restrictive. Those skilled in the art can make many changes and modifications without departing from the spirit of the present application and the protection scope of the claims, and all such changes and modifications fall within the protection of the present application.

Claims (10)

1. A method for shooting a driving license photo based on artificial intelligence, characterized in that the method comprises:
detecting that a user logs in to the traffic police system at a terminal, and, after receiving an instruction of the user selecting a driving license shooting mode at the terminal, starting a shooting device of the terminal according to the instruction, wherein the driving license shooting mode corresponds to a group of configuration parameters of image characteristics;
detecting an instruction of the user to start the driving license shooting mode, and acquiring a picture shot by the shooting device according to first configuration parameters corresponding to the driving license shooting mode;
recognizing, according to a face recognition technology, a face in the picture shot by the camera, and displaying a positioning frame in the picture to locate the position of the face in the picture, wherein the positioning frame is used for locating a detection object matching a preset shooting object in the picture;
analyzing the face in the picture, and judging whether the face in the picture meets the condition of the first configuration parameters;
cropping the face image according to the size set in the first configuration parameters to obtain a face image of a preset size; adjusting the background color of the face image according to the background color set in the first configuration parameters so that the face image of the preset size meets the background color requirement; and compressing the face image of the preset size according to the pixels set in the first configuration parameters to obtain a target driving license image of a preset resolution;
after the target driving license image is generated, jumping to a payment page, and prompting the user on the payment page to pay for the shooting service;
and generating and storing an electronic receipt of the driving license photo after detecting that the user has completed the payment.
2. The method of claim 1, wherein after the recognizing the face in the picture captured by the camera and before the analyzing the face in the picture, the method further comprises:
carrying out edge processing on the image in the picture shot by the camera using the Canny edge detection algorithm to obtain an edge binary image;
counting the number of edge points and non-edge points in the edge binary image, and calculating the ratio of the number of the edge points to the sum of the number of the edge points and the number of the non-edge points;
if the calculated ratio is larger than a set threshold value, determining that the image in the picture shot by the camera is a clear human face image;
processing the face image, and positioning face key points to obtain key point data;
and extracting statistical characteristics according to the key points of the face, dividing the positioned face image into a plurality of characteristic regions, and detecting an eye region, an eyebrow region, a nose region, a mouth region and an ear region from each characteristic region according to a preset image detection algorithm.
3. The method of claim 1, wherein after the recognizing the face in the picture captured by the camera and before the analyzing the face in the picture, the method further comprises:
after the position of the face in the picture is located, adopting a face detection framework based on a Multi-task description operator to perform facial feature point detection on the face in the picture, comparing the face in the picture with a preset face image, and, if the comparison determines that the face in the picture is incomplete, detecting the coordinates of each pixel point on the face contour in the picture and the missing part of the face;
calculating the coordinate difference between the coordinates of the missing part of the face and the coordinates of each pixel point on the face contour in the picture;
determining a moving direction according to the coordinate difference, wherein the moving direction comprises a direction in which the terminal is to move or a direction in which the human face is to move;
sending first prompt information indicating the moving direction, wherein the first prompt information is used for instructing the user to move the face in the picture until the face in the picture is completely displayed, or for instructing the user to move relative to the terminal until the face in the picture is completely displayed;
and detecting the face in the picture in real time, matching the face detected in real time with the preset face image, and sending second prompt information after detecting that the face in the picture is completely displayed.
4. The method of claim 3, wherein adopting the face detection framework based on a Multi-task description operator to perform facial feature point detection on the face in the picture comprises:
classifying the faces in the picture by adopting the following formula:
$$L_i^{det} = -\left(y_i^{det}\log(p_i) + \left(1 - y_i^{det}\right)\log\left(1 - p_i\right)\right)$$

wherein $L_i^{det}$ is the cross-entropy loss function for face classification, $p_i$ is the probability that the sample is a face, and $y_i^{det} \in \{0, 1\}$ is the real label of the background;

the bounding box regression is implemented using the following formula:

$$L_i^{box} = \left\lVert \hat{y}_i^{box} - y_i^{box} \right\rVert_2^2$$

wherein $L_i^{box}$ is the regression loss calculated by the Euclidean distance, $\hat{y}_i^{box}$ is the result predicted by the network, and $y_i^{box}$ is the actual real bounding box coordinates;

the coordinates of each pixel point on the face contour in the picture and the missing part of the face are detected using the following formula:

$$L_i^{landmark} = \left\lVert \hat{y}_i^{landmark} - y_i^{landmark} \right\rVert_2^2$$

wherein $L_i^{landmark}$ is the regression loss calculated as the Euclidean distance between the predicted pixel coordinates and the actual pixel coordinates, which is minimized; $\hat{y}_i^{landmark}$ is the result predicted by the network, and $y_i^{landmark}$ is the actual real landmark coordinates.
5. The method according to claim 3 or 4, wherein after the analyzing the face in the picture, the method further comprises:
detecting the face in the picture through a face recognition technology, and recognizing and locating the key features on the face, wherein the key features comprise feature points corresponding to the eyes, cheekbones, nose, mouth, chin and the like;
calculating the distances between the feature points and the proportional relations among the distances;

judging whether the face angle meets the face-righting condition according to the distances between the feature points and the proportional relations among the distances;

if the distances between the feature points and the proportional relations among the distances are determined to be within the normal range, the face is considered to meet the face-righting condition; if the distances between some or all of the feature points, or the proportional relations among the distances, are determined not to be within the normal range, it is determined that the face angle does not meet the face-righting condition.
6. The method according to claim 3 or 4, wherein after the analyzing the face in the picture, the method further comprises:
judging whether a wearing article and/or facial texture features exist on the human face, wherein the wearing article comprises at least one of a hat, glasses and clothing, and the facial texture features refer to makeup traces;

if it is judged that the human face contains a wearing article, judging, according to the first configuration parameters, whether the wearing article meets the configuration condition of wearing articles defined in the first configuration parameters;

if it is judged that a makeup trace exists on the face, judging, according to the first configuration parameters, whether the makeup trace meets the configuration condition of facial texture features defined in the first configuration parameters.
7. The method of claim 6, wherein, if the shooting mode is the driving license shooting mode and the wearing article is recognized to include a light-colored jacket, the configuration condition of the wearing article is considered not to be met, and after the configuration condition of the wearing article is considered not to be met, one of the following operations is further performed:
sending a fifth prompt, wherein the fifth prompt is used for prompting the user to change into dark-colored clothes; starting timing from the sending of the fifth prompt, detecting the picture in real time, identifying the color of the jacket in the picture by an image recognition technology, and judging whether the color of the jacket meets the first configuration parameters; and if the timed duration exceeds a preset duration and the color of the jacket in the picture still does not meet the first configuration parameters, ending the shooting operation;
or displaying a virtual clothes icon in the area where the picture is located; sending a sixth prompt, wherein the sixth prompt is used for prompting the user to change into the virtual clothes; detecting the user's input on the virtual clothes icon, and overlaying the virtual clothes selected by the input onto the picture so that the virtual clothes are worn at a specified position in the picture; and when the virtual clothes are detected to be worn at the specified position, determining that the face image in the current picture meets the configuration condition of the wearing article.
8. A device for self-service shooting of a driving license, characterized in that the device comprises:
the detection module is used for detecting that a user logs in to the traffic police system at a terminal;
the transceiver module is used for receiving an instruction of the user selecting a driving license shooting mode at the terminal, wherein the driving license shooting mode corresponds to a group of configuration parameters of image characteristics;
the processing module is used for: starting a shooting device of the terminal according to the instruction detected by the transceiver module; detecting, through the detection module, an instruction of the user to start the driving license shooting mode; acquiring a picture shot by the shooting device according to first configuration parameters corresponding to the driving license shooting mode; recognizing, according to a face recognition technology, a face in the picture shot by the camera, and displaying a positioning frame in the picture to locate the position of the face in the picture, wherein the positioning frame is used for locating a detection object matching a preset shooting object in the picture; analyzing the face in the picture, and judging whether the face in the picture meets the condition of the first configuration parameters; cropping the face image according to the size set in the first configuration parameters to obtain a face image of a preset size; adjusting the background color of the face image according to the background color set in the first configuration parameters so that the face image of the preset size meets the background color requirement; compressing the face image of the preset size according to the pixels set in the first configuration parameters to obtain a target driving license image of a preset resolution; after the target driving license image is generated, jumping to a payment page and prompting the user on the payment page to pay for the shooting service; and generating and storing an electronic receipt of the driving license photo after detecting that the user has completed the payment.
9. A computer device, characterized in that the device comprises:
at least one processor, a memory, and a transceiver;
wherein the memory is configured to store program code, and the processor is configured to invoke the program code stored in the memory to perform the method according to any one of claims 1 to 7.
10. A computer storage medium, characterized in that the computer storage medium comprises instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 7.
CN201910846126.3A 2019-09-09 2019-09-09 Method, device and equipment for shooting driving license based on artificial intelligence and storage medium Pending CN110738607A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910846126.3A CN110738607A (en) 2019-09-09 2019-09-09 Method, device and equipment for shooting driving license based on artificial intelligence and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910846126.3A CN110738607A (en) 2019-09-09 2019-09-09 Method, device and equipment for shooting driving license based on artificial intelligence and storage medium

Publications (1)

Publication Number Publication Date
CN110738607A true CN110738607A (en) 2020-01-31

Family

ID=69267488

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910846126.3A Pending CN110738607A (en) 2019-09-09 2019-09-09 Method, device and equipment for shooting driving license based on artificial intelligence and storage medium

Country Status (1)

Country Link
CN (1) CN110738607A (en)



Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102801857A (en) * 2012-07-30 2012-11-28 无锡智感星际科技有限公司 Smart phone photographing guiding method based on image matching
CN104239843A (en) * 2013-06-07 2014-12-24 浙江大华技术股份有限公司 Positioning method and device for face feature points
CN105160312A (en) * 2015-08-27 2015-12-16 南京信息工程大学 Recommendation method for star face make up based on facial similarity match
CN106210338A (en) * 2016-07-25 2016-12-07 乐视控股(北京)有限公司 The generation method and device of certificate photograph
WO2019023861A1 (en) * 2017-07-31 2019-02-07 深圳传音通讯有限公司 Smart terminal-based id photo generation method and generation system
CN108564052A (en) * 2018-04-24 2018-09-21 南京邮电大学 Multi-cam dynamic human face recognition system based on MTCNN and method
CN109165566A (en) * 2018-08-01 2019-01-08 中国计量大学 A kind of recognition of face convolutional neural networks training method based on novel loss function
CN108986342A (en) * 2018-08-06 2018-12-11 深圳大图科创技术开发有限公司 A kind of recognition of face locker system based on cloud computing platform
CN109583348A (en) * 2018-11-22 2019-04-05 阿里巴巴集团控股有限公司 A kind of face identification method, device, equipment and system
CN109711384A (en) * 2019-01-09 2019-05-03 江苏星云网格信息技术有限公司 A kind of face identification method based on depth convolutional neural networks
CN110163114A (en) * 2019-04-25 2019-08-23 厦门瑞为信息技术有限公司 A kind of facial angle and face method for analyzing ambiguity, system and computer equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wen Shaobo et al., "New Energy Vehicles and Their Intelligent Technologies", 30 September 2017 *
Wang Wenfeng et al., "MATLAB Computer Vision and Machine Cognition", 31 August 2017 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114008616A (en) * 2020-02-04 2022-02-01 格步计程车控股私人有限公司 Method, server and communication system for authenticating a user for transportation purposes
CN111491106A (en) * 2020-05-15 2020-08-04 Oppo(重庆)智能科技有限公司 Shot image processing method and device, mobile terminal and storage medium
CN111491106B (en) * 2020-05-15 2021-12-14 Oppo(重庆)智能科技有限公司 Shot image processing method and device, mobile terminal and storage medium
CN111737673A (en) * 2020-06-22 2020-10-02 支付宝(杭州)信息技术有限公司 Online identity verification method and device and user terminal

Similar Documents

Publication Publication Date Title
WO2021027537A1 (en) Method and apparatus for taking identification photo, device and storage medium
CN108229369B (en) Image shooting method and device, storage medium and electronic equipment
US10395385B2 (en) Using object re-identification in video surveillance
CN109952594B (en) Image processing method, device, terminal and storage medium
CN109697416B (en) Video data processing method and related device
US10534957B2 (en) Eyeball movement analysis method and device, and storage medium
Yang et al. Real-time clothing recognition in surveillance videos
CN109325933A (en) A kind of reproduction image-recognizing method and device
US8983202B2 (en) Smile detection systems and methods
CN110852310B (en) Three-dimensional face recognition method and device, terminal equipment and computer readable medium
WO2019061658A1 (en) Method and device for positioning eyeglass, and storage medium
US11676390B2 (en) Machine-learning model, methods and systems for removal of unwanted people from photographs
US20160092726A1 (en) Using gestures to train hand detection in ego-centric video
CN109299658B (en) Face detection method, face image rendering device and storage medium
CN110738607A (en) Method, device and equipment for shooting driving license based on artificial intelligence and storage medium
KR20160066380A (en) Method and apparatus for registering face, method and apparatus for recognizing face
CN110781770B (en) Living body detection method, device and equipment based on face recognition
CN111967319B (en) Living body detection method, device, equipment and storage medium based on infrared and visible light
US11315360B2 (en) Live facial recognition system and method
CN110838119A (en) Human face image quality evaluation method, computer device and computer readable storage medium
CN112446254A (en) Face tracking method and related device
CN109726613B (en) Method and device for detection
CN112347988A (en) Mask recognition model training method and device, computer equipment and readable storage medium
CN114758384A (en) Face detection method, device, equipment and storage medium
CN112287769B (en) Face detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200131