CN111476225A - In-vehicle human face identification method, device, equipment and medium based on artificial intelligence - Google Patents


Info

Publication number
CN111476225A
Authority
CN
China
Prior art keywords: anchor box, face recognition, feature map, image, outputting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010593969.XA
Other languages
Chinese (zh)
Other versions
CN111476225B (en)
Inventor
吴晓东
Current Assignee
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd
Priority to CN202010593969.XA
Publication of CN111476225A
Application granted
Publication of CN111476225B
Legal status: Active

Classifications

    • G06V40/161 Human faces: Detection; Localisation; Normalisation
    • G06F18/23213 Pattern recognition, clustering techniques: K-means clustering
    • G06N3/045 Neural networks: Combinations of networks
    • G06N3/08 Neural networks: Learning methods
    • G06V20/59 Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V40/168 Human faces: Feature extraction; Face representation
    • G06V40/172 Human faces: Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention relates to the technical fields of artificial intelligence and face recognition, and provides an artificial-intelligence-based in-vehicle face recognition method, device, equipment and medium. A channel attention mechanism is introduced to further screen the learned feature maps, which alleviates the feature redundancy problem, focuses attention on the designated region, and effectively improves the accuracy of in-vehicle face recognition. By densifying and extending the anchor boxes and introducing anchor_attention, the problem of insufficient anchor coverage is effectively resolved, and the markedly enlarged anchor coverage effectively improves the recall of in-vehicle face recognition. In-vehicle faces are thus recognized automatically, with high accuracy, by means of artificial intelligence. The invention also relates to blockchain technology: the obtained face recognition result can be stored on a blockchain. The invention can further be applied to smart-traffic scenarios, thereby promoting the construction of smart cities.

Description

In-vehicle human face identification method, device, equipment and medium based on artificial intelligence
Technical Field
The invention relates to the technical field of artificial intelligence and face recognition, in particular to an in-vehicle face recognition method, an in-vehicle face recognition device, in-vehicle face recognition equipment and an in-vehicle face recognition medium based on artificial intelligence.
Background
Driven by demands from many industries, face recognition technology is being applied ever more widely, and the accuracy required for recognizing faces inside vehicles at traffic intersections keeps rising.
The traditional FaceBoxes algorithm offers both high detection speed and high precision, which has made it one of the popular face recognition algorithms in industry today.
However, although the FaceBoxes algorithm achieves good face recognition accuracy, it still suffers from feature redundancy and insufficient anchor coverage, and these defects are especially pronounced when FaceBoxes recognizes faces in occluded environments such as the interior of a vehicle.
Disclosure of Invention
In view of the above, it is necessary to provide an artificial-intelligence-based in-vehicle face recognition method, apparatus, device and medium that introduce a channel attention mechanism to further screen the learned feature maps, thereby alleviating the feature redundancy problem, focusing attention on the designated region, and effectively improving the accuracy of in-vehicle face recognition; that effectively resolve the problem of insufficient anchor coverage by densifying and extending the anchor boxes and introducing anchor_attention; and that, owing to the markedly enlarged anchor coverage, effectively improve the recall of in-vehicle face recognition, thereby recognizing in-vehicle faces automatically and with high accuracy by means of artificial intelligence.
An artificial-intelligence-based in-vehicle face recognition method comprises the following steps:
in response to a received image to be recognized, resizing the image to be recognized to obtain a target image;
inputting the target image into a convolution pooling layer for feature extraction, and outputting shallow features of the target image;
inputting the shallow features into a residual connection layer for feature extraction, and outputting deep features of the target image;
inputting the deep features into a channel attention layer for importance screening, and outputting a first feature map;
inputting the first feature map into a first convolution layer for a convolution operation, and outputting a second feature map;
inputting the second feature map into a second convolution layer for a convolution operation, and outputting a third feature map;
obtaining anchor boxes produced by pre-clustering, and matching the obtained anchor boxes with the first, second and third feature maps by scale to obtain the anchor boxes of each feature map;
densifying and extending the anchor boxes of each feature map to obtain the initial anchor boxes of each feature map;
screening the initial anchor boxes of each feature map for importance with an anchor_attention layer to obtain the target anchor boxes of each feature map;
performing face recognition on each corresponding feature map with its target anchor boxes, outputting the anchor box coordinates corresponding to each feature map and the score of each set of anchor box coordinates, and taking the highest-scoring anchor box coordinates as the predicted face coordinates;
and mapping the predicted face coordinates onto the image to be recognized to obtain the face recognition result of the image to be recognized, and storing the face recognition result on a blockchain.
According to a preferred embodiment of the present invention, inputting the shallow features into the residual connection layer for feature extraction and outputting the deep features of the target image comprises:
performing convolution and pooling operations on the shallow features, and outputting at least one operation result;
and concatenating the at least one operation result, and outputting the deep features of the target image.
According to a preferred embodiment of the present invention, inputting the deep features into the channel attention layer for importance screening and outputting the first feature map comprises:
inputting the deep features into a global max pooling layer for compression, and outputting a first compressed feature of the deep features;
inputting the first compressed feature into a low-dimensional fully connected layer for compression, and outputting a second compressed feature;
inputting the second compressed feature into a fully connected layer of the original high dimension for restoration, and outputting an attention vector;
and performing a channel-wise dot multiplication of the attention vector with the deep features, and outputting the first feature map.
According to a preferred embodiment of the present invention, before the pre-clustered anchor boxes are obtained, the artificial-intelligence-based in-vehicle face recognition method further comprises:
acquiring sample data;
randomly selecting a specified number of points from the sample data as initial sample points;
clustering the sample data with the K-means algorithm to obtain the specified number of clusters;
and calculating the center-point coordinates of each cluster as the pre-clustered anchor boxes.
According to a preferred embodiment of the present invention, screening the initial anchor boxes of each feature map for importance with the anchor_attention layer to obtain the target anchor boxes of each feature map comprises:
calculating a score for each initial anchor box of each feature map based on an attention mechanism;
sorting the scores of each feature map's initial anchor boxes from high to low;
and taking the initial anchor boxes whose scores rank within a preset top number as the target anchor boxes of each feature map.
According to a preferred embodiment of the present invention, mapping the predicted face coordinates onto the image to be recognized to obtain the face recognition result of the image to be recognized comprises:
determining an offset;
converting the face coordinates according to the offset to obtain converted coordinates;
determining a first scale of the image to be recognized and a second scale of the feature map corresponding to the face coordinates;
calculating the quotient of the first scale and the second scale as a coefficient;
and multiplying the coefficient by the converted coordinates to obtain the position of the face coordinates on the image to be recognized, yielding the face recognition result.
According to the preferred embodiment of the present invention, the method for recognizing human faces in a vehicle based on artificial intelligence further comprises:
in response to a received detection instruction, determining the terminal corresponding to the detection instruction;
and sending the recognition result to the terminal.
An artificial-intelligence-based in-vehicle face recognition device is provided, the device comprising:
a processing unit, configured to resize a received image to be recognized to obtain a target image;
an extraction unit, configured to input the target image into a convolution pooling layer for feature extraction and output shallow features of the target image;
the extraction unit being further configured to input the shallow features into a residual connection layer for feature extraction and output deep features of the target image;
a screening unit, configured to input the deep features into a channel attention layer for importance screening and output a first feature map;
an operation unit, configured to input the first feature map into a first convolution layer for a convolution operation and output a second feature map;
the operation unit being further configured to input the second feature map into a second convolution layer for a convolution operation and output a third feature map;
a matching unit, configured to obtain the anchor boxes produced by pre-clustering and match them with the first, second and third feature maps by scale to obtain the anchor boxes of each feature map;
an extension unit, configured to densify and extend the anchor boxes of each feature map to obtain the initial anchor boxes of each feature map;
the screening unit being further configured to screen the initial anchor boxes of each feature map for importance with an anchor_attention layer to obtain the target anchor boxes of each feature map;
a recognition unit, configured to perform face recognition on each corresponding feature map with its target anchor boxes, output the anchor box coordinates corresponding to each feature map and the score of each set of anchor box coordinates, and take the highest-scoring anchor box coordinates as the predicted face coordinates;
and a mapping unit, configured to map the predicted face coordinates onto the image to be recognized to obtain the face recognition result of the image to be recognized, and store the face recognition result on a blockchain.
An electronic device, the electronic device comprising:
a memory storing at least one instruction; and
and a processor that executes the instructions stored in the memory to implement the artificial-intelligence-based in-vehicle face recognition method.
A computer-readable storage medium having stored therein at least one instruction for execution by a processor in an electronic device to implement the artificial intelligence based in-vehicle face recognition method.
According to the technical solution above, in response to a received image to be recognized, the image is resized to obtain a target image; the target image is input into a convolution pooling layer for feature extraction and shallow features of the target image are output; the shallow features are input into a residual connection layer for feature extraction and deep features of the target image are output; the deep features are then input into a channel attention layer for importance screening and a first feature map is output. Introducing a channel attention mechanism to further screen the learned first feature map alleviates the feature redundancy problem, focuses attention on the designated region, and effectively improves the accuracy of in-vehicle face recognition. The first feature map is input into a first convolution layer for a convolution operation and a second feature map is output; the second feature map is input into a second convolution layer for a convolution operation and a third feature map is output. Anchor boxes produced by pre-clustering are obtained and matched with the first, second and third feature maps by scale to obtain the anchor boxes of each feature map; these are densified and extended to obtain the initial anchor boxes of each feature map, which an anchor_attention layer then screens for importance to obtain the target anchor boxes of each feature map. Densifying and extending the anchor boxes and introducing anchor_attention effectively resolve the problem of insufficient anchor coverage, and the markedly enlarged anchor coverage effectively improves the recall of in-vehicle face recognition. Finally, face recognition is performed on each corresponding feature map with its target anchor boxes; the anchor box coordinates corresponding to each feature map and the score of each set of coordinates are output; the highest-scoring anchor box coordinates are taken as the predicted face coordinates and mapped onto the image to be recognized to obtain the face recognition result of the image to be recognized, which is stored on a blockchain. In-vehicle faces are thus recognized automatically, with high accuracy, by means of artificial intelligence.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the method for recognizing human faces in a vehicle based on artificial intelligence of the present invention.
Fig. 2 is a functional block diagram of a preferred embodiment of the artificial intelligence based in-vehicle face recognition apparatus of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the present invention for implementing an artificial intelligence-based in-vehicle face recognition method.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flow chart of a preferred embodiment of the method for recognizing human faces in a vehicle based on artificial intelligence according to the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The artificial-intelligence-based in-vehicle face recognition method is applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a user, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive Internet Protocol Television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud computing (cloud computing) based cloud consisting of a large number of hosts or network servers.
The Network where the electronic device is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
S10, responding to the received image to be recognized, and performing resize processing on the image to be recognized to obtain a target image.
The resize processing makes the image to be recognized meet the size required by the model, which facilitates subsequent automatic recognition by the model.
For example, after resizing, the size of the target image may be 1024 × 1024 or 512 × 512, etc., and may be configured according to actual requirements.
And S11, inputting the target image into a convolution pooling layer for feature extraction, and outputting shallow features of the target image.
Shallow features are simple features such as edges and colors.
In this embodiment, two convolution pooling layers may be included.
For example, two different convolution pooling layers conv_pool1 and conv_pool2 may be used: conv_pool1 may be a 7 × 7 convolution followed by 3 × 3 pooling, and conv_pool2 may be a 5 × 5 convolution followed by 3 × 3 pooling.
The structure of the convolution pooling layers may be configured according to actual requirements; the invention is not limited in this respect.
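As a rough sketch of how such a convolution-plus-pooling stage shrinks the spatial size, the standard output-size formula can be applied; the strides and padding below are assumptions for illustration, since the text does not specify them:

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer:
    floor((n - kernel + 2 * padding) / stride) + 1."""
    return (n - kernel + 2 * padding) // stride + 1

# conv_pool1 as assumed here: a 7x7 convolution (stride 4, pad 3)
# followed by 3x3 pooling (stride 2, pad 1).
n = 1024                                          # resized target image, 1024 x 1024
n = out_size(n, kernel=7, stride=4, padding=3)    # after the convolution: 256
n = out_size(n, kernel=3, stride=2, padding=1)    # after the pooling: 128
print(n)  # 128
```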
And S12, inputting the shallow features into a residual connection layer for feature extraction, and outputting the deep features of the target image.
Here, deep features are complex features such as textures.
In this embodiment, inputting the shallow features into the residual connection layer for feature extraction and outputting the deep features of the target image comprises:
performing convolution and pooling operations on the shallow features, and outputting at least one operation result;
and concatenating the at least one operation result, and outputting the deep features of the target image.
For example, when the shallow feature is a 100 × 100 × 3 matrix, it may be passed through a 1 × 1 × 32 convolution to obtain a 100 × 100 × 32 matrix; through a 3 × 3 pooling followed by a 1 × 1 × 32 convolution to obtain another 100 × 100 × 32 matrix; through a 1 × 1 × 24 convolution followed by a 3 × 3 × 32 convolution to obtain a third 100 × 100 × 32 matrix; and through a 1 × 1 × 24 convolution followed by two 3 × 3 × 32 convolutions to obtain a fourth 100 × 100 × 32 matrix. Concatenating the four 100 × 100 × 32 matrices along the channel dimension yields a 100 × 100 × 128 feature matrix.
By repeatedly performing convolution and pooling operations on the shallow features, deeper feature extraction is achieved, yielding the deep features of the target image and improving the accuracy of face recognition.
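The branch-and-splice structure described above can be sketched with stand-in data (random arrays take the place of real convolution and pooling outputs; the shapes follow the example):

```python
import numpy as np

# Four branches of the multi-branch block, each assumed to output a
# 100 x 100 x 32 tensor (H x W x C); random data stands in for the
# actual convolution/pooling results.
branches = [np.random.rand(100, 100, 32) for _ in range(4)]

# Splicing (concatenating) the branches along the channel axis fuses
# them into a single deep feature map.
deep_features = np.concatenate(branches, axis=-1)
print(deep_features.shape)  # (100, 100, 128)
```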
And S13, inputting the deep features into a channel attention layer for importance screening, and outputting a first feature map.
In this embodiment, inputting the deep features into the channel attention layer for importance screening and outputting the first feature map comprises:
inputting the deep features into a global max pooling layer for compression, and outputting a first compressed feature of the deep features;
inputting the first compressed feature into a low-dimensional fully connected layer for compression, and outputting a second compressed feature;
inputting the second compressed feature into a fully connected layer of the original high dimension for restoration, and outputting an attention vector;
and performing a channel-wise dot multiplication of the attention vector with the deep features, and outputting the first feature map.
Through this embodiment, a channel attention mechanism is introduced to further screen the learned features, which alleviates the feature redundancy problem.
Meanwhile, with the channel attention mechanism in place, feature redundancy is effectively reduced and attention is focused on the designated region, such as the face region inside the vehicle rather than other irrelevant regions, effectively improving the accuracy of in-vehicle face recognition.
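A minimal NumPy sketch of such a channel attention layer, with random stand-in weights (a real layer would learn `w_down` and `w_up`; the 128-to-16 bottleneck dimension is an assumption for illustration):

```python
import numpy as np

def channel_attention(features, w_down, w_up):
    """Channel attention as described above: global max pooling squeezes
    each channel to a scalar, a low-dimensional fully connected layer
    compresses, a fully connected layer restores the original dimension,
    and the resulting attention vector reweights the input channels."""
    squeezed = features.max(axis=(0, 1))             # (C,)  global max pooling
    compressed = np.maximum(squeezed @ w_down, 0)    # (C/r,) low-dim FC + ReLU
    restored = compressed @ w_up                     # (C,)  restore dimension
    attention = 1.0 / (1.0 + np.exp(-restored))      # sigmoid -> attention vector
    return features * attention                      # channel-wise multiplication

rng = np.random.default_rng(0)
deep = rng.random((100, 100, 128))                   # stand-in deep features
out = channel_attention(deep, rng.random((128, 16)), rng.random((16, 128)))
print(out.shape)  # (100, 100, 128)
```

Since each attention weight lies in (0, 1), the layer only attenuates channels, leaving important ones closer to their original magnitude.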
S14, inputting the first feature map into the first convolution layer for a convolution operation, and outputting a second feature map.
For example: the first convolution layer may be a 1 x 1 convolution.
After passing through the first convolution layer, the scale of the output second feature map is smaller than that of the first feature map.
And S15, inputting the second feature map into a second convolution layer for convolution operation, and outputting a third feature map.
For example: the second convolution layer may be a 3 x 3 convolution.
After passing through the second convolutional layer, the scale of the output third feature map is smaller than that of the second feature map.
In other words, each convolution operation reduces the scale.
And S16, obtaining the anchor boxes produced by pre-clustering, and matching the obtained anchor boxes with the first feature map, the second feature map and the third feature map by scale to obtain the anchor boxes of each feature map.
In at least one embodiment of the present invention, before the pre-clustered anchor boxes are obtained, the in-vehicle face recognition method further comprises:
acquiring sample data;
randomly selecting a specified number of points from the sample data as initial sample points;
clustering the sample data with the K-means algorithm to obtain the specified number of clusters;
and calculating the center-point coordinates of each cluster as the pre-clustered anchor boxes.
The specified number can be 6, 9, 12, etc.; it is a multiple of 3 so that the feature maps of every scale have the same number of anchor boxes.
Specifically, the distance function that can be used in clustering is: distance = 1 − IoU, where IoU = (intersection area of the two boxes) / (union area of the two boxes).
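A sketch of K-means anchor clustering under this distance function, assuming the samples are (width, height) pairs and that boxes are compared as if sharing a common corner (a common simplification; the text does not fix these details):

```python
import numpy as np

def iou_wh(box, centers):
    """IoU between one (w, h) box and each cluster center, assuming the
    boxes share a common top-left corner."""
    inter = np.minimum(box[0], centers[:, 0]) * np.minimum(box[1], centers[:, 1])
    union = box[0] * box[1] + centers[:, 0] * centers[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=50, seed=0):
    """K-means with distance = 1 - IoU; `boxes` is an (N, 2) array of
    sample (width, height) pairs."""
    rng = np.random.default_rng(seed)
    centers = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # Assign each box to the nearest center under the 1 - IoU distance.
        assign = np.array([np.argmin(1 - iou_wh(b, centers)) for b in boxes])
        for j in range(k):
            members = boxes[assign == j]
            if len(members):
                centers[j] = members.mean(axis=0)  # new center = cluster mean
    return centers

boxes = np.abs(np.random.default_rng(1).normal(50, 20, size=(200, 2))) + 5
anchors = kmeans_anchors(boxes, k=6)
print(anchors.shape)  # (6, 2)
```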
Further, after clustering produces the anchor boxes, the scale of each anchor box is determined and the anchor boxes are sorted by scale and divided into 3 groups from largest to smallest: the largest-scale group is matched with the first feature map, the middle-scale group with the second feature map, and the smallest-scale group with the third feature map.
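The scale-based grouping can be sketched as follows, taking "scale" to mean box area (an assumption) and using illustrative anchor sizes:

```python
import numpy as np

# Pre-clustered anchors as (width, height) pairs; values are illustrative.
anchors = np.array([[128, 128], [96, 112], [64, 64],
                    [48, 56], [32, 32], [16, 24]], dtype=float)

# Sort by scale (area), largest first, then split into 3 equal groups:
# group 0 -> first (largest-scale) feature map, group 2 -> third (smallest).
order = np.argsort(anchors[:, 0] * anchors[:, 1])[::-1]
groups = np.split(anchors[order], 3)
for fm, g in zip(["first", "second", "third"], groups):
    print(fm, g.tolist())
```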
And S17, densifying and extending the anchor boxes of each feature map to obtain the initial anchor boxes of each feature map.
Specifically, the anchor boxes of each feature map may be extended to the right and downward on the feature map, so as to obtain the initial anchor boxes of each feature map.
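A sketch of this rightward and downward densification; the shift stride and the number of extra copies are assumptions for illustration:

```python
import numpy as np

def densify(anchors, stride, n=2):
    """Densify anchors by tiling shifted copies to the right and downward.
    `anchors` is (N, 4) as (x1, y1, x2, y2); `stride` is the assumed shift
    between copies and `n` the number of extra copies per direction."""
    out = []
    for dx in range(n + 1):
        for dy in range(n + 1):
            shift = np.array([dx, dy, dx, dy]) * stride
            out.append(anchors + shift)  # a copy shifted right and/or down
    return np.concatenate(out, axis=0)

base = np.array([[0, 0, 32, 32], [0, 0, 64, 64]], dtype=float)
dense = densify(base, stride=16)     # (n+1)^2 = 9 copies of each anchor
print(dense.shape)  # (18, 4)
```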
And S18, screening the initial anchor boxes of each feature map for importance with the anchor_attention layer to obtain the target anchor boxes of each feature map.
Specifically, screening the initial anchor boxes of each feature map for importance with the anchor_attention layer to obtain the target anchor boxes of each feature map comprises:
calculating a score for each initial anchor box of each feature map based on an attention mechanism;
sorting the scores of each feature map's initial anchor boxes from high to low;
and taking the initial anchor boxes whose scores rank within a preset top number as the target anchor boxes of each feature map.
The preset top number is related to the specified number: it is the quotient of the specified number divided by 3.
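A sketch of this top-k screening, with random values standing in for the learned anchor_attention scores:

```python
import numpy as np

# Attention scores for the initial anchor boxes of one feature map;
# random values stand in for the learned anchor_attention outputs.
rng = np.random.default_rng(2)
initial_anchors = rng.random((18, 4))   # 18 densified anchors (x1, y1, x2, y2)
scores = rng.random(18)                 # one attention score per anchor

# Keep the top-k anchors, where k = specified_number // 3 (here 6 // 3 = 2).
specified_number = 6
k = specified_number // 3
top = np.argsort(scores)[::-1][:k]      # indices of the highest scores
target_anchors = initial_anchors[top]
print(target_anchors.shape)  # (2, 4)
```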
Densifying and extending the anchor boxes and introducing anchor_attention (an anchor attention mechanism) effectively resolve the problem of insufficient anchor coverage.
Moreover, once anchor_attention is introduced, the coverage of the anchors increases markedly, effectively improving the recall of in-vehicle face recognition.
And S19, performing face recognition on each corresponding feature map with its target anchor boxes, outputting the anchor box coordinates corresponding to each feature map and the score of each set of anchor box coordinates, and taking the highest-scoring anchor box coordinates as the predicted face coordinates.
Through this embodiment, the highest-scoring anchor box coordinates can be selected from the anchor box coordinates output for each feature map as the predicted face coordinates; this further screening by score improves the accuracy of face recognition once more.
And S20, mapping the predicted face coordinates onto the image to be recognized to obtain the face recognition result of the image to be recognized, and storing the face recognition result on a blockchain.
It will be appreciated that a feature map is obtained after multiple convolutions, so its size is compressed relative to the image (for example, a feature map may be 1/32 the size of the original image to be recognized); because the predicted coordinates are expressed on the feature map, they need to be mapped back to the size of the original image.
In this embodiment, the mapping the predicted face coordinates onto the image to be recognized to obtain the face recognition result of the image to be recognized includes:
determining an offset;
converting the face coordinates according to the offset to obtain conversion coordinates;
determining a first scale of the image to be recognized and determining a second scale of the feature map corresponding to the face coordinates;
calculating a quotient of the first scale and the second scale as a coefficient;
and multiplying the coefficient and the face coordinate to obtain the corresponding position of the face coordinate on the image to be recognized, and obtaining the face recognition result.
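A minimal sketch of the mapping steps above. The offset value of 0 and the 1024/32 scales are assumptions taken from the 1/32 compression example elsewhere in the text; how the offset is determined is not specified by the method, so it is left as a parameter.

```python
def map_to_original(face_coords, offset, image_scale, feature_scale):
    # Steps 1-2: shift the raw coordinates by the offset.
    shifted = [c + offset for c in face_coords]
    # Steps 3-4: the coefficient is the quotient of the two scales
    # (e.g. 1024 / 32 = 32 when the feature map is 1/32 of the image).
    coefficient = image_scale / feature_scale
    # Step 5: scale up to the original image's coordinate system.
    return [c * coefficient for c in shifted]

# (x1, y1, x2, y2) on a 32x32 feature map of a 1024x1024 input image.
print(map_to_original([3, 4, 10, 12], offset=0, image_scale=1024, feature_scale=32))
# [96.0, 128.0, 320.0, 384.0]
```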
Through the embodiment, the recognized face coordinates are mapped on the original image, so that the user can conveniently check the recognition result.
Further, the in-vehicle face recognition method further includes:
responding to a received detection instruction, and determining a terminal corresponding to the detection instruction;
and sending the identification result to the terminal.
After the recognition result is sent to the terminal, it can be used to assist in tasks related to face recognition, such as clocking in at a company through face recognition, tracking criminal suspects through face recognition, verifying identity at login through face recognition, and the like.
According to the above technical scheme, the method responds to a received image to be recognized and performs resize processing on it to obtain a target image. The target image is input into a convolution pooling layer for feature extraction, outputting shallow features of the target image; the shallow features are input into a residual connecting layer for feature extraction, outputting deep features of the target image; and the deep features are input into a channel attention layer for importance screening, outputting a first feature map. The channel attention mechanism further screens the learned features, which alleviates the problem of feature redundancy, focuses attention on the designated area, and effectively improves the accuracy of in-vehicle face recognition. The first feature map is input into a first convolution layer for convolution operation, outputting a second feature map, and the second feature map is input into a second convolution layer for convolution operation, outputting a third feature map. The anchor boxes obtained by pre-clustering are acquired and matched with the first, second and third feature maps according to scale to obtain the anchor box of each feature map; the anchor box of each feature map is densified and expanded to obtain the initial anchor box of each feature map; and the initial anchor box of each feature map is screened for importance by the anchor_attention layer to obtain the target anchor box of each feature map. Densifying and expanding the anchor boxes and introducing anchor_attention effectively solves the problem of insufficient anchor coverage, and the markedly increased anchor coverage effectively improves the in-vehicle face recognition rate. Face recognition is then performed on the corresponding feature maps using the target anchor box of each feature map, outputting the anchor box coordinates corresponding to each feature map and the score of each anchor box coordinate, and the anchor box coordinate with the highest score is taken as the predicted face coordinate. Finally, the predicted face coordinate is mapped onto the image to be recognized to obtain the face recognition result of the image to be recognized, which is stored in a block chain, thereby realizing automatic in-vehicle face recognition based on artificial intelligence means with higher accuracy.
Fig. 2 is a functional block diagram of a preferred embodiment of the in-vehicle human face recognition device based on artificial intelligence according to the present invention. The artificial intelligence based in-vehicle human face recognition device 11 comprises a processing unit 110, an extraction unit 111, a screening unit 112, an operation unit 113, a matching unit 114, an expansion unit 115, a recognition unit 116, a mapping unit 117, an acquisition unit 118, a clustering unit 119, a calculation unit 120, a determination unit 121 and a sending unit 122. The module/unit referred to in the present invention refers to a series of computer program segments that can be executed by the processor 13 and that can perform a fixed function, and that are stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
In response to the received image to be recognized, the processing unit 110 performs resize processing on the image to be recognized to obtain a target image.
Through resize processing, the image to be recognized meets the model's requirement on image size, which facilitates subsequent automatic recognition by the model.
For example: after resize processing, the size of the target image may be 1024 × 1024 or 512 × 512, and the like, and may be configured according to actual requirements.
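As a rough illustration of the resize step, a nearest-neighbour resize to a square target size can be sketched as below. The 1024 × 1024 size follows the example above; the function name and the choice of nearest-neighbour interpolation are assumptions, since the patent does not fix the resize algorithm.

```python
import numpy as np

def resize(image, size):
    # Map each target pixel back to its nearest source pixel.
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows][:, cols]

target = resize(np.zeros((600, 800, 3)), 1024)
print(target.shape)  # (1024, 1024, 3)
```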
The extraction unit 111 inputs the target image to a convolution pooling layer to perform feature extraction, and outputs shallow features of the target image.
The shallow feature refers to simple features such as edges and colors.
In this embodiment, two convolution pooling layers may be included.
For example: two different convolution pooling layers conv_pool1 and conv_pool2 may be included, where conv_pool1 may be a structure of a 7 × 7 convolution followed by 3 × 3 pooling, and conv_pool2 a 5 × 5 convolution followed by 3 × 3 pooling.
The structure of the convolution pooling layer may be configured according to actual requirements, and the invention is not limited herein.
The extraction unit 111 inputs the shallow features to a residual connecting layer for feature extraction, and outputs deep features of the target image.
Wherein, the deep features refer to complex features such as textures and the like.
In this embodiment, the extracting unit 111 inputs the shallow feature to a residual connecting layer for feature extraction, and outputting the deep feature of the target image includes:
performing convolution operation and pooling operation on the shallow feature, and outputting at least one operation result;
and splicing the at least one operation result, and outputting the deep features of the target image.
For example: when the shallow feature matrix has a size of 100 × 100 × 3, it may be convolved by Conv(1 × 1 × 32) to obtain a 100 × 100 × 32 matrix; after Pool(3 × 3) and Conv(1 × 1 × 32) operations, another 100 × 100 × 32 matrix is obtained; after Conv(1 × 1 × 24) and Conv(3 × 3 × 32) operations, a 100 × 100 × 32 matrix is also obtained; and finally, after Conv(1 × 1 × 24), Conv(3 × 3 × 32) and Conv(3 × 3 × 32) operations, a further 100 × 100 × 32 matrix is obtained. The four 100 × 100 × 32 matrices are spliced to obtain the deep features of the target image with 128 (4 × 32) channels.
By continuously performing convolution and pooling operation on the shallow features, deeper feature extraction can be further realized, further deep features of the target image can be obtained, and the accuracy of face recognition is improved.
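The splice of the four branch outputs can be illustrated with numpy. The random arrays stand in for the real convolution and pooling results, and the 100 × 100 × 32 shapes follow the example above.

```python
import numpy as np

# Four branch outputs with identical spatial size, e.g. each 100 x 100 x 32.
branches = [np.random.rand(100, 100, 32) for _ in range(4)]
deep_features = np.concatenate(branches, axis=-1)  # splice along the channel axis
print(deep_features.shape)  # (100, 100, 128)
```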
The screening unit 112 inputs the deep features into the channel attention layer for importance screening, and outputs a first feature map.
In this embodiment, the screening unit 112 inputs the deep features into the channel attention layer for importance screening, and outputting the first feature map includes:
inputting the deep features into a global maximum pooling layer for compression, and outputting first compression features of the deep features;
inputting the first compression characteristic into a low-dimensional full-connection layer for compression, and outputting a second compression characteristic;
inputting the second compression characteristic into an original high-dimensional full-connection layer for reduction, and outputting an attention vector;
and performing point multiplication on the attention vector and the deep features, and outputting the first feature map.
Through the embodiment, the channel attention mechanism is introduced to further screen the learned first feature map, so that the feature redundancy problem is relieved.
Meanwhile, after the channel attention mechanism is added, feature redundancy can be effectively reduced and attention is focused on the designated area, such as the face area in the vehicle rather than other invalid areas, which effectively improves the accuracy of in-vehicle face recognition.
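A hedged numpy sketch of the channel attention layer described above (global max pooling, a low-dimensional fully connected layer, restoration to the original dimension, and point multiplication). The weights are random placeholders rather than trained values, and the reduction factor and sigmoid activation are assumptions.

```python
import numpy as np

def channel_attention(deep, reduction=4):
    h, w, c = deep.shape
    squeezed = deep.max(axis=(0, 1))                     # global max pooling -> (c,)
    w1 = np.random.rand(c, c // reduction)               # low-dimensional FC (compression)
    w2 = np.random.rand(c // reduction, c)               # original-dimensional FC (restoration)
    attention = 1 / (1 + np.exp(-(squeezed @ w1 @ w2)))  # sigmoid attention vector
    return deep * attention                              # point multiplication per channel

first_feature_map = channel_attention(np.random.rand(8, 8, 16))
print(first_feature_map.shape)  # (8, 8, 16)
```

The design is essentially a squeeze-and-excitation-style re-weighting: channels that the attention vector scores low are suppressed, which is how the redundancy reduction described above takes effect.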
The operation unit 113 inputs the first feature map to the first convolution layer to perform convolution operation, and outputs a second feature map.
For example: the first convolution layer may be a 1 x 1 convolution.
After passing through the first convolution layer, the scale of the output second feature map is smaller than that of the first feature map.
The arithmetic unit 113 inputs the second feature map to the second convolution layer to perform convolution operation, and outputs a third feature map.
For example: the second convolution layer may be a 3 x 3 convolution.
After passing through the second convolutional layer, the scale of the output third feature map is smaller than that of the second feature map.
Namely: the scale is reduced each time a convolution operation is performed.
The matching unit 114 obtains an anchor box obtained by pre-clustering, and matches the obtained anchor box with the first feature map, the second feature map and the third feature map according to the scale to obtain the anchor box of each feature map.
In at least one embodiment of the present invention, the obtaining unit 118 obtains sample data before obtaining an anchor box obtained by clustering in advance;
the acquiring unit 118 randomly acquires a specified number of points from the sample data as initial sample points;
the clustering unit 119 clusters the sample data by using a K-means algorithm to obtain the clusters with the specified number;
the calculation unit 120 calculates the coordinates of the center point of each cluster as the anchor box obtained by the pre-clustering.
The specified number can be 6, 9, 12, etc., and is a multiple of 3, so as to ensure that the feature maps of each scale have the same number of anchor boxes.
Specifically, in the clustering process, the distance function that can be used is: distance = 1 - IoU, where IoU = (intersection area of the two boxes) / (union area of the two boxes).
Further, after clustering yields the anchor boxes, the scale of each anchor box is determined, the anchor boxes are sorted by scale and divided into 3 groups in descending order of scale; the group of anchor boxes with the largest scale is matched with the first feature map, the group with the middle scale with the second feature map, and the group with the smallest scale with the third feature map.
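The clustering and scale matching described above can be sketched as a small K-means using the distance = 1 - IoU function. The (width, height) samples, K = 9, the iteration count, and the helper names are illustrative assumptions.

```python
import random

def iou_distance(box, center):
    # Boxes are (width, height) pairs; IoU is computed as if both share a corner.
    inter = min(box[0], center[0]) * min(box[1], center[1])
    union = box[0] * box[1] + center[0] * center[1] - inter
    return 1 - inter / union

def cluster_anchors(samples, k, iterations=20):
    centers = random.sample(samples, k)  # randomly acquired initial sample points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in samples:  # assign each sample to the nearest center
            nearest = min(range(k), key=lambda i: iou_distance(box, centers[i]))
            clusters[nearest].append(box)
        centers = [  # recompute each center as the mean of its cluster
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers

random.seed(0)
samples = [(random.randint(8, 256), random.randint(8, 256)) for _ in range(200)]
# Sort the 9 clustered anchors by scale and split into 3 groups of 3,
# matching the largest group to the first feature map, and so on.
anchors = sorted(cluster_anchors(samples, k=9), key=lambda a: a[0] * a[1], reverse=True)
groups = [anchors[0:3], anchors[3:6], anchors[6:9]]
print([len(g) for g in groups])  # [3, 3, 3]
```

Using 1 - IoU rather than Euclidean distance groups boxes by shape overlap instead of absolute size difference, the same design choice popularized by YOLO-style anchor clustering.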
The expanding unit 115 performs densification and expansion on the anchor box of each feature map to obtain an initial anchor box of each feature map.
Specifically, the anchor box of each feature map may be extended to the right and downward on the feature map, so as to obtain the initial anchor box of each feature map.
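A hypothetical sketch of this densification: each clustered anchor is copied rightward and downward at a fixed stride, turning sparse anchors into a dense grid of initial anchors. The stride and grid size are assumptions for illustration.

```python
def densify(anchor, grid_w, grid_h, stride):
    x, y, w, h = anchor
    return [
        (x + i * stride, y + j * stride, w, h)
        for j in range(grid_h)  # extend downward
        for i in range(grid_w)  # extend rightward
    ]

initial_anchors = densify((0, 0, 32, 32), grid_w=4, grid_h=4, stride=16)
print(len(initial_anchors))  # 16 initial anchors from one clustered anchor
```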
The screening unit 112 performs importance screening on the initial anchor box of each feature map by using the anchor_attention layer to obtain a target anchor box of each feature map.
Specifically, the screening unit 112 performing importance screening on the initial anchor box of each feature map by using the anchor_attention layer to obtain the target anchor box of each feature map includes:
calculating the score of an initial anchor box of each feature map based on an attention mechanism;
sorting the scores of the initial anchor box of each feature map from high to low;
and acquiring an initial anchor box with the score value ranked at the front preset position as a target anchor box of each feature map.
The preset number of top-ranked anchors is related to the specified number: it is the quotient of the specified number divided by 3.
By densifying and expanding the anchor boxes and introducing anchor_attention (an anchor attention mechanism), the problem of insufficient anchor coverage is effectively solved.
Moreover, after anchor_attention is introduced, the coverage rate of the anchors increases significantly, which effectively improves the recall rate of in-vehicle face recognition.
The recognition unit 116 performs face recognition on the corresponding feature map using the target anchor box of each feature map, outputs the anchor box coordinates corresponding to each feature map and the score of each anchor box coordinate, and takes the anchor box coordinate with the highest score as the predicted face coordinate.
Through this embodiment, the anchor box coordinate with the highest score is selected from the anchor box coordinates output for each feature map and serves as the predicted face coordinate; this further screening by score improves the accuracy of face recognition once again.
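The final screening by score can be sketched as a simple arg-max over the output anchor box coordinates; all values below are illustrative stand-ins for the model's outputs.

```python
def predict_face(anchor_coords, scores):
    """Return the anchor box coordinate with the highest score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return anchor_coords[best]

coords = [(10, 10, 50, 50), (12, 8, 52, 48), (100, 40, 140, 90)]
print(predict_face(coords, [0.61, 0.93, 0.35]))  # (12, 8, 52, 48)
```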
The mapping unit 117 maps the predicted face coordinates to the image to be recognized, obtains a face recognition result of the image to be recognized, and stores the face recognition result in a block chain.
It will be appreciated that the feature map is obtained after a number of convolutions, so its size is compressed (for example, the feature map may be compressed to 1/32 of the original image to be recognized); since the predicted coordinates are based on the feature map, they need to be mapped back to the size of the original image.
In this embodiment, the mapping unit 117 maps the predicted face coordinates onto the image to be recognized, and obtaining the face recognition result of the image to be recognized includes:
determining an offset;
converting the face coordinates according to the offset to obtain conversion coordinates;
determining a first scale of the image to be recognized and determining a second scale of the feature map corresponding to the face coordinates;
calculating a quotient of the first scale and the second scale as a coefficient;
and multiplying the coefficient and the face coordinate to obtain the corresponding position of the face coordinate on the image to be recognized, and obtaining the face recognition result.
Through the embodiment, the recognized face coordinates are mapped on the original image, so that the user can conveniently check the recognition result.
Further, the determination unit 121 determines, in response to the received detection instruction, a terminal corresponding to the detection instruction;
the transmission unit 122 transmits the recognition result to the terminal.
After the recognition result is sent to the terminal, it can be used to assist in tasks related to face recognition, such as clocking in at a company through face recognition, tracking criminal suspects through face recognition, verifying identity at login through face recognition, and the like.
According to the above technical scheme, the device responds to a received image to be recognized and performs resize processing on it to obtain a target image. The target image is input into a convolution pooling layer for feature extraction, outputting shallow features of the target image; the shallow features are input into a residual connecting layer for feature extraction, outputting deep features of the target image; and the deep features are input into a channel attention layer for importance screening, outputting a first feature map. The channel attention mechanism further screens the learned features, which alleviates the problem of feature redundancy, focuses attention on the designated area, and effectively improves the accuracy of in-vehicle face recognition. The first feature map is input into a first convolution layer for convolution operation, outputting a second feature map, and the second feature map is input into a second convolution layer for convolution operation, outputting a third feature map. The anchor boxes obtained by pre-clustering are acquired and matched with the first, second and third feature maps according to scale to obtain the anchor box of each feature map; the anchor box of each feature map is densified and expanded to obtain the initial anchor box of each feature map; and the initial anchor box of each feature map is screened for importance by the anchor_attention layer to obtain the target anchor box of each feature map. Densifying and expanding the anchor boxes and introducing anchor_attention effectively solves the problem of insufficient anchor coverage, and the markedly increased anchor coverage effectively improves the in-vehicle face recognition rate. Face recognition is then performed on the corresponding feature maps using the target anchor box of each feature map, outputting the anchor box coordinates corresponding to each feature map and the score of each anchor box coordinate, and the anchor box coordinate with the highest score is taken as the predicted face coordinate. Finally, the predicted face coordinate is mapped onto the image to be recognized to obtain the face recognition result of the image to be recognized, which is stored in a block chain, thereby realizing automatic in-vehicle face recognition based on artificial intelligence means with higher accuracy.
Fig. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the present invention for implementing an artificial intelligence-based in-vehicle human face recognition method.
The electronic device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program, such as an artificial intelligence based in-vehicle facial recognition program, stored in the memory 12 and executable on the processor 13.
It will be understood by those skilled in the art that the schematic diagram is merely an example of the electronic device 1 and does not constitute a limitation on it; the electronic device 1 may have a bus-type or star-type structure, may include more or fewer hardware or software components than shown, or a different arrangement of components, and may further include, for example, an input/output device, a network access device, and the like.
It should be noted that the electronic device 1 is only an example; other existing or future electronic products that can be adapted to the present invention should also be included in the scope of protection of the present invention and are incorporated herein by reference.
The memory 12 includes at least one type of readable storage medium, which includes flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, etc. The memory 12 may in some embodiments be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. The memory 12 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the electronic device 1. Further, the memory 12 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 12 may be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of an artificial intelligence-based in-vehicle face recognition program, etc., but also to temporarily store data that has been output or is to be output.
The processor 13 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 13 is a Control Unit (Control Unit) of the electronic device 1, connects various components of the electronic device 1 by various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules stored in the memory 12 (for example, executing an artificial intelligence-based in-vehicle face recognition program and the like) and calling data stored in the memory 12.
The processor 13 executes an operating system of the electronic device 1 and various installed application programs. The processor 13 executes the application program to implement the steps of the various artificial intelligence based in-vehicle face recognition method embodiments described above, such as the steps shown in fig. 1.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the electronic device 1. For example, the computer program may be divided into a processing unit 110, an extraction unit 111, a filtering unit 112, an arithmetic unit 113, a matching unit 114, an extension unit 115, a recognition unit 116, a mapping unit 117, an acquisition unit 118, a clustering unit 119, a calculation unit 120, a determination unit 121, and a transmission unit 122.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a computer device, or a network device) or a processor (processor) to execute parts of the artificial intelligence based in-vehicle face recognition method according to the embodiments of the present invention.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented.
Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying said computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM).
Further, the computer-usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one arrow is shown in FIG. 3, but this does not indicate only one bus or one type of bus. The bus is arranged to enable connection communication between the memory 12 and at least one processor 13 or the like.
Although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 13 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display) or an input unit (such as a Keyboard (Keyboard)), and optionally a standard wired interface and/or a wireless interface. Optionally, in some embodiments, the Display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
Fig. 3 only shows the electronic device 1 with components 12-13, and it will be understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
Referring to fig. 1, the memory 12 of the electronic device 1 stores a plurality of instructions to implement an artificial intelligence based in-vehicle face recognition method, and the processor 13 can execute the plurality of instructions to implement:
responding to the received image to be recognized, and performing resize processing on the image to be recognized to obtain a target image;
inputting the target image into a convolution pooling layer for feature extraction, and outputting shallow features of the target image;
inputting the shallow features into a residual error connecting layer for feature extraction, and outputting deep features of the target image;
inputting the deep features into a channel attention layer for importance screening, and outputting a first feature map;
inputting the first feature map into a first convolution layer for convolution operation, and outputting a second feature map;
inputting the second feature map into a second convolution layer for convolution operation, and outputting a third feature map;
obtaining an anchor box obtained by pre-clustering, and matching the obtained anchor box with the first feature map, the second feature map and the third feature map according to the scale to obtain the anchor box of each feature map;
carrying out densification and expansion on the anchor box of each feature map to obtain an initial anchor box of each feature map;
utilizing an anchor_attention layer to carry out importance screening on the initial anchor box of each feature map to obtain a target anchor box of each feature map;
respectively carrying out face recognition on the corresponding feature maps by using the target anchor box of each feature map, outputting the anchor box coordinate corresponding to each feature map and the score of each anchor box coordinate, and taking the anchor box coordinate with the highest score as the predicted face coordinate;
and mapping the predicted face coordinates to the image to be recognized to obtain a face recognition result of the image to be recognized and storing the face recognition result to a block chain.
Specifically, the processor 13 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. An in-vehicle face recognition method based on artificial intelligence is characterized by comprising the following steps:
in response to receiving an image to be recognized, resizing the image to be recognized to obtain a target image;
inputting the target image into a convolution pooling layer for feature extraction, and outputting shallow features of the target image;
inputting the shallow features into a residual connection layer for feature extraction, and outputting deep features of the target image;
inputting the deep features into a channel attention layer for importance screening, and outputting a first feature map;
inputting the first feature map into a first convolution layer for a convolution operation, and outputting a second feature map;
inputting the second feature map into a second convolution layer for a convolution operation, and outputting a third feature map;
obtaining anchor boxes obtained by pre-clustering, and matching the obtained anchor boxes with the first feature map, the second feature map and the third feature map by scale to obtain the anchor box of each feature map;
densifying and expanding the anchor box of each feature map to obtain an initial anchor box of each feature map;
screening the initial anchor box of each feature map by importance using an anchor_attention layer to obtain a target anchor box of each feature map;
performing face recognition on the corresponding feature map using the target anchor box of each feature map, outputting the anchor box coordinates corresponding to each feature map and the score of each anchor box coordinate, and taking the anchor box coordinates with the highest score as the predicted face coordinates;
and mapping the predicted face coordinates onto the image to be recognized to obtain a face recognition result of the image to be recognized, and storing the face recognition result to a blockchain.
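The densification and expansion step in claim 1 is not spelled out further in this section; a minimal NumPy sketch of one common scheme (FaceBoxes-style center shifting, an assumption on our part, including the `factor` and `stride` parameters) looks like this:

```python
import numpy as np

def densify_anchor(cx, cy, w, h, stride, factor):
    """Replicate one anchor into factor*factor anchors whose centers are
    evenly shifted within the stride cell. This is one common densification
    scheme; the patent does not fix the exact recipe."""
    # Fractional offsets of the replicated centers within one stride cell.
    offsets = (np.arange(factor) + 0.5) / factor - 0.5
    anchors = []
    for dy in offsets:
        for dx in offsets:
            anchors.append((cx + dx * stride, cy + dy * stride, w, h))
    return anchors

# Densify a single 32x32 anchor centered at (16, 16) into a 2x2 grid.
dense = densify_anchor(cx=16.0, cy=16.0, w=32.0, h=32.0, stride=32, factor=2)
```

Each anchor is replicated `factor`² times with the copies spread evenly inside its stride cell, so small faces are covered by more candidate boxes.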
2. The artificial intelligence based in-vehicle face recognition method of claim 1, wherein the inputting the shallow features into a residual connection layer for feature extraction and outputting the deep features of the target image comprises:
performing a convolution operation and a pooling operation on the shallow features, and outputting at least one operation result;
and splicing the at least one operation result, and outputting the deep features of the target image.
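Claim 2's branch-and-splice structure can be sketched with plain NumPy; the 3x3 smoothing kernel, 2x2 average pooling, and nearest-neighbour upsampling below are illustrative stand-ins, not the patent's actual parameters:

```python
import numpy as np

def conv3x3_same(x, k):
    # Naive 3x3 "same" convolution on a single-channel map (illustration only).
    h, w = x.shape
    p = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(p[i:i + 3, j:j + 3] * k)
    return out

def residual_block(x):
    """Run a convolution branch and a pooling branch on the shallow
    feature, then splice (concatenate) the two results along a new
    channel axis, as claim 2 describes."""
    conv = conv3x3_same(x, np.full((3, 3), 1 / 9))  # smoothing kernel as a stand-in
    # 2x2 average pooling, then nearest-neighbour upsampling back to x's size
    pooled = x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))
    pooled_up = np.kron(pooled, np.ones((2, 2)))
    return np.stack([conv, pooled_up])  # (2, H, W): the spliced deep feature

deep = residual_block(np.ones((4, 4)))
```

In a trained network the kernels are learned and the splice is a channel-wise concatenation of several branches; the shape bookkeeping is the point here.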
3. The artificial intelligence based in-vehicle face recognition method of claim 1, wherein the inputting the deep features into a channel attention layer for importance screening and outputting the first feature map comprises:
inputting the deep features into a global max pooling layer for compression, and outputting a first compressed feature of the deep features;
inputting the first compressed feature into a low-dimensional fully connected layer for compression, and outputting a second compressed feature;
inputting the second compressed feature into a fully connected layer of the original high dimension for restoration, and outputting an attention vector;
and performing a dot multiplication of the attention vector and the deep features, and outputting the first feature map.
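Claim 3 reads as a squeeze-and-excitation style block with global max pooling; a sketch under that assumption, with random weights standing in for the trained fully connected layers and an assumed ReLU/sigmoid pairing:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(feat, w_down, w_up):
    """Claim 3's flow: global max pooling -> low-dimensional FC ->
    FC back to the original channel dimension -> dot-multiply with
    the features. Activation choices are assumptions."""
    c = feat.shape[0]
    squeezed = feat.reshape(c, -1).max(axis=1)   # global max pool: (C,)
    low = np.maximum(w_down @ squeezed, 0.0)     # compress (ReLU assumed)
    attn = sigmoid(w_up @ low)                   # restore to (C,), values in (0, 1)
    return feat * attn[:, None, None]            # channel-wise re-weighting

rng = np.random.default_rng(0)
feat = rng.random((4, 8, 8))                     # (C, H, W) deep features
out = channel_attention(feat, rng.random((2, 4)), rng.random((4, 2)))
```

Because the attention vector lies in (0, 1), the block can only down-weight channels, which is how the "importance screening" manifests.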
4. The artificial intelligence based in-vehicle face recognition method according to claim 1, wherein before obtaining the anchor box obtained by pre-clustering, the artificial intelligence based in-vehicle face recognition method further comprises:
acquiring sample data;
randomly acquiring a specified number of points from the sample data as initial sample points;
clustering the sample data with the K-means algorithm to obtain the specified number of clusters;
and calculating the center-point coordinates of each cluster as the anchor boxes obtained by pre-clustering.
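Claim 4's pre-clustering can be sketched as plain K-means over (width, height) pairs; Euclidean distance is assumed here, though IoU-based distances are also common for anchor clustering:

```python
import numpy as np

def kmeans_anchors(boxes, k, iters=20, seed=0):
    """Cluster (width, height) samples with K-means and return the
    cluster centres as anchor sizes, per claim 4. The distance metric
    and iteration count are assumptions."""
    rng = np.random.default_rng(seed)
    # Random sample points as the initial centres (claim 4's first step).
    centres = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Assign every box to its nearest centre.
        d = np.linalg.norm(boxes[:, None, :] - centres[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # Move each centre to the mean of its assigned boxes.
        for j in range(k):
            if np.any(assign == j):
                centres[j] = boxes[assign == j].mean(axis=0)
    return centres

# Three well-separated size groups -> three anchor scales.
boxes = np.array([[10.0, 12], [11, 11], [50, 52], [49, 48], [100, 96], [98, 101]])
anchors = kmeans_anchors(boxes, k=3)
```

The resulting centres then get matched to the three feature maps by scale, small anchors to the high-resolution map and large anchors to the coarse one.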
5. The artificial intelligence based in-vehicle face recognition method according to claim 3, wherein the screening the initial anchor box of each feature map by importance using an anchor_attention layer to obtain the target anchor box of each feature map comprises:
calculating a score for the initial anchor box of each feature map based on an attention mechanism;
sorting the scores of the initial anchor boxes of each feature map from high to low;
and taking the initial anchor boxes whose scores rank within a preset number of top positions as the target anchor boxes of each feature map.
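The sort-and-keep step of claim 5 is straightforward; a sketch in which the attention scores themselves are taken as given (the scoring network is out of scope here):

```python
def select_top_anchors(anchors, scores, keep):
    """Sort initial anchors by attention score from high to low and
    keep the first `keep` of them, as claim 5 describes."""
    ranked = sorted(zip(scores, anchors), key=lambda p: p[0], reverse=True)
    return [anchor for _, anchor in ranked[:keep]]

# Four candidate anchors with their attention scores; keep the best two.
top = select_top_anchors(
    anchors=["a", "b", "c", "d"],
    scores=[0.2, 0.9, 0.5, 0.7],
    keep=2,
)
```

Only the surviving anchors go on to the per-feature-map recognition step, which is what keeps the detection head cheap.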
6. The artificial intelligence based in-vehicle face recognition method according to claim 1, wherein the mapping the predicted face coordinates onto the image to be recognized to obtain the face recognition result of the image to be recognized comprises:
determining an offset;
converting the face coordinates according to the offset to obtain converted coordinates;
determining a first scale of the image to be recognized and determining a second scale of the feature map corresponding to the face coordinates;
calculating a quotient of the first scale and the second scale as a coefficient;
and multiplying the coefficient by the face coordinates to obtain the corresponding position of the face coordinates on the image to be recognized as the face recognition result.
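Claim 6's mapping back to the original image can be sketched as follows; the offset semantics, and the assumption that the scale coefficient is applied to the offset-converted coordinates, are ours rather than the patent's:

```python
def map_to_image(face_xy, offset, image_scale, feature_scale):
    """Shift the predicted coordinate by an offset, then multiply by the
    quotient of the image scale and the feature-map scale (claim 6)."""
    x, y = face_xy
    ox, oy = offset
    coeff = image_scale / feature_scale      # quotient of the two scales
    return ((x + ox) * coeff, (y + oy) * coeff)

# A prediction on a 52x52 feature map mapped back onto a 416x416 image.
pos = map_to_image(face_xy=(10.0, 6.0), offset=(0.5, 0.5),
                   image_scale=416, feature_scale=52)
```

The 0.5 offset models the usual convention that a cell's prediction refers to the cell centre; other conventions would change only the `offset` argument.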
7. The artificial intelligence based in-vehicle face recognition method of claim 1, wherein the artificial intelligence based in-vehicle face recognition method further comprises:
in response to a received detection instruction, determining a terminal corresponding to the detection instruction;
and sending the face recognition result to the terminal.
8. An artificial intelligence based in-vehicle face recognition device, characterized in that the artificial intelligence based in-vehicle face recognition device comprises:
the processing unit is used for responding to the received image to be recognized and carrying out resize processing on the image to be recognized to obtain a target image;
the extraction unit is used for inputting the target image into a convolution pooling layer for feature extraction and outputting shallow features of the target image;
the extraction unit is further configured to input the shallow features to a residual connecting layer for feature extraction, and output deep features of the target image;
the screening unit is used for inputting the deep features into a channel attention layer for importance screening and outputting a first feature map;
the operation unit is used for inputting the first feature map into a first convolution layer for a convolution operation and outputting a second feature map;
the operation unit is further used for inputting the second feature map into a second convolution layer for a convolution operation and outputting a third feature map;
the matching unit is used for obtaining anchor boxes obtained by pre-clustering, and matching the obtained anchor boxes with the first feature map, the second feature map and the third feature map by scale to obtain the anchor box of each feature map;
the expansion unit is used for densifying and expanding the anchor box of each feature map to obtain an initial anchor box of each feature map;
the screening unit is further used for screening the initial anchor box of each feature map by importance using an anchor_attention layer to obtain a target anchor box of each feature map;
the recognition unit is used for performing face recognition on the corresponding feature map using the target anchor box of each feature map, outputting the anchor box coordinates corresponding to each feature map and the score of each anchor box coordinate, and taking the anchor box coordinates with the highest score as the predicted face coordinates;
and the mapping unit is used for mapping the predicted face coordinates onto the image to be recognized to obtain a face recognition result of the image to be recognized and storing the face recognition result to a blockchain.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing at least one instruction; and
a processor executing instructions stored in the memory to implement the artificial intelligence based in-vehicle face recognition method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction, the at least one instruction being executable by a processor in an electronic device to implement the artificial intelligence based in-vehicle face recognition method according to any one of claims 1 to 7.
CN202010593969.XA 2020-06-28 2020-06-28 In-vehicle human face identification method, device, equipment and medium based on artificial intelligence Active CN111476225B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010593969.XA CN111476225B (en) 2020-06-28 2020-06-28 In-vehicle human face identification method, device, equipment and medium based on artificial intelligence


Publications (2)

Publication Number Publication Date
CN111476225A true CN111476225A (en) 2020-07-31
CN111476225B CN111476225B (en) 2020-10-02

Family

ID=71765386

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010593969.XA Active CN111476225B (en) 2020-06-28 2020-06-28 In-vehicle human face identification method, device, equipment and medium based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN111476225B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109457A (en) * 2019-04-29 2019-08-09 北方民族大学 A kind of intelligent sound blind-guidance robot control method and control system
US20190258925A1 (en) * 2018-02-20 2019-08-22 Adobe Inc. Performing attribute-aware based tasks via an attention-controlled neural network
CN110751185A (en) * 2019-09-26 2020-02-04 高新兴科技集团股份有限公司 Training method and device of target detection model
CN110889332A (en) * 2019-10-30 2020-03-17 中国科学院自动化研究所南京人工智能芯片创新研究院 Lie detection method based on micro expression in interview

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHENG CHI et al.: "Selective Refinement Network for High Performance Face Detection", The Thirty-Third AAAI Conference on Artificial Intelligence *
JIANFENG WANG et al.: "Face Attention Network: An Effective Face Detector for the Occluded Faces", arXiv.org *
YUANYUAN XU et al.: "CenterFace: Joint Face Detection and Alignment Using Face as Point", arXiv.org *
LI Mukai: "Research on Small-Scale Infrared Pedestrian Detection Technology Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132033A (en) * 2020-09-23 2020-12-25 平安国际智慧城市科技股份有限公司 Vehicle type recognition method and device, electronic equipment and storage medium
CN112132140A (en) * 2020-09-23 2020-12-25 平安国际智慧城市科技股份有限公司 Vehicle brand identification method, device, equipment and medium based on artificial intelligence
CN112132140B (en) * 2020-09-23 2022-08-12 平安国际智慧城市科技股份有限公司 Vehicle brand identification method, device, equipment and medium based on artificial intelligence
CN112132033B (en) * 2020-09-23 2023-10-10 平安国际智慧城市科技股份有限公司 Vehicle type recognition method and device, electronic equipment and storage medium
CN112215179A (en) * 2020-10-19 2021-01-12 平安国际智慧城市科技股份有限公司 In-vehicle face recognition method, device, apparatus and storage medium
CN112215179B (en) * 2020-10-19 2024-04-19 平安国际智慧城市科技股份有限公司 In-vehicle face recognition method, device, apparatus and storage medium
CN114359645A (en) * 2022-01-12 2022-04-15 中国平安人寿保险股份有限公司 Image expansion method, device, equipment and storage medium based on characteristic region
CN114359645B (en) * 2022-01-12 2024-05-21 中国平安人寿保险股份有限公司 Image expansion method, device, equipment and storage medium based on characteristic area

Also Published As

Publication number Publication date
CN111476225B (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN111476225B (en) In-vehicle human face identification method, device, equipment and medium based on artificial intelligence
CN111738212B (en) Traffic signal lamp identification method, device, equipment and medium based on artificial intelligence
CN111652845A (en) Abnormal cell automatic labeling method and device, electronic equipment and storage medium
CN112699775A (en) Certificate identification method, device and equipment based on deep learning and storage medium
CN111931729B (en) Pedestrian detection method, device, equipment and medium based on artificial intelligence
CN112396005A (en) Biological characteristic image recognition method and device, electronic equipment and readable storage medium
CN112052850A (en) License plate recognition method and device, electronic equipment and storage medium
CN113705462B (en) Face recognition method, device, electronic equipment and computer readable storage medium
CN111309825A (en) Data clustering storage method and device and computer readable storage medium
CN112668575A (en) Key information extraction method and device, electronic equipment and storage medium
CN114708461A (en) Multi-modal learning model-based classification method, device, equipment and storage medium
CN112132033B (en) Vehicle type recognition method and device, electronic equipment and storage medium
CN112017763B (en) Medical image data transmission method, device, equipment and medium
CN112329666A (en) Face recognition method and device, electronic equipment and storage medium
CN113157739A (en) Cross-modal retrieval method and device, electronic equipment and storage medium
CN111651625A (en) Image retrieval method, image retrieval device, electronic equipment and storage medium
CN112541902A (en) Similar area searching method, similar area searching device, electronic equipment and medium
CN115239958A (en) Wheel hub damage detection method based on artificial intelligence and related equipment
CN111488876B (en) License plate recognition method, device, equipment and medium based on artificial intelligence
CN112132037B (en) Pavement detection method, device, equipment and medium based on artificial intelligence
CN112132140B (en) Vehicle brand identification method, device, equipment and medium based on artificial intelligence
CN112529086B (en) Stop line generation method, electronic device, and storage medium
CN111652226B (en) Picture-based target identification method and device and readable storage medium
CN115983865A (en) Service navigation method, device, equipment and storage medium based on voice recognition
CN113284047A (en) Target object segmentation method, device, equipment and storage medium based on multiple features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant