CN110399795B - Face tracking method and device in video, computer equipment and storage medium - Google Patents

Face tracking method and device in video, computer equipment and storage medium

Info

Publication number
CN110399795B
CN110399795B (application CN201910537929.0A)
Authority
CN
China
Prior art keywords
face
matrix
matching
information
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910537929.0A
Other languages
Chinese (zh)
Other versions
CN110399795A (en)
Inventor
张磊
宋晨
李雪冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910537929.0A priority Critical patent/CN110399795B/en
Priority to PCT/CN2019/103541 priority patent/WO2020252928A1/en
Publication of CN110399795A publication Critical patent/CN110399795A/en
Application granted granted Critical
Publication of CN110399795B publication Critical patent/CN110399795B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/53 Querying
    • G06F 16/538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The application provides a face tracking method, device, computer equipment and storage medium for video. The front end of a system acquires a video frame to be tracked and obtains face images of all first faces in the frame; each first face image is input into a detection model preset at the front end of the system to obtain a plurality of pieces of corresponding first face feature information; all the first face feature information is sent to the back end of the system, which matches the first face feature information corresponding to each first face against faces preset in a storage library at the back end. If a second face matching the first face is found in the storage library, the back end of the system acquires the information of the second face and loads it onto a display platform in the form of an image for display. The matching speed of face tracking is improved, and the operating load at the back end of the system is reduced.

Description

Face tracking method and device in video, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and apparatus for tracking a face in a video, a computer device, and a storage medium.
Background
At present, face recognition tracking usually obtains continuous video through front-end shooting; the video is divided into individual video frames and transmitted to the background, where a background server processes the face images in each video frame and matches them against a storage library. As a result, the computation load of the background is excessive, the load is heavy, matching takes a long time, and the front end cannot obtain the corresponding face information in a timely manner.
Disclosure of Invention
The application mainly aims to provide a face tracking method, a face tracking device, computer equipment and a storage medium for video, with the aim of improving the matching speed of face tracking.
In order to achieve the above object, the present application provides a face tracking method in video, comprising the following steps:
acquiring, through the front end of a system, a video frame to be tracked, wherein the video frame comprises at least one first face, and acquiring face images of all the first faces;
inputting each first face image into a detection model preset at the front end of the system for detection to obtain a plurality of pieces of corresponding first face feature information, wherein the detection model is trained on known face images based on a convolutional neural network;
sending all the first face feature information to the back end of the system, and matching, through the back end of the system, the first face feature information corresponding to each first face against faces preset in a storage library at the back end of the system;
if a second face matching the first face is found in the storage library through the back end of the system, acquiring information of the second face;
and loading, through the back end of the system, the information of the second face onto a display platform in the form of an image for display.
Further, after the step of acquiring the information of the second face if a second face matching the first face is found in the storage library through the back end of the system, the method further includes:
storing all the matched second faces in a preset temporary library through the back end of the system, wherein the temporary library is an in-memory library from which stored face data is deleted at regular intervals;
if face feature information of a new first face is acquired from the video frame to be tracked, matching it against the faces in the temporary library;
if the matching against the faces in the temporary library fails, matching the face feature information of the new first face against the faces in the storage library;
if the matching against the faces in the storage library succeeds, storing the information of the successfully matched new first face in the temporary library; if the matching against the faces in the storage library fails, sending the face to the display platform in the form of an image for display.
Further, after the step of matching against the faces in the temporary library if the face feature information of the new first face is acquired from the video frame to be tracked, the method includes:
if the matching against a face in the temporary library succeeds, loading the information of the face matched to the new first face feature information onto the display platform in the form of an image for display, and updating the latest matching time of the matched face in the temporary library.
Further, the step of loading, through the back end of the system, the information of the second face onto a display platform in the form of an image for display includes:
judging whether the information of the second face carries a key mark, wherein the key mark includes a face marked for preferential display or a face given focused attention in advance;
and if so, loading the information of the second face carrying the key mark in the central area of the display platform in the form of an image.
Further, the step of matching, through the back end of the system, the first face feature information corresponding to each first face against faces preset in the storage library at the back end of the system includes:
sorting all the first face feature information corresponding to any one of the first faces in a specified order to generate a first matrix, wherein the first matrix is a row matrix or a column matrix;
duplicating each piece of first face feature information in the first matrix until the number of copies equals the number of faces in a preset second matrix, and forming a third matrix in the ordering of the second matrix, wherein the second matrix is generated by sorting the face feature information corresponding to each of a plurality of faces in the storage library in the same order as the first face feature information of the first matrix;
subtracting the third matrix from the second matrix to obtain a fourth matrix;
performing an absolute value operation on each value in the fourth matrix to obtain a fifth matrix;
adding the absolute values corresponding to each face in the fifth matrix to obtain a matching total value for each face;
comparing all the matching total values to obtain the minimum of the matching total values;
judging whether the minimum of the matching total values is smaller than a face threshold;
and if so, obtaining the face corresponding to the minimum of the matching total values, and retrieving that face from the storage library.
Further, the step of obtaining the face corresponding to the minimum of the matching total values and retrieving that face from the storage library includes:
judging whether the absolute value of each piece of face feature information in the fifth matrix corresponding to the minimum of the matching total values is smaller than the corresponding preset face feature threshold;
and if so, taking the face corresponding to the minimum of the matching total values as the best-matched face.
Further, after the step of sorting all the first face feature information corresponding to any one of the first faces in the specified order to generate the first matrix, the method includes:
identifying the gender of the first face, wherein the gender is male or female;
and searching the storage library for faces of the same gender as the first face according to the gender of the first face, to form the second matrix.
The application also provides a face tracking device in video, comprising:
a first acquisition module, configured to acquire a video frame to be tracked through the front end of the system, wherein the video frame comprises at least one first face, and to acquire face images of all the first faces;
a second acquisition module, configured to input each first face image into a detection model preset at the front end of the system for detection and to obtain a plurality of pieces of corresponding first face feature information, wherein the detection model is trained on known face images based on a convolutional neural network;
a first matching module, configured to send all the first face feature information to the back end of the system and to match, through the back end of the system, the first face feature information corresponding to each first face against faces preset in the storage library at the back end of the system;
a third acquisition module, configured to acquire information of a second face if a second face matching the first face is found in the storage library through the back end of the system;
and a display module, configured to load, through the back end of the system, the information of the second face onto a display platform in the form of an image for display.
The application also provides a computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of any of the methods described above when executing the computer program.
The application also provides a computer storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any of the methods described above.
The face tracking method, device, computer equipment and storage medium in video provided by the application have the following beneficial effects:
the front end of the system acquires the face feature information corresponding to the first face; the back end of the system matches that feature information to obtain the corresponding second face, whose information is displayed on the display platform in the form of an image. Because the front end and the back end of the system operate simultaneously, and the front end performs the analysis that extracts the face feature information, the matching speed of face tracking is improved, the extraction of face feature information no longer needs to be performed at the back end of the system, and the operating load at the back end of the system is reduced.
Drawings
FIG. 1 is a schematic diagram showing steps of a face tracking method in a video according to an embodiment of the present application;
FIG. 2 is a block diagram of a face tracking device in a video according to an embodiment of the present application;
fig. 3 is a block diagram schematically illustrating a structure of a computer device according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Referring to fig. 1, in one embodiment of the present application, a face tracking method in video is provided, which includes the following steps:
step S1, acquiring a tracked video frame through the front end of a system, wherein the video frame comprises at least one first face, and acquiring face images of all the first faces;
s2, inputting each first face image into a detection model preset at the front end of the system for detection, and obtaining a plurality of corresponding first face characteristic information, wherein the detection model is obtained by utilizing a known face image and based on convolutional neural network training;
Step S3, all the first face feature information is sent to the back end of the system, and the back end of the system is used for matching the first face feature information corresponding to each first face with faces preset in a storage library of the back end of the system;
step S4, if the second face matched with the first face is matched from the storage library through the rear end of the system, acquiring information of the second face;
and S5, loading the information of the second face on a display platform in the form of an image through the rear end of the system for display.
The video stream acquired in real time is divided into individual video frames in time order, and the video frames containing human faces are acquired; the picture format of the video frames may be set to JPG, PNG or BMP. The front end of the system acquires the face images of all the first faces in a video frame and inputs each first face image into the detection model for detection, wherein the detection model is trained on known face images based on a convolutional neural network; the distinguishing features of the different parts of each face image are extracted, so that the corresponding face feature information can be obtained from these distinguishing features. All the first face feature information of each first face is sent to the back end of the system, where it is matched against faces preset in a storage library at the back end; the storage library storing a plurality of faces is preset at the back end of the system. If a second face matching the first face feature information corresponding to a first face is found in the storage library, the information of the second face is acquired; in a specific embodiment, the information of the second face includes name, gender, identity card number, past records, business dealings and the like. The acquired information of the second face is then sent to the display platform and displayed in the form of an image.
The convolutional neural network is one of the core algorithms in the field of image recognition, and performs stably when learning from large amounts of data. For general large-scale image classification problems, convolutional neural networks can be used to construct hierarchical classifiers, and can also be used in fine-grained recognition to extract discriminating features of images for learning by other classifiers. In the latter case, different portions of the image can be fed into the convolutional neural network manually for feature extraction, or the features can be extracted by the convolutional neural network itself through unsupervised learning.
In a specific embodiment, if the back end of the system finds no second face information matching the first face, the first face is judged to be an unknown face; the first face is edited into a photo, placed into an electronic form with a preset format, and sent to the display platform for display, so that background staff watching the display platform can attend to the corresponding unrecognized first face as soon as possible.
In this embodiment, the faces acquired from each video frame are identified by UUIDs, where UUID is the abbreviation of Universally Unique Identifier; this allows every element in a distributed system to carry unique identification information without a central control end having to assign it. Each face can therefore be given a UUID that does not conflict with any other. Specifically, the current video frame is cropped to faces through neural network recognition; if 5 face features are recognized in the video frame, the 5 faces may be assigned the identifiers 1, 2, 3, 4 and 5, which are mutually independent and not repeated, and when a face feature is no longer recognized, the information of other persons is displayed. If the face identified as 1 leaves the monitoring range and re-enters, it is assigned new identification information, for example the identifier 6.
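The identifier scheme described above can be sketched in Python (a minimal illustration; the registry class and its counter are assumptions, since the document only specifies that UUIDs are generated without a central control end and that a re-entering face receives a fresh identifier):

```python
import uuid


class FaceIdRegistry:
    """Assigns each newly detected face a sequential identifier plus a UUID.

    UUIDs are generated locally, so no central control end is needed to
    keep identifiers unique; a face that leaves the monitoring range and
    re-enters is treated as new and receives a fresh identifier.
    """

    def __init__(self):
        self._counter = 0   # monotonically increasing, so identifiers never repeat
        self.active = {}    # identifier -> UUID of faces currently in frame

    def register_new_face(self):
        self._counter += 1
        self.active[self._counter] = str(uuid.uuid4())
        return self._counter

    def face_left(self, face_id):
        # The face left the monitoring range; its identifier is retired.
        self.active.pop(face_id, None)


registry = FaceIdRegistry()
ids = [registry.register_new_face() for _ in range(5)]
print(ids)                            # five faces in the frame get 1..5
registry.face_left(1)
print(registry.register_new_face())   # re-entering face gets 6, not 1
```

Because the counter never decreases, identifier 1 is never reused after that face leaves, matching the "re-identified as 6" behaviour in the text.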
In an embodiment, after the step S4 of acquiring the information of the second face if a second face matching the first face feature information is found in the storage library through the back end of the system, the method further includes:
Step S41, storing all the matched second faces in a preset temporary library through the back end of the system, wherein the temporary library is an in-memory library from which stored face data is deleted at regular intervals;
Step S42, if face feature information of a new first face is acquired from the video frame to be tracked, matching it against the faces in the temporary library;
Step S43, if the matching against the faces in the temporary library fails, matching the face feature information of the new first face against the faces in the storage library;
Step S44, if the matching against the faces in the storage library succeeds, storing the information of the successfully matched new first face in the temporary library; if the matching against the faces in the storage library fails, sending the face to the display platform in the form of an image for display.
In the above steps, the temporary library is preset at the back end of the system, and the interval at which it deletes its stored face data can be changed as needed, for example set to half an hour or ten minutes: face data that has been in the temporary library longer than the specified time is automatically deleted. In other embodiments, the earliest received second face data may instead be replaced by the most recently received second face data.
The back end of the system stores all successfully matched second face data in the temporary library. The tracked video is divided frame by frame and faces are acquired from all video frames in turn, i.e. each video frame may yield a new first face. When the face feature information of a new first face is obtained from a video frame to be tracked, the new first face features are first matched against the faces in the temporary library; if that matching fails, the face features of the new first face are matched against the faces in the storage library; if that matching succeeds, the information of the successfully matched new first face is stored in the temporary library; and if the matching against the storage library also fails, the face is sent directly to the display platform in the form of an image for display.
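The temporary-library behaviour described above can be sketched as a small in-memory cache (the class name, keying scheme and expiry-on-access strategy are assumptions; only the timed deletion and the refresh of the latest matching time come from the document):

```python
import time


class TemporaryFaceLibrary:
    """In-memory library of recently matched faces with timed deletion.

    Entries older than `ttl_seconds` are purged when accessed, mirroring
    the temporary library above; the half-hour / ten-minute figures are
    the document's examples.
    """

    def __init__(self, ttl_seconds=600):
        self.ttl = ttl_seconds
        self._entries = {}   # face key -> (face info, latest matching time)

    def store(self, key, info, now=None):
        self._entries[key] = (info, time.time() if now is None else now)

    def match(self, key, now=None):
        """Return the stored face info, or None if absent or expired."""
        now = time.time() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None              # caller falls back to the storage library
        info, last = entry
        if now - last > self.ttl:
            del self._entries[key]   # timed deletion of stale face data
            return None
        # A successful match refreshes the latest matching time, extending
        # the face's lifetime in the temporary library.
        self._entries[key] = (info, now)
        return info


lib = TemporaryFaceLibrary(ttl_seconds=600)
lib.store("face-1", {"name": "example"}, now=0)
print(lib.match("face-1", now=100))   # hit: lifetime refreshed at t=100
print(lib.match("face-1", now=800))   # miss: 800 - 100 > 600, entry purged
```

A hit avoids a round trip to the full storage library, which is exactly the load reduction the embodiment claims.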
In an embodiment, after step S42 of matching against the faces in the temporary library when the face feature information of the new first face is acquired from the video frame to be tracked, the method includes:
Step S402, if the matching against a face in the temporary library succeeds, loading the information of the face matched to the new first face feature information onto the display platform in the form of an image for display, and updating the latest matching time of the matched face in the temporary library.
In the above steps, if the face feature information of the new first face is successfully matched against a face in the temporary library, the matched face information is loaded onto the display platform in the form of an image for display, and matching against the storage library is not needed again. Updating the latest matching time of the matched face in the temporary library extends the time that the successfully matched face remains in the temporary library, which reduces the matching of face feature information from subsequent video frames against the storage library at the back end of the system and saves time.
In an embodiment, the step S5 of loading, through the back end of the system, the information of the second face onto a display platform in the form of an image for display includes:
Step S51, judging whether the information of the second face carries a key mark, wherein the key mark includes a face marked for preferential display or a face given focused attention in advance;
Step S52, if so, loading the information of the second face carrying the key mark in the central area of the display platform in the form of an image.
In the above steps, the key marks include marks for preferential display, focused attention and the like attached to a certain face, and the system makes a corresponding salient mark for a face carrying a key mark. It is judged whether the information of the second face carries the key mark; if so, the information of the second face is loaded in the central area of the display platform in the form of an image for display. In one embodiment, the display interface for ordinary clients is blue and the display interface for key clients is red, and the information of the second face carrying the key mark is loaded in the central area of the display platform in the form of an image, so that background staff can learn the information of the face (client) carrying the key mark in time.
In an embodiment, the step S3 of matching, through the back end of the system, the first face feature information corresponding to each first face against faces preset in the storage library at the back end of the system includes:
Step S31, sorting all the first face feature information corresponding to any one of the first faces in a specified order to generate a first matrix, wherein the first matrix is a row matrix or a column matrix;
Step S32, duplicating each piece of first face feature information in the first matrix until the number of copies equals the number of faces in a preset second matrix, and forming a third matrix in the ordering of the second matrix, wherein the second matrix is generated by sorting the face feature information corresponding to each of a plurality of faces in the storage library in the same order as the first face feature information of the first matrix;
Step S33, subtracting the third matrix from the second matrix to obtain a fourth matrix;
Step S34, performing an absolute value operation on each value in the fourth matrix to obtain a fifth matrix;
Step S35, adding the absolute values corresponding to each face in the fifth matrix to obtain a matching total value for each face;
Step S36, comparing all the matching total values to obtain the minimum of the matching total values;
Step S37, judging whether the minimum of the matching total values is smaller than a face threshold;
Step S38, if so, obtaining the face corresponding to the minimum of the matching total values, and retrieving that face from the storage library.
In the above steps, all the first face feature information corresponding to a first face is sorted in a specified order to generate the first matrix: if the first face has M corresponding face feature values, a first matrix of one row of M columns (or one column of M rows) is formed. Suppose N faces are preset in the storage library, each corresponding to M pieces of feature information, giving N×M pieces of feature information in total. Taking the example of a face to be detected with 512 pieces of feature information, the feature information of the N stored faces is arranged into a matrix of N rows and 512 columns, namely the second matrix; the multiple pieces of face feature information thus generate a multi-order matrix. The method of generating the second matrix from the feature information of the N faces can follow the method of forming the first matrix. The second matrix can be generated in C++ using OpenCV, a cross-platform computer vision library released under the BSD license (open source), or in Python using the NumPy library, an open-source numerical computation extension of Python that can store and process large-scale matrices; other algorithms can also be used and are not described here.
In an embodiment, taking a first face with an array of 512 labels as an example, the first matrix is a matrix of 1 row and 512 columns and the second matrix is a matrix of N rows and 512 columns. The first matrix is expanded into a matrix of N rows and 512 columns, each row being the face feature information of the face to be detected arranged in the specified order; that is, the 1-row, 512-column first matrix is copied into an N-row, 512-column matrix to form the third matrix.
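The row replication of step S32 can be sketched with NumPy (the value of N is illustrative; the 512-column width follows the document's example):

```python
import numpy as np

N, M = 4, 512                                          # N repository faces, 512 features each
first_matrix = np.arange(M, dtype=float).reshape(1, M)  # 1 x 512 face to be detected

# Copy the single probe row N times so it has the same shape as the
# second matrix (N x 512); this is the third matrix used in subtraction.
third_matrix = np.tile(first_matrix, (N, 1))
print(third_matrix.shape)                               # (4, 512): every row equals the probe
```

With both matrices shaped N x 512, the subtraction in step S33 is a single element-wise operation.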
The third matrix is subtracted from the second matrix to obtain a new fourth matrix, whose entries are the feature-difference matching values between faces: each value in the fourth matrix is the result of computing the corresponding feature of the face to be detected against the corresponding feature of a face in the face library, and its magnitude represents the degree of match between the face to be detected and the face corresponding to one row of the second matrix.
In the above step, an absolute value operation is performed on each feature-difference matching value in the fourth matrix to obtain the fifth matrix, every value of which is greater than or equal to zero. The absolute values of the feature-difference matching values corresponding to each face in the fifth matrix are added to obtain the matching total value of that face, which is the total difference between that face in the face library and the face to be detected. In a specific embodiment, the matching total values comprise N values, computed between the feature information of the face to be detected and the feature information of the N faces in the face library. The minimum of the N matching total values is selected and it is judged whether it is smaller than a preset face threshold; if it is smaller than the threshold, the face corresponding to the minimum matching total value is obtained and retrieved from the storage library. In other embodiments, if none of the values is smaller than the threshold, the threshold may have been set too low, or the feature information of the face to be detected may be inaccurate because its image is unclear; a larger threshold can be input before matching again, or the image of the face to be detected can be processed to obtain a clearer one. If only a small number of values fall below the threshold, the screening range is narrowed and the candidates can be verified directly by human eyes. If the minimum is greater than the threshold, the match fails.
In an embodiment, the step S38 of obtaining the face corresponding to the minimum value in the total matching value and searching the face corresponding to the minimum value in the total matching value from the repository includes:
step S381, determining whether an absolute value of each face feature information of the fifth matrix corresponding to the minimum value in the total matching value is smaller than a corresponding face feature preset threshold;
and step S382, if yes, taking the face corresponding to the minimum value in the total matching value as the best-matched face.
In the above steps, each face feature difference matching value in the row of the fifth matrix corresponding to the minimum matching total value is obtained, and each such value is compared against the corresponding preset face feature threshold. If every value passes, the face in the preset second matrix corresponding to the minimum matching total value is taken as the best match; otherwise, the face corresponding to the minimum matching total value is excluded as a match. In a specific embodiment, if the face feature difference matching value for the right-eye size is 0.09 while the preset threshold for the right eye is 0.07, then even though the matching total value is below the face threshold, the corresponding face still cannot be taken as the matching face, because the right-eye feature difference does not meet the matching requirement.
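The per-feature confirmation check above — every absolute difference must clear its own preset threshold, not just the total — can be sketched as follows; the function name and the two-feature example values (including the 0.09 vs 0.07 right-eye case from the text) are illustrative.

```python
import numpy as np

def confirm_match(diff_row, feature_thresholds):
    """diff_row: the fifth-matrix row (absolute per-feature differences) for
    the candidate with the minimum matching total value.
    feature_thresholds: the preset threshold for each face feature.
    The candidate is accepted only if every single feature passes."""
    return bool(np.all(diff_row < feature_thresholds))
```

With the example from the text, a right-eye difference of 0.09 against a right-eye threshold of 0.07 rejects the candidate even if all other features pass.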
In an embodiment, after the step S31 of sorting all the first face feature information corresponding to any one of the first faces in the specified order to generate a first matrix, the method includes
Step S311, identifying the sex of the first face, wherein the sex includes male and female;
step S312, according to the gender of the first face, searching the faces with the same gender as the first face from the repository to make the second matrix.
In the above steps, after the first face is obtained, its gender (male or female) is identified, and a preliminary screening of the faces in the storage library is performed according to the gender of the face to be detected. For example, in a specific embodiment, when the first face is identified as male, the male faces in the storage library are extracted, their corresponding face feature information is obtained, and the second matrix is generated from them. This narrows the search range and saves time in the subsequent operations.
In this embodiment, the gender of a face image is identified as follows: gradient features are extracted from a large number of face images in advance and fed into an SVM (support vector machine) for training. The training pipeline is built by creating a console project and configuring an OpenCV environment; the gradient features obtained from each face image are stored in a floating-point container, so that when the image of a face to be detected is obtained, the floating-point value of each element and the corresponding face gender can be produced.
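At prediction time, a trained linear SVM over gradient features reduces to a single dot product against the learned weight vector. The sketch below shows only that decision step, assuming the weights `w` and bias `b` have already been produced by training as described above; the values used in the usage example are placeholders, not real trained parameters.

```python
import numpy as np

def predict_gender(gradient_features, w, b):
    """Linear SVM decision: the sign of w . x + b selects the gender label.

    gradient_features: HOG-style gradient feature vector of the face image
    w, b: weight vector and bias of an already-trained linear SVM (assumed)
    """
    score = float(np.dot(w, gradient_features) + b)
    return "male" if score >= 0 else "female"
```

The label-to-sign mapping here is arbitrary; it is fixed by how the training data was labeled.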
In other embodiments, all faces in the storage library can be made into two large matrices in advance according to the gender, after the gender of the face to be detected is identified, the matrix of the corresponding gender can be directly extracted for operation, and a new face can be added in the subsequent face library and can be correspondingly placed in the corresponding gender matrix according to the gender of the face.
In other embodiments, the first face may be identified in the age hierarchy, and the face corresponding to the age hierarchy of the first face may be found in the storage library to form a preset second matrix according to the age hierarchy of the first face, so as to reduce the operation range, save operation time, and make the speed of matching the face faster.
In summary, in the face tracking method in video provided by the embodiment of the application, the face feature information corresponding to the first face is obtained at the front end of the system, and that feature information is matched at the back end of the system to obtain the corresponding second face, whose information is displayed on the display platform in image form. The front end and back end of the system operate simultaneously: the front end analyzes and extracts the face feature information, which improves the matching speed when tracking faces, and since the extraction of face feature information no longer falls to the back end, the computational load of the back end is reduced.
Referring to fig. 3, in an embodiment of the present application, there is further provided a face tracking apparatus in video, including:
the first acquiring module 10 is configured to acquire a video frame to be tracked through a front end of a system, where the video frame includes at least one first face, and acquire face images of all the first faces;
the second obtaining module 20 is configured to input each of the first face images into a detection model preset at the front end of the system for detection, and obtain a plurality of corresponding first face feature information, where the detection model is obtained by using a known face image based on convolutional neural network training;
the first matching module 30 is configured to send all the first face feature information to a system back end, and match, through the system back end, the first face feature information corresponding to each first face with a face preset in a system back end storage library;
a third obtaining module 40, configured to obtain information of a second face matching the first face if the second face is matched with the first face from the storage through the back end of the system;
and the display module 50 is used for loading the information of the second face on a display platform in the form of an image through the back end of the system for display.
In this embodiment, a video stream acquired in real time is divided into individual video frames in time order to obtain video frames containing faces, where the picture format of the video frames can be set to JPG, PNG, or BMP. The first acquiring module 10 acquires the face images of all first faces in a video frame through the front end of the system, and the second acquiring module 20 inputs each first face image into a detection model for detection, where the detection model is trained on known face images using a convolutional neural network; different features of the different parts of each face image are obtained, so that the corresponding face feature information can be derived from them. The first matching module 30 sends all the first face feature information of each first face to the back end of the system, which matches the first face feature information corresponding to each first face against faces preset in a back-end storage library; a storage library holding a plurality of faces is preset at the back end of the system. If a second face matching the first face feature information is found in the storage library, the third obtaining module 40 obtains the information of that second face; in a specific embodiment, the information of the second face comprises name, gender, identity card, entry and exit records, business dealings, and the like. The display module 50 sends the acquired information of the second face to the display platform, where it is displayed in the form of an image.
The convolutional neural network is one of the core algorithms in the field of image recognition and performs stably when learning from large amounts of data. For general large-scale image classification problems, convolutional neural networks can be used to construct hierarchical classifiers, and they can also be used in fine-grained recognition to extract discriminative features of images for other classifiers to learn from. For the latter, different parts of the image can be input into the convolutional neural network separately for manually guided feature extraction, or the network can extract features by itself through unsupervised learning.
In a specific embodiment, if the system back end does not match any second face information to the first face, the first face is judged to be an unfamiliar face; the first face is edited into a photo, placed into an electronic form with a preset format, and sent to the display platform for display, so that a background staff member watching the display platform can contact the corresponding unfamiliar first face as soon as possible.
In this embodiment, the face in each acquired video frame is identified by a UUID, where UUID is the abbreviation of Universally Unique Identifier, so that every element in a distributed system can carry unique identification information without any identification having to be assigned by a central control end. Each person can therefore be given a UUID that does not conflict with any other. Specifically, the current video frame is cropped into faces by neural network recognition; if the features of 5 faces are recognized in the video frame, the 5 persons can be assigned the identifiers 1, 2, 3, 4, and 5, which are mutually independent and never repeated, and once a face's features are no longer recognized, the information of the remaining persons continues to be displayed. If the face identified as 1 leaves the monitoring range and re-enters, it is assigned new identification information, for example the identifier 6.
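The identifier scheme described above — fresh, non-conflicting IDs issued without a central coordinator, and retired IDs never reused — can be sketched with Python's standard `uuid` module. The class and method names are illustrative assumptions, not from the patent.

```python
import uuid

class TrackRegistry:
    """Issues a unique identifier for each newly detected face track."""

    def __init__(self):
        self.active = {}   # track_id -> face data for faces still in frame

    def new_track(self, face_data):
        # uuid4 is random and collision-free in practice, so no central
        # control end needs to hand out identifiers.
        track_id = str(uuid.uuid4())
        self.active[track_id] = face_data
        return track_id

    def drop_track(self, track_id):
        # When a face leaves the monitored range its identifier is retired;
        # if the same person re-enters, new_track issues a fresh identifier.
        self.active.pop(track_id, None)
```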
In an embodiment, the face tracking device in the video further comprises:
the first storage module is used for storing all the matched second faces in a preset temporary library through the rear end of the system, wherein the temporary library is a storage library for deleting the face data stored in the temporary library at regular time;
the second matching module is used for matching with the face in the temporary library if the face characteristic information of the new first face is acquired from the video frame to be tracked;
the third matching module is used for matching the face characteristic information of the new first face with the faces in the storage library if the matching with the faces in the temporary library fails;
the second storage module is used for storing the information of the new first face successfully matched in the temporary library if the matching with the face in the storage library is successful; if the matching with the face in the storage library fails, the face is sent to the display platform in the form of an image to be displayed.
In this embodiment, the temporary library is a temporary repository preset at the back end of the system. Its code can be rewritten as needed so that the face data stored in it is deleted at regular intervals, for example every half hour or every ten minutes; face data that has remained in the temporary library beyond the specified time is automatically deleted. In other embodiments, the earliest received second face data may instead be replaced by the most recently received second face data.
The first storage module stores all successfully matched second face data in the temporary library through the system back end. When face feature information of a new first face is obtained in a video frame to be tracked, the second matching module first matches the new first face features against the faces in the temporary library; the tracked video is divided frame by frame and faces are collected from all video frames in turn, i.e., new first faces are collected from every frame of video. If matching the new first face against the temporary library fails, the face features of the new first face are matched against the faces in the storage library; if matching against the storage library succeeds, the second storage module stores the information of the successfully matched new first face in the temporary library. If matching against the storage library also fails, the face is sent directly to the display platform for display in the form of an image.
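The timed temporary library above behaves like a small TTL cache: entries expire after a fixed interval, and a successful match refreshes the entry's latest matching time so it survives longer. A minimal sketch follows; the class name, the half-hour default, and the injectable clock are illustrative assumptions.

```python
import time

class TemporaryLibrary:
    """Stores matched faces and deletes them after a fixed time-to-live."""

    def __init__(self, ttl_seconds=1800, now=time.time):
        self.ttl = ttl_seconds      # e.g. half an hour, as in the text
        self.now = now              # clock injected to keep the sketch testable
        self.entries = {}           # face_id -> latest matching time

    def store(self, face_id):
        self.entries[face_id] = self.now()

    def purge_expired(self):
        # Face data older than the specified time is automatically deleted.
        cutoff = self.now() - self.ttl
        self.entries = {k: t for k, t in self.entries.items() if t >= cutoff}

    def match(self, face_id):
        self.purge_expired()
        if face_id in self.entries:
            # Refresh the latest matching time to extend the entry's life,
            # avoiding a second trip to the main storage library.
            self.entries[face_id] = self.now()
            return True
        return False
```

Only on a `match` miss would the caller fall back to the full storage library, mirroring the second and third matching modules.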
In an embodiment, the face tracking device in the video comprises:
and the loading module is used for loading the information of the face matched with the new first face characteristic information on the display platform for display in the form of the information of the image if the face matched with the face in the temporary library is successful, and updating the latest matching time of the face matched in the temporary library.
In this embodiment, if the face feature information of the next new first face is successfully matched against a face in the temporary library, the loading module loads the information of the matched face onto the display platform for display in the form of an image, without entering the storage library again for face matching, and updates the latest matching time of the matched face in the temporary library. This extends the lifetime of the successfully matched face in the temporary library, reduces the matching of face feature information from subsequent video frames against the storage library at the back end of the system, and saves time.
In one embodiment, the display module 50 includes:
a first judging unit, configured to judge whether the information of the second face is a key mark, where the key mark includes a face marked as preferential display or a face focused on in advance;
and the first execution unit is used for loading the information of the second face provided with the key mark in the center area of the display platform in the form of an image if the information is the key mark.
In this embodiment, the above-mentioned key marks include marks for preferential display, focused attention, and the like on a certain face; a face with a key mark is given a corresponding salient marking. The first judging unit determines in advance whether the information of the second face carries a key mark, and if so, the first execution unit loads the information of the second face in the form of an image in the central area of the display platform for display. In one embodiment, the display interface for ordinary clients is blue while that for key clients is red, and the information of a second face with a key mark is loaded in the central area of the display platform in image form, so that background staff can learn the information of the marked face (client) in time.
In one embodiment, the first matching module 30 includes:
the first generation unit is used for sequencing all the first face characteristic information corresponding to any one of the first faces according to a specified sequence to generate a first matrix, wherein the first matrix is a horizontal matrix or a vertical matrix;
the second generation unit is used for copying the number of each piece of first face characteristic information in the first matrix to be the same as the number of faces of a preset second matrix, and forming a third matrix according to the ordering sequence of the second matrix, wherein the second matrix is generated by ordering face characteristic information corresponding to each face in a plurality of faces in the storage library according to the ordering sequence of the first face characteristic information of the first matrix;
the first acquisition unit is used for subtracting the third matrix from the second matrix to obtain a fourth matrix;
the second acquisition unit is used for carrying out absolute value operation on each numerical value in the fourth matrix to obtain a fifth matrix;
the third acquisition unit is used for adding absolute values of all the values corresponding to each face of the fifth matrix to obtain a matching total value of each face;
A fourth obtaining unit, configured to compare all the matching total values, and obtain a minimum value in the matching total values;
the second judging unit is used for judging whether the minimum value in the matched total value is smaller than a face threshold value or not;
and the second execution unit is used for obtaining the face corresponding to the minimum value in the total matching value if yes, and searching the face corresponding to the minimum value in the total matching value from the storage library.
In this embodiment, the first generating unit sorts all the first face feature information corresponding to the first face according to a specified order to generate a first matrix. If there are M items of first face feature information, each corresponding to a face feature value, they form a 1-row, M-column matrix or an M-row, 1-column matrix. N faces are preset and stored in the storage library, and if each face corresponds to M items of feature information, there are N x M items of feature information in total. When a face to be detected comprises 512 items of feature information and there are N faces in the library, the feature information of the N faces generates a matrix of N rows and 512 columns, namely the second matrix; multiple items of face feature information can likewise generate higher-order matrices. The method for generating the second matrix from the feature information of the N faces can follow the method for forming the first matrix. The second matrix can be generated in C++ through OpenCV, where OpenCV is a cross-platform computer vision library released under the BSD (open-source) license; it can also be generated in Python (a computer programming language) through the NumPy library, where NumPy is an open-source numerical computing extension of Python that can store and process large matrices; other algorithms can also be used and are not described here.
In an embodiment, taking a first face with 512 feature values as an example, the first matrix is a matrix with 1 row and 512 columns and the second matrix is a matrix with N rows and 512 columns. The first matrix is expanded into a matrix with N rows and 512 columns, in which every row is the face feature information of the face to be detected arranged in the specified order; that is, the 1-row, 512-column first matrix is copied into an N-row, 512-column matrix to form the third matrix.
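The 1x512-to-Nx512 copy described above is exactly NumPy's `tile` operation. The sketch below uses N = 3 and arbitrary feature values purely for illustration, and notes that broadcasting would make the explicit copy unnecessary in practice.

```python
import numpy as np

# First matrix: 1 row, 512 columns of face feature values (illustrative data).
first_matrix = np.arange(512, dtype=float).reshape(1, 512)

N = 3  # number of library faces in the second matrix (assumed for the sketch)

# Third matrix: the single row copied N times, giving N rows and 512 columns.
third_matrix = np.tile(first_matrix, (N, 1))

# NumPy broadcasting makes the copy implicit in real code:
# `second_matrix - first_matrix` subtracts the row from every row directly.
```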
The first acquisition unit subtracts the third matrix from the second matrix to obtain a new fourth matrix, which holds the face feature difference matching values between faces. Each value in the fourth matrix is the result of comparing an item of feature information of the face to be detected with the corresponding item of feature information of a face in the face library, and the magnitude of the value reflects how well the face to be detected matches the corresponding face in the second matrix.
In this embodiment, the second obtaining unit performs an absolute value operation on each face feature difference matching value in the fourth matrix to obtain a fifth matrix in which every value is greater than or equal to zero, and the third obtaining unit adds the absolute face feature difference matching values corresponding to the elements of each face in the fifth matrix to obtain a matching total value for each face, which is the total difference between that face in the face library and the face to be detected. In a specific embodiment, the matching total values comprise N values, computed from the feature information of the face to be detected and the feature information of the N faces in the face library. The fourth obtaining unit selects the minimum of these N values, and the second judging unit judges whether this minimum is greater than a preset face threshold; if the minimum is less than the threshold, the second execution unit obtains the face corresponding to that minimum matching total value and looks it up in the storage library. In other embodiments, if all the values are less than the threshold, the threshold may be set too low, or the feature information of the face to be detected may be inaccurate because its image is unclear; a larger threshold can be entered and the matching repeated, or the image of the face to be detected can be processed to obtain a clearer image. If only a few values fall below the threshold, the screening range is narrowed and the candidates can be checked directly by a human reviewer. If the minimum is greater than the threshold, the match fails.
In one embodiment, the second execution unit includes:
the judging subunit is used for judging whether the absolute value of each face characteristic information of the fifth matrix corresponding to the minimum value in the total matching value is smaller than a corresponding face characteristic preset threshold value or not;
and the execution subunit is used for taking the face corresponding to the minimum value in the total matching value as the most matched face if the total matching value is the same.
In this embodiment, each face feature difference matching value in the row of the fifth matrix corresponding to the minimum matching total value is obtained, and each such value is compared against the corresponding preset face feature threshold. If every value passes, the face in the preset second matrix corresponding to the minimum matching total value is taken as the best match; otherwise, the face corresponding to the minimum matching total value is excluded as a match. In a specific embodiment, if the face feature difference matching value for the right-eye size is 0.09 while the preset threshold for the right eye is 0.07, then even though the matching total value is below the face threshold, the corresponding face still cannot be taken as the matching face, because the right-eye feature difference does not meet the matching requirement.
In one embodiment, the first matching module 30 includes:
an identification unit configured to identify a sex of the first face, wherein the sex includes a male and a female;
and the searching unit is used for searching the face with the same sex as the first face from the storage library to make the second matrix according to the sex of the first face.
In this embodiment, after the first face is obtained, the identification unit identifies its gender (male or female), and a preliminary screening of the faces in the storage library is performed according to the gender of the face to be detected. For example, in a specific embodiment, when the identification unit identifies the first face as male, the search unit extracts the male faces in the storage library, obtains their corresponding face feature information, and generates the second matrix from them. This narrows the search range and saves time in the subsequent operations.
In this embodiment, the gender of a face image is identified as follows: gradient features are extracted from a large number of face images in advance and fed into an SVM (support vector machine) for training. The training pipeline is built by creating a console project and configuring an OpenCV environment; the gradient features obtained from each face image are stored in a floating-point container, so that when the image of a face to be detected is obtained, the floating-point value of each element and the corresponding face gender can be produced.
In other embodiments, all faces in the storage library can be made into two large matrices in advance according to the gender, after the gender of the face to be detected is identified, the matrix of the corresponding gender can be directly extracted for operation, and a new face can be added in the subsequent face library and can be correspondingly placed in the corresponding gender matrix according to the gender of the face.
In other embodiments, the first face may be identified in the age hierarchy, and the face corresponding to the age hierarchy of the first face may be found in the storage library to form a preset second matrix according to the age hierarchy of the first face, so as to reduce the operation range, save operation time, and make the speed of matching the face faster.
In summary, in the face tracking device in video provided by the embodiment of the present application, the front end of the system obtains the face feature information corresponding to the first face, and the back end matches that information to obtain the corresponding second face, whose information is displayed on the display platform in the form of an image. The front end and back end of the system operate simultaneously: the front end analyzes and extracts the face feature information, which improves the matching speed when tracking faces, and since the extraction of face feature information no longer falls to the back end, the computational load of the back end is reduced.
Referring to fig. 3, in an embodiment of the present application there is further provided a computer device, which may be a server, whose internal structure may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus, where the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and the computer programs in the non-volatile storage medium. The database of the computer device is used to store data such as the second face. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by a processor, implements a face tracking method in video.
The processor executes the steps of the face tracking method in the video:
acquiring a video frame to be tracked through the front end of a system, wherein the video frame comprises at least one first face, and acquiring face images of all the first faces;
Inputting each first face image into a detection model preset at the front end of the system for detection, and obtaining a plurality of corresponding first face characteristic information, wherein the detection model is obtained by utilizing a known face image and training based on a convolutional neural network;
all the first face feature information is sent to the back end of the system, and the back end of the system is used for matching the first face feature information corresponding to each first face with faces preset in a storage library of the back end of the system;
if the second face matched with the first face is matched from the storage library through the rear end of the system, acquiring information of the second face;
and loading the information of the second face on a display platform in the form of an image through the rear end of the system for displaying.
In an embodiment, if the processor matches a second face matching the first face from the storage through the system back end, the step of acquiring information of the second face further includes:
storing all the matched second faces in a preset temporary library through the rear end of the system, wherein the temporary library is a memory library for deleting face data stored in the temporary library at regular time;
If the face characteristic information of the new first face is acquired from the video frame to be tracked, matching the face characteristic information with the face in the temporary library;
if the matching of the face in the temporary library fails, matching the face characteristic information of the new first face with the face in the storage library;
if the matching with the face in the storage library is successful, storing the information of the new first face which is successfully matched in the temporary library; if the matching with the face in the storage library fails, the face is sent to the display platform in the form of an image to be displayed.
In an embodiment, after the step of matching the face in the temporary library when the processor obtains the face feature information of the new first face in the video frame to be tracked, the method includes:
and if the matching with the face in the temporary library is successful, loading the information of the face matched with the new first face characteristic information on the display platform for display by using the information of the image, and updating the latest matching time of the face matched in the temporary library.
In one embodiment, the step of loading, by the processor, the information of the second face on a display platform in the form of an image through the system back end includes:
Judging whether the information of the second face is provided with an important mark or not, wherein the important mark comprises a face marked as preferential display or a face focused on in advance;
and if so, loading the information of the second face provided with the key mark in the center area of the display platform in the form of an image.
In an embodiment, the step in which the processor matches the first face feature information corresponding to each first face against the faces preset in the system back-end storage library includes:
sorting all the first face feature information corresponding to any one first face in a specified order to generate a first matrix, wherein the first matrix is a row matrix or a column matrix;
replicating each piece of first face feature information in the first matrix so that the number of copies equals the number of faces in a preset second matrix, and arranging the copies in the ordering of the second matrix to form a third matrix, wherein the second matrix is generated by sorting the face feature information of each of a plurality of faces in the storage library in the same order as the first face feature information of the first matrix;
subtracting the third matrix from the second matrix to obtain a fourth matrix;
taking the absolute value of each value in the fourth matrix to obtain a fifth matrix;
summing, for each face, all the absolute values in the fifth matrix corresponding to that face to obtain a total matching value for that face;
comparing all the total matching values to obtain the minimum total matching value;
judging whether the minimum total matching value is smaller than a face threshold;
if so, obtaining the face corresponding to the minimum total matching value and retrieving that face from the storage library.
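The matrix steps above amount to computing an L1 distance between the tracked face's feature vector and every stored face's feature vector, then thresholding the minimum. A minimal NumPy sketch follows; the array layouts and the threshold value are illustrative assumptions:

```python
import numpy as np

def match_face(first_features, library_features, face_threshold=1.0):
    """L1 matching as described in the embodiment: tile the first matrix
    across the library (second matrix) to form the third matrix, subtract,
    take absolute values (fifth matrix), and sum per face to get each
    face's total matching value."""
    first = np.asarray(first_features, dtype=float)       # first matrix, shape (d,)
    library = np.asarray(library_features, dtype=float)   # second matrix, shape (n, d)
    third = np.tile(first, (library.shape[0], 1))         # probe copied once per stored face
    fourth = library - third                              # per-feature differences
    fifth = np.abs(fourth)                                # absolute differences
    totals = fifth.sum(axis=1)                            # total matching value per face
    best = int(np.argmin(totals))                         # face with the minimum total
    if totals[best] < face_threshold:                     # compare against the face threshold
        return best, float(totals[best])
    return None, float(totals[best])                      # no stored face is close enough
```

A smaller total matching value means a closer match, so the minimum total is the candidate and the face threshold rejects spurious matches when no stored face is actually close.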
In an embodiment, the step in which the processor obtains the face corresponding to the minimum total matching value and retrieves that face from the storage library includes:
judging whether each absolute per-feature value in the fifth matrix, for the face corresponding to the minimum total matching value, is smaller than the preset threshold of the corresponding face feature;
if so, taking the face corresponding to the minimum total matching value as the best-matched face.
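This per-feature check refines the total-value match: even the face with the smallest total is rejected if any single feature differs by more than its own preset threshold. A sketch under that reading, with illustrative threshold values:

```python
import numpy as np

def confirm_best_match(probe, candidate, per_feature_thresholds):
    """Accept the minimum-total face only if every absolute per-feature
    difference (one row of the fifth matrix) stays below the preset
    threshold of the corresponding face feature."""
    diffs = np.abs(np.asarray(probe, dtype=float) - np.asarray(candidate, dtype=float))
    return bool(np.all(diffs < np.asarray(per_feature_thresholds, dtype=float)))
```

This guards against a face that is close on average but wildly off on one feature (say, eye spacing) being declared the best match.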
In an embodiment, after the step in which the processor sorts all the first face feature information corresponding to any one first face in a specified order to generate the first matrix, the method includes:
identifying the gender of the first face, wherein the gender is male or female;
searching the storage library, according to the gender of the first face, for faces of the same gender as the first face to form the second matrix.
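The gender pre-filter simply restricts which storage library rows enter the second matrix, so the later matrix operations run over roughly half the library. A sketch; the record layout (`gender`/`features` keys) is an illustrative assumption:

```python
def build_second_matrix(library, probe_gender):
    """Keep only storage-library faces whose gender matches the first
    face; the resulting rows form the second matrix used for matching."""
    return [record["features"] for record in library
            if record["gender"] == probe_gender]
```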
It will be appreciated by those skilled in the art that the architecture shown in fig. 3 is merely a block diagram of the portion of the architecture related to the present solution and does not limit the computer devices to which the present solution is applicable.
An embodiment of the present application further provides a computer storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements a face tracking method in a video, specifically:
acquiring a video frame to be tracked through the system front end, wherein the video frame comprises at least one first face, and acquiring face images of all the first faces;
inputting each first face image into a detection model preset at the system front end for detection to obtain a plurality of pieces of corresponding first face feature information, wherein the detection model is trained on known face images based on a convolutional neural network;
sending all the first face feature information to the system back end, and matching, through the system back end, the first face feature information corresponding to each first face against the faces preset in a storage library at the system back end;
if a second face matching the first face is found in the storage library through the system back end, acquiring information of the second face;
loading the information of the second face on a display platform in the form of an image through the system back end for display.
In an embodiment, after the step in which the processor acquires the information of the second face when a second face matching the first face is found in the storage library through the system back end, the method further includes:
storing all the matched second faces in a preset temporary library through the system back end, wherein the temporary library is an in-memory library from which the stored face data are deleted at regular intervals;
if face feature information of a new first face is acquired from the video frame to be tracked, matching it against the faces in the temporary library;
if matching against the temporary library fails, matching the face feature information of the new first face against the faces in the storage library;
if matching against a face in the storage library succeeds, storing the information of the successfully matched new first face in the temporary library; if matching against the storage library fails, sending the face to the display platform for display in the form of an image.
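The temporary library behaves like an in-memory cache whose entries expire unless refreshed by a match, which keeps recently seen faces off the slower storage-library path. A minimal sketch, assuming keyed lookup stands in for the feature matching and assuming an illustrative expiry interval:

```python
import time

class TemporaryLibrary:
    """In-memory store mirroring the embodiment's 'memory library from
    which the stored face data are deleted at regular intervals'."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl = ttl_seconds
        self._faces = {}  # face_id -> (info, latest_match_time)

    def store(self, face_id, info, now=None):
        """Store a newly matched face together with its match time."""
        self._faces[face_id] = (info, now if now is not None else time.time())

    def match(self, face_id, now=None):
        """Return the stored info and refresh the latest matching time,
        or None if absent (the caller then falls back to the storage library)."""
        now = now if now is not None else time.time()
        self.purge(now)
        entry = self._faces.get(face_id)
        if entry is None:
            return None
        info, _ = entry
        self._faces[face_id] = (info, now)  # update latest matching time
        return info

    def purge(self, now=None):
        """Delete entries whose latest match is older than the TTL."""
        now = now if now is not None else time.time()
        expired = [fid for fid, (_, t) in self._faces.items() if now - t > self.ttl]
        for fid in expired:
            del self._faces[fid]
```

Refreshing the timestamp on every successful match is what makes a face that keeps reappearing in the video stay in the temporary library indefinitely, while faces that leave the scene age out.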
In an embodiment, after the step in which the processor matches the face feature information of a new first face, acquired from the video frame to be tracked, against the faces in the temporary library, the method includes:
if matching against a face in the temporary library succeeds, loading the information of the face matched to the new first face feature information on the display platform for display in the form of an image, and updating the latest matching time of the matched face in the temporary library.
In an embodiment, the step in which the processor loads the information of the second face on a display platform in the form of an image through the system back end includes:
judging whether the information of the second face carries a key mark, wherein the key mark denotes a face marked for priority display or a face designated for attention in advance;
if so, loading the information of the second face carrying the key mark in the center area of the display platform in the form of an image.
In an embodiment, the step in which the processor matches the first face feature information corresponding to each first face against the faces preset in the system back-end storage library includes:
sorting all the first face feature information corresponding to any one first face in a specified order to generate a first matrix, wherein the first matrix is a row matrix or a column matrix;
replicating each piece of first face feature information in the first matrix so that the number of copies equals the number of faces in a preset second matrix, and arranging the copies in the ordering of the second matrix to form a third matrix, wherein the second matrix is generated by sorting the face feature information of each of a plurality of faces in the storage library in the same order as the first face feature information of the first matrix;
subtracting the third matrix from the second matrix to obtain a fourth matrix;
taking the absolute value of each value in the fourth matrix to obtain a fifth matrix;
summing, for each face, all the absolute values in the fifth matrix corresponding to that face to obtain a total matching value for that face;
comparing all the total matching values to obtain the minimum total matching value;
judging whether the minimum total matching value is smaller than a face threshold;
if so, obtaining the face corresponding to the minimum total matching value and retrieving that face from the storage library.
In an embodiment, the step in which the processor obtains the face corresponding to the minimum total matching value and retrieves that face from the storage library includes:
judging whether each absolute per-feature value in the fifth matrix, for the face corresponding to the minimum total matching value, is smaller than the preset threshold of the corresponding face feature;
if so, taking the face corresponding to the minimum total matching value as the best-matched face.
In an embodiment, after the step in which the processor sorts all the first face feature information corresponding to any one first face in a specified order to generate the first matrix, the method includes:
identifying the gender of the first face, wherein the gender is male or female;
searching the storage library, according to the gender of the first face, for faces of the same gender as the first face to form the second matrix.
In summary, in the face tracking method, apparatus, computer device, and storage medium provided by the embodiments of the present application, the face feature information corresponding to the first face is obtained through the system front end, the face feature information is matched through the system back end to obtain the corresponding second face, and the information of the second face is displayed on the display platform in the form of an image. Because the system front end and back end operate simultaneously and the front end performs the analysis that yields the face feature information, the matching speed for the tracked face is improved, the extraction of face feature information no longer falls to the system back end, and the computing load on the system back end is reduced.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by instructing the relevant hardware through a computer program, which may be stored on a non-volatile computer-readable storage medium and which, when executed, may include the flows of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided by the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, apparatus, article, or method that comprises the element.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the scope of the application, and all equivalent structures or equivalent processes using the descriptions and drawings of the present application or direct or indirect application in other related technical fields are included in the scope of the present application.

Claims (9)

1. A face tracking method in a video, characterized by comprising the following steps:
acquiring a video frame to be tracked through the system front end, wherein the video frame comprises at least one first face, and acquiring face images of all the first faces;
inputting each first face image into a detection model preset at the system front end for detection to obtain a plurality of pieces of corresponding first face feature information, wherein the detection model is trained on known face images based on a convolutional neural network;
sending all the first face feature information to the system back end, and matching, through the system back end, the first face feature information corresponding to each first face against the faces preset in a storage library at the system back end;
if a second face matching the first face is found in the storage library through the system back end, acquiring information of the second face;
loading the information of the second face on a display platform in the form of an image through the system back end for display;
wherein the step of matching, through the system back end, the first face feature information corresponding to each first face against the faces preset in the storage library at the system back end comprises:
sorting all the first face feature information corresponding to any one first face in a specified order to generate a first matrix, wherein the first matrix is a row matrix or a column matrix;
replicating each piece of first face feature information in the first matrix so that the number of copies equals the number of faces in a preset second matrix, and arranging the copies in the ordering of the second matrix to form a third matrix, wherein the second matrix is generated by sorting the face feature information of each of a plurality of faces in the storage library in the same order as the first face feature information of the first matrix;
subtracting the third matrix from the second matrix to obtain a fourth matrix, wherein the fourth matrix contains the face feature difference matching values between the faces;
taking the absolute value of each value in the fourth matrix to obtain a fifth matrix;
summing, for each face, all the absolute values in the fifth matrix corresponding to that face to obtain a total matching value for that face;
comparing all the total matching values to obtain the minimum total matching value;
judging whether the minimum total matching value is smaller than a face threshold;
if so, obtaining the face corresponding to the minimum total matching value and retrieving that face from the storage library.
2. The face tracking method in a video according to claim 1, wherein after the step of acquiring the information of the second face if a second face matching the first face is found in the storage library through the system back end, the method further comprises:
storing all the matched second faces in a preset temporary library through the system back end, wherein the temporary library is an in-memory library from which the stored face data are deleted at regular intervals;
if face feature information of a new first face is acquired from the video frame to be tracked, matching it against the faces in the temporary library;
if matching against the temporary library fails, matching the face feature information of the new first face against the faces in the storage library;
if matching against a face in the storage library succeeds, storing the information of the successfully matched new first face in the temporary library; if matching against the storage library fails, sending the face to the display platform for display in the form of an image.
3. The face tracking method in a video according to claim 2, wherein after the step of matching the face feature information of the new first face, acquired from the video frame to be tracked, against the faces in the temporary library, the method comprises:
if matching against a face in the temporary library succeeds, loading the information of the face matched to the new first face feature information on the display platform for display in the form of an image, and updating the latest matching time of the matched face in the temporary library.
4. The face tracking method in a video according to claim 1, wherein the step of loading the information of the second face on a display platform in the form of an image through the system back end for display comprises:
judging whether the information of the second face carries a key mark, wherein the key mark denotes a face marked for priority display or a face designated for attention in advance;
if so, loading the information of the second face carrying the key mark in the center area of the display platform in the form of an image.
5. The face tracking method in a video according to claim 1, wherein the step of obtaining the face corresponding to the minimum total matching value and retrieving that face from the storage library comprises:
judging whether each absolute per-feature value in the fifth matrix, for the face corresponding to the minimum total matching value, is smaller than the preset threshold of the corresponding face feature;
if so, taking the face corresponding to the minimum total matching value as the best-matched face.
6. The face tracking method in a video according to claim 1, wherein after the step of sorting all the first face feature information corresponding to any one first face in a specified order to generate the first matrix, the method comprises:
identifying the gender of the first face, wherein the gender is male or female;
searching the storage library, according to the gender of the first face, for faces of the same gender as the first face to form the second matrix.
7. A face tracking apparatus in a video, comprising:
a first acquisition module, configured to acquire a video frame to be tracked through the system front end, wherein the video frame comprises at least one first face, and to acquire face images of all the first faces;
a second acquisition module, configured to input each first face image into a detection model preset at the system front end for detection and to obtain a plurality of pieces of corresponding first face feature information, wherein the detection model is trained on known face images based on a convolutional neural network;
a first matching module, configured to send all the first face feature information to the system back end and to match, through the system back end, the first face feature information corresponding to each first face against the faces preset in a storage library at the system back end;
a third acquisition module, configured to acquire information of a second face matching the first face if the second face is found in the storage library through the system back end;
a display module, configured to load the information of the second face on a display platform in the form of an image through the system back end for display;
wherein the first matching module comprises:
a first generation unit, configured to sort all the first face feature information corresponding to any one first face in a specified order to generate a first matrix, wherein the first matrix is a row matrix or a column matrix;
a second generation unit, configured to replicate each piece of first face feature information in the first matrix so that the number of copies equals the number of faces in a preset second matrix, and to arrange the copies in the ordering of the second matrix to form a third matrix, wherein the second matrix is generated by sorting the face feature information of each of a plurality of faces in the storage library in the same order as the first face feature information of the first matrix;
a first acquisition unit, configured to subtract the third matrix from the second matrix to obtain a fourth matrix, wherein the fourth matrix contains the face feature difference matching values between the faces;
a second acquisition unit, configured to take the absolute value of each value in the fourth matrix to obtain a fifth matrix;
a third acquisition unit, configured to sum, for each face, all the absolute values in the fifth matrix corresponding to that face to obtain a total matching value for that face;
a fourth acquisition unit, configured to compare all the total matching values and obtain the minimum total matching value;
a second judging unit, configured to judge whether the minimum total matching value is smaller than a face threshold;
a second execution unit, configured to, if so, obtain the face corresponding to the minimum total matching value and retrieve that face from the storage library.
8. A computer device comprising a memory and a processor, the memory having a computer program stored therein, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 6.
9. A computer storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
CN201910537929.0A 2019-06-20 2019-06-20 Face tracking method and device in video, computer equipment and storage medium Active CN110399795B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910537929.0A CN110399795B (en) 2019-06-20 2019-06-20 Face tracking method and device in video, computer equipment and storage medium
PCT/CN2019/103541 WO2020252928A1 (en) 2019-06-20 2019-08-30 Method and apparatus for tracking human face in video, and computer device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910537929.0A CN110399795B (en) 2019-06-20 2019-06-20 Face tracking method and device in video, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110399795A CN110399795A (en) 2019-11-01
CN110399795B true CN110399795B (en) 2023-10-20

Family

ID=68323241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910537929.0A Active CN110399795B (en) 2019-06-20 2019-06-20 Face tracking method and device in video, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN110399795B (en)
WO (1) WO2020252928A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875515A (en) * 2017-12-11 2018-11-23 北京旷视科技有限公司 Face identification method, device, system, storage medium and capture machine

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
BR112013018142B1 (en) * 2011-01-18 2021-06-15 Hsni, Llc SYSTEM AND METHOD FOR DYNAMICALLY RECOGNIZING INDIVIDUAL ITEMS IN IMAGES CONTAINED IN VIDEO SOURCE CONTENT
CN205681558U (en) * 2016-06-03 2016-11-09 广东万峯信息科技有限公司 A kind of supervising device based on recognition of face
US10509952B2 (en) * 2016-08-30 2019-12-17 Irida Labs S.A. Fast, embedded, hybrid video face recognition system
CN109492560A (en) * 2018-10-26 2019-03-19 深圳力维智联技术有限公司 Facial image Feature fusion, device and storage medium based on time scale
CN109598250B (en) * 2018-12-10 2021-06-25 北京旷视科技有限公司 Feature extraction method, device, electronic equipment and computer readable medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875515A (en) * 2017-12-11 2018-11-23 北京旷视科技有限公司 Face identification method, device, system, storage medium and capture machine

Also Published As

Publication number Publication date
WO2020252928A1 (en) 2020-12-24
CN110399795A (en) 2019-11-01

Similar Documents

Publication Publication Date Title
CN110569721B (en) Recognition model training method, image recognition method, device, equipment and medium
US11450027B2 (en) Method and electronic device for processing videos
US20210382933A1 (en) Method and device for archive application, and storage medium
CN111160275B (en) Pedestrian re-recognition model training method, device, computer equipment and storage medium
CN111310706B (en) Commodity price tag identification method and device, electronic equipment and storage medium
CN110660078B (en) Object tracking method, device, computer equipment and storage medium
CN111625687B (en) Method and system for quickly searching people in media asset video library through human faces
US20160055627A1 (en) Information processing device, image processing method and medium
US11403875B2 (en) Processing method of learning face recognition by artificial intelligence module
TW201448585A (en) Real time object scanning using a mobile phone and cloud-based visual search engine
CN115240262A (en) Face recognition method, system, computer device and medium based on end-side cooperation
CN112084812A (en) Image processing method, image processing device, computer equipment and storage medium
CN116110100A (en) Face recognition method, device, computer equipment and storage medium
CN114698399A (en) Face recognition method and device and readable storage medium
CN111611944A (en) Identity recognition method and device, electronic equipment and storage medium
CN114565955A (en) Face attribute recognition model training and community personnel monitoring method, device and equipment
CN110399795B (en) Face tracking method and device in video, computer equipment and storage medium
CN111738059B (en) Face recognition method oriented to non-inductive scene
CN112926616A (en) Image matching method and device, electronic equipment and computer-readable storage medium
CN111274965A (en) Face recognition method and device, computer equipment and storage medium
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN112333182B (en) File processing method, device, server and storage medium
CN114140822A (en) Pedestrian re-identification method and device
CN107992853B (en) Human eye detection method and device, computer equipment and storage medium
CN113657180A (en) Vehicle identification method, server and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant