WO2021159672A1 - Face image recognition method and apparatus - Google Patents


Info

Publication number
WO2021159672A1
Authority
WO
WIPO (PCT)
Prior art keywords: face, plug-in, video, image, video playback
Application number
PCT/CN2020/105861
Other languages: French (fr), Chinese (zh)
Inventor: 吴贞海 (Wu Zhenhai)
Original Assignee: 深圳壹账通智能科技有限公司 (Shenzhen OneConnect Smart Technology Co., Ltd.)
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021159672A1 publication Critical patent/WO2021159672A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44: Arrangements for executing specific programs
    • G06F 9/445: Program loading or initiating
    • G06F 9/44521: Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
    • G06F 9/44526: Plug-ins; Add-ons
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions

Definitions

  • This application belongs to the field of image processing technology, and in particular relates to a method and device for recognizing a face image.
  • the OCR algorithm is used to recognize the text in a picture;
  • the QR code recognition algorithm is used to parse a two-dimensional code image and extract the information it contains.
  • face recognition technology can automatically determine a user's identity, and its range of applications continues to widen; performing face recognition efficiently and accurately has therefore become a problem that needs to be solved today.
  • the inventor realizes that existing face recognition technology is mainly used for static image recognition, and face recognition in video is difficult to achieve. In particular, for most video playback applications, the user must manually capture the video image frames in which the target user appears and hand them to corresponding software for recognition, which increases the difficulty of face collection and reduces operating efficiency.
  • the embodiments of the present application provide a face image recognition method and apparatus to solve the problem that existing face image recognition technology can only perform face recognition on static images, requiring the user to manually capture the video image frames in which the target user appears and hand them to corresponding software for recognition, which increases the difficulty of face collection and reduces operating efficiency.
  • the first aspect of the embodiments of the present application provides a face image recognition method, including:
  • the video play instruction carries the plug-in identifier of the video play plug-in that needs to be called when the video file is played;
  • if the plug-in identifier matches the identifier of the face recognition plug-in, each video image frame of the video file is extracted through the video playback application after the plug-in is loaded;
  • a facial image library of the video file is established.
  • This embodiment of the application starts the video playback application upon receiving a video playback instruction initiated by the user, and loads the video playback plug-in indicated by the instruction to extend the functions of the video playback application. If the loaded video playback plug-in is detected to be a face recognition plug-in, the video file can be parsed through the video playback application, each video image frame extracted and imported into the face recognition plug-in, the face images contained in each video image frame recognized through the plug-in, and a face image library associated with the video file built from all the face images, thereby realizing face recognition for dynamic video files.
  • this application does not require the user to manually capture video image frames and hand them to other applications for face image recognition.
  • the face recognition plug-in can be loaded in the video playback application. While playing the video file, the face image contained in each video image frame is automatically recognized, which improves the efficiency of face image recognition and reduces user operations.
  • because the recognition of face images is synchronized with the playback of the video file, the user does not need to perform face image recognition after watching the video file, which reduces the time consumed by the face recognition operation.
  • FIG. 1 is an implementation flowchart of a face image recognition method provided by the first embodiment of the present application
  • FIG. 2 is a schematic diagram of playing a video file provided by an embodiment of the present application.
  • FIG. 3 is a specific implementation flowchart of a face image recognition method provided by the second embodiment of the present application.
  • FIG. 4 is a specific implementation flow chart of a face image recognition method S105 provided by the third embodiment of the present application.
  • FIG. 5 is a specific implementation flowchart of a face image recognition method S1051 provided by the fourth embodiment of the present application.
  • FIG. 6 is a specific implementation flowchart of a face image recognition method S105 provided by the fifth embodiment of the present application.
  • FIG. 7 is a specific implementation flowchart of a face image recognition method S104 provided by the sixth embodiment of the present application.
  • FIG. 8 is a specific implementation flowchart of a face image recognition method S102 provided by the seventh embodiment of the present application.
  • FIG. 9 is a structural block diagram of a face image recognition device provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a terminal device provided by another embodiment of the present application.
  • the execution subject of the process is the terminal device.
  • the terminal device includes, but is not limited to, servers, computers, smartphones, and tablet computers capable of performing face image recognition tasks.
  • Fig. 1 shows a flow chart of the method for recognizing a face image provided by the first embodiment of the present application, and the details are as follows:
  • a video play instruction is received; the video play instruction carries the plug-in identifier of the video play plug-in that needs to be called when the video file is played.
  • the user can send a video play instruction to the terminal device.
  • the user can trigger the video playback instruction locally on the terminal device through an interactive module configured on the terminal device, such as a keyboard, mouse, or touch screen. Alternatively, the user can generate the video playback instruction on a local user terminal, establish a communication link between the user terminal and the terminal device, and send the instruction to the terminal device through that link; in this case the user terminal is equivalent to a remote control device that controls the terminal device to perform the video playback operation.
  • when the user initiates a video playback operation, he can select the video playback plug-ins to be called from the terminal device's list of loadable plug-ins, choosing one or more by clicking or checking, and then click the play button. The terminal device recognizes the video playback plug-ins the user selected and adds their plug-in identifiers to the video playback instruction that triggers the video playback operation.
  • the terminal device can be configured with a default configuration mode, that is, when the terminal device performs a video playback operation, it can load one or more video playback plug-ins by default, without the need for the user to re-select plug-ins for each playback operation.
  • for example, if the default configuration mode of the terminal device is to load the frame rate optimization plug-in and the face recognition plug-in by default, the plug-in identifiers of these two plug-ins are added when the video playback instruction is generated.
  • the video playback plug-in to be loaded in the default configuration mode can be set by default by the system, or can be manually configured by the user.
  • the terminal device can count the number of uses of each video playback plug-in. If it detects that the use count of a certain video playback plug-in is greater than a preset use threshold, it prompts the user whether to add that plug-in to the default configuration mode; after receiving the user's feedback agreeing to the addition, the plug-in whose use count exceeds the threshold is added to the default configuration mode, so that it is loaded automatically in subsequent playback operations.
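The usage-count logic described above can be sketched as follows. This is an illustrative sketch only; the class and method names (`PluginUsageTracker`, `record_use`, `confirm_add`) are assumptions, not names from the application.

```python
from collections import Counter

class PluginUsageTracker:
    """Counts plug-in uses and proposes frequently used plug-ins
    for the default configuration mode (hypothetical sketch)."""

    def __init__(self, use_threshold: int):
        self.use_threshold = use_threshold
        self.counts = Counter()
        self.default_plugins = set()

    def record_use(self, plugin_id: str) -> bool:
        """Count one use; return True when the plug-in's use count exceeds
        the threshold and it should be proposed for the default mode."""
        self.counts[plugin_id] += 1
        return (self.counts[plugin_id] > self.use_threshold
                and plugin_id not in self.default_plugins)

    def confirm_add(self, plugin_id: str) -> None:
        """Called after the user agrees to the addition prompt."""
        self.default_plugins.add(plugin_id)

tracker = PluginUsageTracker(use_threshold=2)
tracker.record_use("facereader")
tracker.record_use("facereader")
should_prompt = tracker.record_use("facereader")  # third use exceeds threshold
tracker.confirm_add("facereader")                 # now loaded by default
```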
  • the terminal device is installed with a VLC video playback application.
  • the VLC video playback application is specifically an application with a core framework that performs the video playback function. Multiple playback-related plug-ins can be added to it based on user needs, such as a video optimization plug-in, a video recording plug-in, and the face recognition plug-in that needs to be called in this embodiment.
  • the video playback instruction can not only specify the video file to be played, but also carry the plug-in identifier that needs to be loaded.
  • the video playback instruction can be: vlc.exe --video-filter all,facereader test.mp4, where vlc.exe is the video playback application that needs to be started, facereader is the plug-in identifier, and test.mp4 is the file identifier of the video file to be played.
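Parsing an instruction of the form shown above into its three parts (application, plug-in identifiers, file identifier) can be sketched as follows. The option grammar assumed here is only the example's shape, not VLC's real command-line grammar.

```python
def parse_play_instruction(instruction: str):
    """Split a playback instruction into (application, plug-in IDs, video file).
    Hypothetical sketch based on the example instruction's shape."""
    tokens = instruction.split()
    app = tokens[0]                      # e.g. "vlc.exe"
    plugin_ids, video_file = [], None
    i = 1
    while i < len(tokens):
        if tokens[i] == "--video-filter":
            # option value: comma-separated filter/plug-in identifiers
            plugin_ids = [p for p in tokens[i + 1].split(",") if p and p != "all"]
            i += 2
        else:
            video_file = tokens[i]       # file identifier of the video to play
            i += 1
    return app, plugin_ids, video_file

app, plugins, video = parse_play_instruction(
    "vlc.exe --video-filter all,facereader test.mp4")
# app == "vlc.exe", plugins == ["facereader"], video == "test.mp4"
```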
  • a video playback application is started, and the video playback plug-in is loaded to the video playback application based on the plug-in identifier.
  • the terminal device can start the video playback application associated with the instruction. The terminal device also parses the video playback instruction, extracts the plug-in identifier of each video playback plug-in, queries the video playback plug-in corresponding to each identifier, and loads those plug-ins into the video playback application to extend its functions.
  • the video playback application can be associated with a list of loadable plug-ins.
  • Each video playback plug-in can store a startup declaration file in the installation address of the video playback application. When the video playback application starts, it detects the startup declaration files stored in the installation location and generates the list of loadable plug-ins for the video playback application.
  • the terminal device detects whether the plug-in identifier in the video playback instruction is in the list of loadable plug-ins. If so, it queries the installation address of the video playback plug-in, creates a new thread in the process of the video playback application, and runs the plug-in's running file through this thread to load the video playback plug-in into the video playback application. If the plug-in identifier is detected not to be in the list of loadable plug-ins, plug-in non-existence information is output.
  • a plug-in download request may be generated based on the plug-in identifier, and the plug-in download request is sent to the server corresponding to the video playback application.
  • the plug-in running file corresponding to the plug-in identifier is downloaded from the server, and after the download is completed, the plug-in identifier is added to the list of loadable plug-ins so that the video playback plug-in can be loaded into the video playback application.
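The load flow of the preceding paragraphs (check the loadable-plug-in list; run the plug-in's running file in a new thread of the playback application's process; otherwise report non-existence and request a download) can be sketched as below. All function and parameter names are illustrative assumptions.

```python
import threading

def load_plugin(plugin_id, loadable, install_addrs, loaded, request_download):
    """Load one plug-in into the playback application (hypothetical sketch).
    loadable: set of identifiers in the loadable-plug-in list.
    install_addrs: {plugin_id: path of the plug-in's running file}."""
    if plugin_id not in loadable:
        print(f"plug-in {plugin_id} does not exist")   # non-existence information
        request_download(plugin_id)                    # ask the server for it
        return False
    run_file = install_addrs[plugin_id]                # query installation address
    # new thread in the playback application's process runs the plug-in
    t = threading.Thread(target=lambda: loaded.append((plugin_id, run_file)))
    t.start()
    t.join()
    return True
```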
  • each video image frame of the video file is extracted by the video playback application after the plug-in is loaded.
  • the terminal device parses the video playback instruction, determines the file identifier of the video file to be played, and obtains the video file based on the file identifier, imports the video file into the video playback application, and outputs the video file through the video playback application.
  • when the video playback application outputs a video file, it reads each video image frame contained in the file and, based on the frame number of each video image frame, outputs each frame in turn at the preset video playback frame rate; for example, the playback frame rate can be 60 fps, that is, 60 video image frames are output per second.
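The ordered frame output at a fixed playback frame rate described above can be sketched as a simple schedule: frames sorted by frame number, each assigned a presentation time 1/fps after the previous one. The function name is an illustrative assumption.

```python
def frame_schedule(frame_numbers, fps=60):
    """Return (frame_number, presentation_time_seconds) pairs, ordered by
    frame number, spaced at 1/fps intervals (e.g. 60 frames per second)."""
    ordered = sorted(frame_numbers)
    return [(n, i / fps) for i, n in enumerate(ordered)]

schedule = frame_schedule([2, 0, 1], fps=60)
# frames are emitted in frame-number order: 0, 1, 2
```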
  • the terminal device can create multiple parallel processing threads in the process of the video playback application and run the corresponding video playback plug-in through each parallel thread to perform different video processing operations. Different video playback plug-ins require different types of input data: for an audio processing plug-in, the input data type is an audio signal, while for the face recognition plug-in, the input data type is a video image frame. Therefore, the terminal device needs to determine the currently loaded video playback plug-ins before playing the video file, extract the corresponding data from the video file based on the input data type each plug-in requires, and process that data through the corresponding concurrent thread.
  • if the terminal device detects that the video playback plug-ins to be loaded include a face recognition plug-in, that is, a plug-in identifier matches the identifier of the face recognition plug-in, it needs to obtain input data for the face recognition plug-in. Because the face recognition plug-in performs recognition on each video image frame, i.e. its input data type is the video image frame, each video image frame can be extracted in sequence according to its frame number while the video application plays the video file and imported into the face recognition plug-in for the face recognition operation.
  • the face recognition plug-in is called to extract the face images contained in each of the video image frames.
  • the video image frames can be output to the graphics processing unit (GPU) for the display output flow.
  • at the same time, the video image frames can be imported into the face recognition plug-in. The face recognition plug-in analyzes each input video image frame and extracts the face images it contains through a built-in face recognition algorithm. Since there can be multiple photographed subjects, one video image frame can contain multiple face images.
  • the way for the face recognition plug-in to obtain a face image may be: slide a built-in multi-size face template across the video image frame and calculate the matching degree between the framed region image and the face template; if the matching degree is detected to be greater than a preset matching threshold, the currently framed region is recognized as containing a human face, and the region image is recognized as a face image.
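The sliding-template matching described above can be sketched in miniature as follows. This is purely illustrative: real face templates and matching degrees are far richer, and the "matching degree" here is just an inverse of the mean absolute pixel difference, an assumption chosen to keep the sketch self-contained.

```python
def match_degree(window, template):
    """Toy matching degree in (0, 1]: 1.0 means a pixel-perfect match."""
    h, w = len(template), len(template[0])
    diff = sum(abs(window[r][c] - template[r][c])
               for r in range(h) for c in range(w))
    return 1.0 / (1.0 + diff / (h * w))

def detect_faces(image, template, threshold):
    """Slide the template over the image; return top-left corners of
    windows whose matching degree exceeds the threshold."""
    h, w = len(template), len(template[0])
    hits = []
    for r in range(len(image) - h + 1):
        for c in range(len(image[0]) - w + 1):
            window = [row[c:c + w] for row in image[r:r + h]]
            if match_degree(window, template) > threshold:
                hits.append((r, c))      # framed region recognized as a face
    return hits
```

A multi-size template, as the text mentions, would simply repeat this scan with templates of several sizes.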
  • the terminal device may first perform face recognition, and after determining the face image contained in the video image frame, perform the video playback operation.
  • the terminal device can mark the recognized face images in the video image frame, for example by marking each face region with a rectangular frame, and output the marked video image frame so that the user can quickly locate the face images in the video file.
  • FIG. 2 shows a schematic diagram of playing a video file provided by an embodiment of the present application.
  • a face image library of the video file is established according to the entity user corresponding to each of the face images.
  • after acquiring the face images of each video image frame, the terminal device can determine the entity user to which each face image belongs, classify the face images based on that entity user, and create the face image library of the video file.
  • the terminal device can calculate the similarity between any two face images, for example by converting each face image into a face feature vector, calculating the Euclidean distance between the two face feature vectors, and taking the reciprocal of the Euclidean distance as the similarity between the two. Face region images with greater similarity are recognized as the same entity user, and face images belonging to different entity users are distinguished, thus realizing the classification of face images; a user identifier is then marked for the face region images belonging to the same entity user.
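The similarity measure just described (reciprocal of the Euclidean distance between face feature vectors, compared against an association threshold) can be sketched as below. Feature extraction itself is out of scope here, so the vectors are taken as given; the function names are illustrative.

```python
import math

def similarity(vec_a, vec_b):
    """Similarity between two face feature vectors: the reciprocal of
    their Euclidean distance (infinite for identical vectors)."""
    dist = math.dist(vec_a, vec_b)
    return float("inf") if dist == 0 else 1.0 / dist

def same_user(vec_a, vec_b, association_threshold):
    """Two faces are treated as the same entity user when their
    similarity exceeds the association threshold."""
    return similarity(vec_a, vec_b) > association_threshold
```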
  • the face image recognition method receives a user-initiated video playback instruction, starts a video playback application, and loads the video playback plug-in indicated by the instruction into the video playback application. If the loaded video playback plug-in is detected to be a face recognition plug-in, the video file can be parsed through the video playback application, each video image frame extracted and imported into the face recognition plug-in, the face images contained in each video image frame recognized by the plug-in, and a face image library associated with the video file built from all the face images, thereby realizing face recognition for dynamic video files.
  • this application does not require the user to manually capture video image frames and hand them to other applications for face image recognition.
  • the face recognition plug-in can be loaded in the video playback application. While playing the video file, the face image contained in each video image frame is automatically recognized, which improves the efficiency of face image recognition and reduces user operations.
  • because the recognition of face images is synchronized with the playback of the video file, the user does not need to perform face image recognition after watching the video file, which reduces the time consumed by the face recognition operation.
  • Fig. 3 shows a specific implementation flowchart of a face image recognition method provided by the second embodiment of the present application.
  • before the video playback application is started and the video playback plug-in is loaded into the video playback application, the method further includes S301 to S303, detailed as follows:
  • the terminal device can obtain the plug-in data package of the face recognition plug-in through a mobile storage device or a network download.
  • the terminal device may check the completeness of the obtained plug-in data package, for example by extracting the CRC check code of the plug-in data package to determine whether the package is complete.
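A CRC-based completeness check of this kind can be sketched with the standard CRC-32 (`zlib.crc32`): compare the check code carried with the plug-in data package against the check code of the received bytes. The package layout assumed here (payload plus a declared CRC) is illustrative.

```python
import zlib

def package_is_complete(payload: bytes, declared_crc: int) -> bool:
    """True when the CRC-32 of the received bytes matches the CRC check
    code declared for the plug-in data package."""
    return zlib.crc32(payload) == declared_crc

data = b"plug-in data package bytes"
crc = zlib.crc32(data)
intact = package_is_complete(data, crc)        # complete data package
truncated = package_is_complete(data[:-1], crc)  # abnormal data package
```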
  • the terminal device can run the plug-in data package and import a preset test image into the running process to obtain an output result. If the position of the face image marked in the output result matches the preset standard coordinates, the plug-in data package is recognized as a complete data package and the operation of S302 is executed; otherwise, if the package cannot run or the face image contained in the test image cannot be recognized, the plug-in data package is recognized as an abnormal data package, and the plug-in data package of the face recognition plug-in is retrieved again.
  • a version verification request is sent to the server, and a legal verification result fed back by the server based on the version verification request is received; the version verification request includes the version identification of the plug-in data package.
  • the terminal device may extract the version identifier carried in the plug-in data package, generate a version verification request carrying the version identifier, and send the request to the server corresponding to the video playback application. Since the face recognition plug-in needs to be loaded into the video playback application, the video playback application must be compatible with the face recognition plug-in. If the server detects that the face recognition plug-in is compatible with the video playback application, it can return a successful legality verification result to the terminal device; otherwise, if the server detects that the plug-in is not compatible with the video playback application, it returns a failed verification result.
  • the terminal device may extract the download link carried in the legal verification result, and retrieve the plug-in data package of the face recognition plug-in through the download link.
  • if the server detects that the current face recognition plug-in is incompatible with the video playback application, a download link for a compatible face recognition plug-in can be provided, so that the terminal device can obtain a legal face recognition plug-in through the download link.
  • if the terminal device determines that the plug-in data package is a legal data package, that is, the legality verification result indicates success, it extracts the call declaration file contained in the plug-in data package, queries the installation location of the video playback application, and adds the extracted call declaration file to the file directory corresponding to that installation location. Because the video playback application detects the declaration files in that directory during startup and generates the list of callable plug-ins from them, the video playback application can call the face recognition plug-in in subsequent playback operations.
  • the video playback application is a VLC video playback application
  • the installation location of the VLC video playback application is //plugins; that is, the call declaration file of the face recognition plug-in is added to the directory corresponding to the installation location. For the FaceReader plug-in, this means placing the libfacereader_plugin.dll file in the plugins directory of the VLC player installation directory.
  • FIG. 4 shows a specific implementation flowchart of a face image recognition method S105 provided by the third embodiment of the present application.
  • S105 includes: S1051 to S1054, and the details are as follows:
  • the establishment of the face image database of the video file according to the entity user corresponding to each of the face images includes:
  • after the terminal device extracts the face images contained in each video image frame, it can perform similarity calculations on face images selected from any two video image frames.
  • for example, if the first video image frame contains face image A, face image B, and face image C, and the second video image frame contains face image A', face image B', and face image C', the terminal device can extract face image A from the first video image frame and face image C' from the second video image frame, and calculate the similarity between face image A and face image C'.
  • the calculation of similarity can refer to the description of S105: the face image is converted into a face feature vector, the Euclidean distance between the face feature vectors in the two video image frames is calculated, and the reciprocal of the Euclidean distance is taken as the similarity between the two face images.
  • optionally, before calculating the similarity, the terminal device can select face images in adjacent video image frames for comparison: it calculates the differences between the center coordinates of the face images, recognizes face images whose coordinate difference is less than a preset distance threshold as target images, and calculates the similarity only between target images. Since the two video image frames are adjacent, the distance a face moves between them is small; selecting the face images with the smaller movement distance as target images and calculating similarity only between them eliminates a large number of invalid face similarity calculations and improves the construction speed of the face database.
  • the terminal device is set with an association threshold. If the similarity between two face images is detected to be less than the preset association threshold, the two face images are determined to belong to different entity users and no relationship needs to be established between them. Conversely, if the similarity between two face images is greater than the association threshold, the two face images are determined to belong to the same entity user, are identified as related images, and an association relationship is established between them, so that all face images belonging to the same entity user can be determined from the association relationships.
  • the terminal device can divide the face images according to the association relationships, placing all mutually related images into one user face group. If face images are related images, they belong to the same entity user; this realizes the grouping of face images by entity user, and a user identifier is configured for each divided face group.
  • the user identifier can be a user serial number, where the serial number can be determined according to the order of the appearance time of the earliest face image in each user face group.
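The grouping step above (merging pairwise association relationships into user face groups and numbering the groups by earliest appearance) can be sketched with a union-find structure. The data shapes and names here are illustrative assumptions.

```python
def build_face_groups(face_frames, associations):
    """face_frames: {face_id: frame number of first appearance};
    associations: iterable of (face_id, face_id) related-image pairs.
    Returns {user_serial_number: sorted list of face_ids}, with serial
    numbers assigned in order of each group's earliest appearance."""
    parent = {f: f for f in face_frames}

    def find(x):                        # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in associations:           # merge related images
        parent[find(a)] = find(b)

    groups = {}
    for f in face_frames:
        groups.setdefault(find(f), []).append(f)

    # serial numbers follow the earliest face image in each user face group
    ordered = sorted(groups.values(),
                     key=lambda g: min(face_frames[f] for f in g))
    return {serial: sorted(g) for serial, g in enumerate(ordered, start=1)}
```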
  • optionally, the terminal device may store standard face images of candidate users, match each standard face image against the face images in each user face group, identify the user face group associated with a candidate user according to the matching result, and use the candidate user's user identifier as the user identifier of that face group.
  • the face image database is established according to the user face group and the user identifier.
  • after the terminal device has identified all user face groups and configured a user identifier for each, it can construct the face image library for the video file.
  • in this embodiment, related images are recognized and, based on the association relationships between them, divided into user face groups belonging to the same entity user, thereby realizing the classification of face images and improving the management efficiency of face images.
  • Fig. 5 shows a specific implementation flowchart of a face image recognition method S1051 provided by the fourth embodiment of the present application.
  • a face image recognition method S1051 provided in this embodiment includes: S501 to S505, which are detailed as follows:
  • the calculating the similarity between the face images in any two video image frames includes:
  • the terminal device can configure a face key feature list according to the face features to be located.
  • the face key feature list can include four facial features (eyes, ears, mouth, and nose) and can further include eyebrows, forehead, and so on; the specific facial features included can be configured according to the required recognition accuracy.
  • the terminal device can mark each key feature of the face in the face image according to the list of key features of the face, and obtain the feature coordinates according to the coordinates of each key feature of the face in the face image.
  • the feature coordinate sequence of the face image is constructed according to the feature coordinates of the key features of all faces in the list of key features of the face.
  • the terminal device can be configured with a sequence template that specifies the position of each key face feature in the sequence. The feature coordinates of each key feature in the face image are imported in turn into their associated positions in the sequence template to generate the feature coordinate sequence corresponding to the face image.
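Building such a feature coordinate sequence can be sketched as follows: the sequence template fixes the order of the key features, and each face image's marked coordinates are placed at the template's positions. The feature names and the (None, None) placeholder for missing features are illustrative assumptions.

```python
# Hypothetical sequence template: fixed position for each key face feature.
SEQUENCE_TEMPLATE = ["left_eye", "right_eye", "nose", "mouth",
                     "left_ear", "right_ear"]

def feature_sequence(marked_coords):
    """marked_coords: {feature_name: (x, y)} marked in one face image.
    Returns the feature coordinate sequence in template order; features
    not found in the image are filled with (None, None)."""
    return [marked_coords.get(name, (None, None))
            for name in SEQUENCE_TEMPLATE]

seq = feature_sequence({"nose": (5, 5), "mouth": (5, 8)})
```

Because every face image yields a sequence in the same order, the feature distance between two faces can then be computed position by position.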
  • the terminal device can calculate the feature distance value between the two feature coordinate sequences through the Euclidean distance calculation formula and the coordinate distance calculation formula.
  • the feature distance value and the number of interval image frames are imported into a preset similarity calculation model to obtain the similarity between the face images in the two video image frames. In the similarity calculation model: Similarity is the similarity; ActFrame is the number of interval image frames; FigDist is the feature distance value; BaseDist is the reference distance value; BaseFrame is the shooting frame rate of the video file; StandardDist is the preset adjustment coefficient.
  • in this embodiment, normalization can be performed using the number of interval image frames, thereby reducing the influence of the frame-count difference on the similarity calculation; the difference between the feature distance value and the reference distance value is evaluated, and the similarity between the two face images is calculated through the face feature coordinates.
  • in this embodiment, by identifying the face feature coordinates, constructing the face feature sequence, and calculating the similarity of the face images based on the number of interval image frames between the two video image frames and the feature distance value between the face feature sequences, the accuracy of the similarity calculation is improved.
  • FIG. 6 shows a specific implementation flowchart of a face image recognition method S105 provided by the fifth embodiment of the present application.
  • a face image recognition method S105 provided in this embodiment includes: S601 to S603, which are detailed as follows:
  • the establishing of the face image library of the video file includes:
  • the expression type of the face image is determined, and the expression type is recognized as a reference expression.
  • the terminal device may use an expression recognition algorithm to determine the expression type of the face image at the moment of shooting.
  • the expression type may be: smiling type, laughing type, crying type, sad type, etc., and the expression type of the face image is recognized as a reference expression.
  • a derivative image of the face image is output; the expression type of the derivative image is different from the expression type of the face image.
  • the terminal device can adjust the expression conversion algorithm according to the reference expression to determine how to convert from the reference expression to other expressions, for example, from a smiling expression to a crying expression, and from a smiling expression to a laughing expression.
  • the terminal device can import the face image into the expression conversion algorithm adjusted according to the reference expression, and can output derivative images corresponding to different tag types.
  • the face image library is generated according to the face image and the derivative image.
  • the face image library can be established, so that the content of the face image library can be expanded.
  • multiple derivative images with different tags are obtained based on the face image, so that the richness of the face image library can be improved.
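The library-expansion step above can be sketched as follows. This is an illustrative Python outline under stated assumptions: `convert_expression` is a hypothetical stand-in for the patent's expression conversion algorithm (here it only relabels the image), and the expression tags are the examples named in the text.

```python
EXPRESSION_TYPES = ["smiling", "laughing", "crying", "sad"]

def convert_expression(face_image, source, target):
    # Hypothetical stand-in for the expression conversion algorithm:
    # a real implementation would synthesize a new face; here we only
    # tag the image record with its new expression type.
    return {**face_image, "expression": target}

def build_face_library(face_image, reference_expression):
    """Generate one derivative image per expression type other than the
    reference expression, and collect them with the original image."""
    library = [face_image]
    for target in EXPRESSION_TYPES:
        if target != reference_expression:
            library.append(
                convert_expression(face_image, reference_expression, target))
    return library
```

One input face with a "smiling" reference expression thus yields a library of four differently tagged images, enriching the face image library as described.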
  • FIG. 7 shows a specific implementation flowchart of a face image recognition method S104 provided by the sixth embodiment of the present application.
  • S104 in a face image recognition method provided in this embodiment includes: S1041 to S1043, which are detailed as follows:
  • the invoking the face recognition plug-in to extract the face images contained in each of the video image frames includes:
  • based on the image data corresponding to the RGB channels in the video image frame, an image matrix of the video image frame is generated.
  • the terminal device can preprocess the video image frame before importing it into the face recognition plug-in.
  • the specific operation is to perform data fusion on the image data of the video image frame on the three RGB channels, so as to generate an image matrix of the video image frame.
  • S1042 configure a convolution kernel corresponding to the video image frame according to the matrix size of the image matrix, and perform a convolution operation on the image matrix through the convolution kernel to obtain a standard matrix.
  • after the terminal device generates the image matrix of the video image frame, it can identify the matrix size of the image matrix, and determine the convolution kernel used for the convolution operation based on that size and the matrix size of the standard matrix; the image matrix is then convolved with the convolution kernel so that its size is adjusted to the standard size, and the standard matrix is obtained.
  • the standard matrix is imported into the face recognition algorithm of the face recognition plug-in, and the face image is output.
  • the standard matrix is imported into the face recognition plug-in, the standard matrix is analyzed by the face recognition algorithm in the face recognition plug-in, and the face image is output.
  • the standard matrix is obtained by preprocessing the video image frame, and the standard matrix is imported into the face recognition plug-in to output the face image, thereby improving both the processing efficiency of the face recognition plug-in on video image frames and the accuracy of face recognition.
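The preprocessing steps above (RGB fusion into an image matrix, then a size-adjusting convolution) can be sketched in Python. This is a minimal illustration under assumptions: channel fusion is shown as a simple per-pixel average, and the "convolution kernel configured from the matrix size" is realized as a block-averaging kernel whose size is the ratio between the input and standard sizes — the patent does not specify either choice.

```python
import numpy as np

def image_matrix(rgb_frame):
    """Fuse the three RGB channel planes of a video image frame into a
    single image matrix (illustrated here as a per-pixel average)."""
    r, g, b = rgb_frame[..., 0], rgb_frame[..., 1], rgb_frame[..., 2]
    return (r + g + b) / 3.0

def to_standard_matrix(matrix, standard_size):
    """Adjust the image matrix to the standard size by convolving with
    an averaging kernel whose dimensions are derived from the ratio of
    the matrix size to the standard matrix size (assumes divisibility)."""
    h, w = matrix.shape
    sh, sw = standard_size
    kh, kw = h // sh, w // sw   # kernel configured from the matrix size
    out = np.zeros((sh, sw))
    for i in range(sh):
        for j in range(sw):
            out[i, j] = matrix[i*kh:(i+1)*kh, j*kw:(j+1)*kw].mean()
    return out
```

The resulting standard matrix can then be fed to the face recognition algorithm regardless of the original frame resolution.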
  • FIG. 8 shows a specific implementation flowchart of a face image recognition method S102 provided by the seventh embodiment of the present application.
  • S102 in a face image recognition method provided in this embodiment includes: S1021 to S1022, and the details are as follows:
  • when the terminal device starts the video playback application and the video playback plug-in needs to be loaded, the installation location of the video playback plug-in must be determined in order to locate the corresponding plug-in file; therefore, the plug-in identifier can be queried in the playback plug-in addressing table.
  • the storage address associated with the plug-in identifier is the installation address of the video playback plug-in.
  • the plug-in file of the video playback plug-in is acquired from the installation address, and the plug-in file is run through the video playback application to load the video playback plug-in to the video playback application.
  • the terminal device extracts the plug-in file of the video playback plug-in from the installation address and runs the plug-in file through the video playback application; that is, it creates a new concurrent thread under the process corresponding to the video playback application and executes the plug-in file on that concurrent thread, so as to load the video playback plug-in into the video playback application.
  • the installation location of the video playback plug-in is determined through the playback plug-in addressing table, so that the video playback plug-in can be loaded into the video playback application, which improves the efficiency of the loading operation.
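The addressing-table lookup and concurrent-thread loading described above can be sketched as follows. The table contents, the plug-in identifier `facereader`, and the installation path are hypothetical examples, and running a plug-in file is abstracted into a callable.

```python
import threading

# Hypothetical playback plug-in addressing table: plug-in identifier
# mapped to the installation address of its plug-in file.
PLUGIN_ADDRESSING_TABLE = {
    "facereader": "/opt/vlc/plugins/facereader.so",
}

def load_plugin(plugin_id, run_plugin_file):
    """Query the plug-in's installation address in the addressing table,
    then execute its plug-in file on a new concurrent thread created
    under the video playback application's process."""
    install_address = PLUGIN_ADDRESSING_TABLE.get(plugin_id)
    if install_address is None:
        raise KeyError(f"plug-in {plugin_id!r} not found in addressing table")
    worker = threading.Thread(target=run_plugin_file, args=(install_address,))
    worker.start()
    return worker
```

An unknown identifier raises immediately, which corresponds to the "plug-in does not exist" branch in the main flow.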
  • FIG. 9 shows a structural block diagram of a facial image recognition device provided by an embodiment of the present application, and each unit included in the facial image recognition device is used to execute each step in the embodiment corresponding to FIG. 1.
  • only the parts related to this embodiment are shown.
  • the facial image recognition device includes:
  • the video play instruction receiving unit 91 is configured to receive a video play instruction; the video play instruction carries the plug-in identifier of the video play plug-in that needs to be called when the video file is played;
  • the video playback application starting unit 92 is configured to start a video playback application, and load the video playback plug-in to the video playback application based on the plug-in identifier;
  • the video image frame extraction unit 93 is configured to extract each video image frame of the video file through the video playback application after the plug-in is loaded if the plug-in ID matches the ID of the face recognition plug-in;
  • the face image recognition unit 94 is configured to call the face recognition plug-in to extract the face images contained in each of the video image frames;
  • the face image database establishment unit 95 is configured to establish a face image database of the video file according to the entity user corresponding to each of the face images.
  • the facial image recognition device further includes:
  • a plug-in data package acquisition unit, configured to acquire the plug-in data package of the face recognition plug-in;
  • the legal verification result receiving unit is configured to send a version verification request to the server and receive the legal verification result fed back by the server based on the version verification request; the version verification request includes the version identifier of the plug-in data package;
  • the calling statement file adding unit is used to query the installation location of the video playback application if the legal verification result is that verification succeeds, and to add the calling statement file in the plug-in data package to the installation location, so as to add the face recognition plug-in to the list of callable plug-ins of the video playback application.
  • the face image database establishment unit 95 includes:
  • a similarity calculation unit configured to calculate the similarity between the face images in any two video image frames
  • an association relationship establishment unit configured to, if the similarity is greater than a preset association threshold, identify the face images located in the two different video image frames as associated images, and establish an association relationship between the two face images;
  • the user face group dividing unit is configured to divide all the face images into multiple user face groups based on their association relationships, and to configure a user identifier for each of the user face groups; all face images in a user face group are associated images of one another;
  • the first face image database establishment unit is configured to establish the face image database according to the user face group and the user identifier.
  • the similarity calculation unit includes:
  • the feature coordinate marking unit is configured to mark the feature coordinates of each key feature of the face in the face image based on a preset list of key features of the face;
  • the feature coordinate sequence generating unit is configured to construct the feature coordinate sequence of the face image according to the feature coordinates of the key features of all faces in the face key feature list;
  • a feature distance value calculation unit configured to calculate feature distance values between the feature coordinate sequences of the face images of any two video image frames
  • an interval image frame number identification unit, configured to identify the number of interval image frames between any two video image frames;
  • a similarity conversion unit configured to import the characteristic distance value and the number of interval image frames into a preset similarity calculation model to obtain the similarity between the face images in the two video image frames;
  • the similarity calculation model is specifically:
  • Similarity is the similarity; ActFrame is the number of interval image frames; FigDist is the characteristic distance value; BaseDist is the reference distance value; BaseFrame is the shooting frame rate of the video file; StandardDist is the preset adjustment coefficient.
  • the face image database establishment unit 95 includes:
  • a reference expression recognition unit configured to determine the expression type of the face image, and recognize the expression type as a reference expression
  • a derivative image output unit configured to output a derivative image of the face image according to the expression conversion algorithm and the reference expression; the expression type of the derivative image is different from the expression type of the face image;
  • the second face image database establishment unit is configured to generate the face image database according to the face image and the derivative image.
  • the face image recognition unit 94 includes:
  • An image matrix generating unit configured to generate an image matrix of the video image frame based on the image data corresponding to the RGB channels in the video image frame;
  • a standard matrix generating unit configured to configure a convolution kernel corresponding to the video image frame according to the matrix size of the image matrix, and perform a convolution operation on the image matrix through the convolution kernel to obtain a standard matrix
  • the standard matrix importing unit is configured to import the standard matrix into the face recognition algorithm of the face recognition plug-in, and output the face image.
  • the video playback application starting unit 92 includes:
  • the installation address obtaining unit is configured to query the installation address of the video playback plug-in corresponding to the plug-in identifier according to a preset playback plug-in addressing table;
  • the video playback plug-in loading unit is configured to obtain the plug-in file of the video playback plug-in from the installation address, and run the plug-in file through the video playback application to load the video playback plug-in to the video playback application.
  • compared with existing face image recognition technology, the facial image recognition device provided by this embodiment of the present application does not require the user to manually intercept video image frames and hand them over to other applications for facial image recognition.
  • instead, the face recognition plug-in can be loaded in the video playback application, so that while the video file is playing, the face image contained in each video image frame is automatically recognized, which improves the recognition efficiency of face images and reduces the user's operations.
  • since the recognition process of the face images is synchronized with the playback of the video file, the user does not need to perform face image recognition after watching the video file, which reduces the time consumed by the face recognition operation.
  • FIG. 10 is a schematic diagram of a terminal device provided by another embodiment of the present application.
  • the terminal device 10 of this embodiment includes: a processor 100, a memory 101, and computer-readable instructions 102 stored in the memory 101 and executable on the processor 100, such as a face image recognition program.
  • the processor 100 executes the computer-readable instructions 102, the steps in the above-mentioned face image recognition method embodiments are implemented, for example, S101 to S105 shown in FIG. 1.
  • when the processor 100 executes the computer-readable instructions 102, the functions of the units in the foregoing device embodiments, for example, the functions of the modules 91 to 95 shown in FIG. 9, are realized.
  • the computer-readable instructions 102 may be divided into one or more units, and the one or more units are stored in the memory 101 and executed by the processor 100 to complete the present application.
  • the one or more units may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 102 in the terminal device 10.
  • the computer-readable instructions 102 can be divided into a video playback instruction receiving unit, a video playback application startup unit, a video image frame extraction unit, a face image recognition unit, and a face image library establishment unit; the specific functions of each unit are as described above.
  • the terminal device 10 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the terminal device may include, but is not limited to, a processor 100 and a memory 101.
  • FIG. 10 is only an example of the terminal device 10, and does not constitute a limitation on the terminal device 10. It may include more or fewer components than shown in the figure, or a combination of certain components, or different components.
  • the terminal device may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 100 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 101 may be an internal storage unit of the terminal device 10, such as a hard disk or a memory of the terminal device 10.
  • the memory 101 may also be an external storage device of the terminal device 10, such as a plug-in hard disk equipped on the terminal device 10, a smart memory card (Smart Media Card, SMC), and a Secure Digital (SD) Card, Flash Card, etc. Further, the memory 101 may also include both an internal storage unit of the terminal device 10 and an external storage device.
  • the memory 101 is used to store the computer-readable instructions and other programs and data required by the terminal device.
  • the memory 101 can also be used to temporarily store data that has been output or will be output.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
  • the computer-readable storage medium stores computer instructions, and when the computer instructions are run on a computer, the computer is caused to execute the steps of the method for recognizing the face image.


Abstract

The present application is applicable to the technical field of image processing, and provides a face image recognition method and apparatus. The method comprises: receiving a video playback instruction; starting a video playback application, and loading a video playback plugin into the video playback application on the basis of a plugin identifier; if the plugin identifier matches an identifier of a face recognition plugin, extracting each video image frame of a video file by means of the video playback application into which the plugin has been loaded; calling the face recognition plugin to extract a face image comprised in each video image frame; and establishing a face image library of the video file according to a physical user corresponding to each face image. In the present application, the face recognition plugin is loaded into the video playback application, such that the face image comprised in each video image frame is automatically recognized while the video file is being played, thereby improving the recognition efficiency of face images and reducing user operations.

Description

Face image recognition method and device
This application claims priority to the Chinese patent application No. 202010087125.8, filed on February 11, 2020 and entitled "A face image recognition method and device", the entire content of which is incorporated herein by reference.
Technical field
This application belongs to the field of image processing technology, and in particular relates to a face image recognition method and device.
Background art
With the continuous development of recognition technology, more and more recognition tasks can be performed automatically by computers; for example, an OCR algorithm is used to recognize the text in a picture, and a QR code recognition algorithm is used to parse a QR code image and extract the information it carries. In addition to the above recognition technologies, face recognition technology, which can automatically determine a user's identity, is being applied in ever wider fields, and how to perform face recognition efficiently and accurately has become a problem that urgently needs to be solved.
The inventor has realized that existing face recognition technology is mainly applied to static image recognition, while face recognition in video is difficult to achieve; in particular, for most video playback applications, the user needs to manually intercept the video image frames containing the target user and hand them over to the corresponding software for recognition, which increases the difficulty of face collection and lowers operating efficiency.
Technical problem
In view of this, the embodiments of the present application provide a face image recognition method and device to solve the problem that existing face image recognition technology can only perform face recognition on static images, requiring the user to manually intercept the video image frames containing the target user and hand them over to the corresponding software for recognition, which increases the difficulty of face collection and lowers operating efficiency.
Technical solutions
A first aspect of the embodiments of the present application provides a face image recognition method, including:
receiving a video play instruction, where the video play instruction carries the plug-in identifier of the video playback plug-in that needs to be called when a video file is played;
starting a video playback application, and loading the video playback plug-in into the video playback application based on the plug-in identifier;
if the plug-in identifier matches the identifier of a face recognition plug-in, extracting each video image frame of the video file through the video playback application after the plug-in is loaded;
calling the face recognition plug-in to extract the face images contained in each of the video image frames; and
establishing a face image library of the video file according to the entity user corresponding to each of the face images.
Beneficial effects
In the embodiments of the present application, a video playback application is started upon receiving a video play instruction initiated by a user, and the video playback plug-in indicated by the instruction is loaded so as to extend the functions of the video playback application. If the loaded video playback plug-in is detected to be a face recognition plug-in, the video playback application can parse the video file, extract each video image frame, and import each video image frame into the face recognition plug-in, so that the face image contained in each video image frame is recognized by the face recognition plug-in and a face image library associated with the video file is built from all the face images, thereby realizing face recognition on dynamic video files. Compared with existing face image recognition technology, this application does not require the user to manually intercept video image frames and hand them over to other applications for face image recognition; instead, by loading the face recognition plug-in in the video playback application, the face image contained in each video image frame is automatically recognized while the video file is playing, which improves the recognition efficiency of face images and reduces user operations. On the other hand, since the recognition process of the face images is synchronized with the playback of the video file, the user does not need to perform face image recognition after watching the video file, which reduces the time consumed by the face recognition operation.
Description of the drawings
FIG. 1 is an implementation flowchart of a face image recognition method provided by the first embodiment of the present application;
FIG. 2 is a schematic diagram of the playback of a video file provided by an embodiment of the present application;
FIG. 3 is a specific implementation flowchart of a face image recognition method provided by the second embodiment of the present application;
FIG. 4 is a specific implementation flowchart of S105 of a face image recognition method provided by the third embodiment of the present application;
FIG. 5 is a specific implementation flowchart of S1051 of a face image recognition method provided by the fourth embodiment of the present application;
FIG. 6 is a specific implementation flowchart of S105 of a face image recognition method provided by the fifth embodiment of the present application;
FIG. 7 is a specific implementation flowchart of S104 of a face image recognition method provided by the sixth embodiment of the present application;
FIG. 8 is a specific implementation flowchart of S102 of a face image recognition method provided by the seventh embodiment of the present application;
FIG. 9 is a structural block diagram of a face image recognition device provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of a terminal device provided by another embodiment of the present application.
Embodiments of the present invention
In the embodiments of the present application, the execution subject of the process is a terminal device. The terminal device includes, but is not limited to: servers, computers, smart phones, tablet computers, and other devices capable of performing face image recognition tasks. FIG. 1 shows the implementation flowchart of the face image recognition method provided by the first embodiment of the present application, detailed as follows:
In S101, a video play instruction is received; the video play instruction carries the plug-in identifier of the video playback plug-in that needs to be called when the video file is played.
In this embodiment, the user can send a video play instruction to the terminal device. Specifically, the user can trigger the video play instruction locally on the terminal device through an interaction module configured on the terminal device, such as a keyboard, mouse, or touch screen; of course, the user can also generate the video play instruction on a local user terminal, establish a communication link between the user terminal and the terminal device, and send the video play instruction to the terminal device through the communication link. That is, the user terminal is equivalent to a remote control device that can control the terminal device to perform the video playback operation.
In this embodiment, when the user initiates a video playback operation, the user can select the video playback plug-ins to be called from the loadable plug-in list of the terminal device, choosing one or more video playback plug-ins from the list by clicking or ticking, and then press the play button after the selection is complete. At this time, the terminal device recognizes that the user has finished selecting, adds the plug-in identifiers of the selected video playback plug-ins to the video play instruction, and triggers the video playback operation. Optionally, the terminal device can be configured with a default configuration mode; that is, when performing a video playback operation, the terminal device can load one or more video playback plug-ins by default, without the user having to reselect plug-ins for every playback operation, thereby improving the user's operating efficiency. For example, if the default configuration mode of the terminal device is to load a frame rate optimization plug-in and the face recognition plug-in by default, then when the terminal device detects that the user has clicked the video playback button without selecting the video playback plug-ins to be loaded, the plug-in identifiers of the above two plug-ins are added to the video play instruction, and the video play instruction is generated.
The video playback plug-ins to be loaded in the default configuration mode can be set by the system by default, or configured manually by the user. Preferably, the terminal device can count the number of times each video playback plug-in is used; if it detects that the use count of a certain video playback plug-in is greater than a preset use threshold, it prompts the user whether to add that video playback plug-in to the default configuration mode, and if an agree-to-add instruction fed back by the user is received, the video playback plug-in whose use count is greater than the use threshold is added to the default configuration mode, so that it is loaded automatically in subsequent playback operations.
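The usage-count promotion logic described above can be sketched in Python. The threshold value and the identifier `facereader` are hypothetical, and the user prompt is reduced to a boolean return value for illustration.

```python
USE_THRESHOLD = 5   # hypothetical preset use threshold

class PluginUsageTracker:
    """Count plug-in usage and flag plug-ins whose use count exceeds
    the threshold as candidates for the default configuration mode."""

    def __init__(self):
        self.counts = {}
        self.default_plugins = set()

    def record_use(self, plugin_id):
        """Record one use; return True when the user should be prompted
        to add this plug-in to the default configuration mode."""
        self.counts[plugin_id] = self.counts.get(plugin_id, 0) + 1
        return (self.counts[plugin_id] > USE_THRESHOLD
                and plugin_id not in self.default_plugins)

    def add_to_defaults(self, plugin_id):
        """Called after the user agrees to the prompt."""
        self.default_plugins.add(plugin_id)
```

Once a plug-in is in `default_plugins`, further uses no longer trigger the prompt, matching the behavior of a one-time promotion.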
Specifically, the terminal device is installed with a VLC video playback application. The VLC video playback application is an application with a core framework that performs the video playback function, to which multiple video-playback-based plug-ins can be added according to user needs, such as a video optimization plug-in, a video recording plug-in, and the face recognition plug-in that needs to be called in this embodiment. When the user performs a playback operation, the video play instruction can not only specify the video file to be played but also carry the identifiers of the plug-ins that need to be loaded. For example, the video play instruction can be: vlc.exe --video-filter all,facereader test.mp4, where vlc.exe is the video playback application to be started, facereader is the plug-in identifier, and test.mp4 is the file identifier of the video file to be played.
In S102, a video playback application is started, and the video playback plug-in is loaded into the video playback application based on the plug-in identifier.
In this embodiment, after receiving the video play instruction, the terminal device can start the video playback application associated with the instruction. The terminal device also parses the video play instruction, extracts the plug-in identifiers corresponding to the video playback plug-ins, queries the video playback plug-in corresponding to each plug-in identifier, and loads the video playback plug-ins into the video playback application to extend its functions.
In this embodiment, the video playback application can be associated with a loadable plug-in list. Each video playback plug-in can store a startup declaration file in the installation address of the video playback application; when the video playback application starts, it detects all the startup declaration files stored in its installation location, thereby generating the loadable plug-in list of the video playback application. The terminal device detects whether the plug-in identifier in the video play instruction is in the loadable plug-in list; if so, it queries the installation address of the video playback plug-in, creates a new thread in the process of the video playback application, and runs the running file of the video playback plug-in on that thread so as to load the video playback plug-in into the video playback application. If it detects that the plug-in identifier is not in the loadable plug-in list, it outputs plug-in-does-not-exist information.
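Building the loadable plug-in list by scanning for startup declaration files can be sketched as follows. The `.plugin` file suffix is a hypothetical convention for the declaration files; the patent does not specify their naming.

```python
from pathlib import Path

def loadable_plugins(install_dir):
    """Scan the video playback application's installation directory for
    startup declaration files (assumed here to carry a hypothetical
    '.plugin' suffix) and return the loadable plug-in list."""
    return sorted(p.stem for p in Path(install_dir).glob("*.plugin"))
```

A play instruction's plug-in identifier can then simply be checked for membership in the returned list before loading, with the miss case mapped to the plug-in-does-not-exist branch.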
Optionally, in this embodiment, if the plug-in identifier carried by the video playback instruction is detected to be absent from the loadable-plug-in list, a plug-in download request may be generated from the identifier and sent to the server corresponding to the video playback application, so that the plug-in executable file matching the identifier can be downloaded from the server. After the download completes, the identifier is added to the loadable-plug-in list, and the video playback plug-in is loaded into the video playback application.
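The lookup-then-download flow described above can be sketched as follows; the names `loadable_plugins` and `request_download` are illustrative, not taken from the patent:

```python
def load_plugin(plugin_id, loadable_plugins, request_download):
    """Load a plug-in if its identifier is in the loadable list,
    otherwise request it from the application's server first."""
    if plugin_id in loadable_plugins:
        # In the patent, this step runs the plug-in's executable file
        # on a new thread inside the playback application's process.
        return f"loaded:{plugin_id}"
    # Fallback: ask the server for the plug-in, then register it.
    request_download(plugin_id)
    loadable_plugins.add(plugin_id)
    return f"loaded:{plugin_id}"
```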
In S103, if the plug-in identifier matches the identifier of the face recognition plug-in, each video image frame of the video file is extracted through the video playback application with the plug-in loaded.
In this embodiment, the terminal device parses the video playback instruction, determines the file identifier of the video file to be played, obtains the video file based on that identifier, imports the video file into the video playback application, and outputs it through the application. When outputting the video file, the application reads each video image frame it contains and, based on the frame sequence numbers, outputs the frames in order at a preset playback frame rate; for example, the playback frame rate may be 60 fps, i.e., 60 video image frames are output per second.
In this embodiment, since the video playback application may load multiple video playback plug-ins, the terminal device can create multiple parallel processing threads within the application's process and run the video processing operation of each plug-in on a different thread. Different plug-ins require different input data types: for a voice-enhancement plug-in, the input is an audio signal, while for a picture-brightening plug-in, the input is video image frames. Therefore, before playing the video file, the terminal device needs to determine which video playback plug-ins are currently loaded, extract the corresponding data from the video file according to the input data type each plug-in requires, and process that data on the corresponding concurrent thread. If the terminal device detects that the plug-ins to be loaded include the face recognition plug-in, that is, a plug-in identifier matches the identifier of the face recognition plug-in, it needs to obtain the input data of the face recognition plug-in. Since the face recognition plug-in recognizes individual video image frames, i.e., its input data type is video image frames, the frames can be extracted in order of frame sequence number while the application plays the video file and imported into the face recognition plug-in for the face recognition operation.
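Routing each plug-in the data type it consumes can be sketched as below; the plug-in names and the stream-keyed `video_file` structure are assumptions for illustration:

```python
# Illustrative mapping from plug-in to the input data type it consumes.
PLUGIN_INPUT_TYPE = {
    "voice_enhancement": "audio",
    "picture_brightening": "video_frames",
    "face_recognition": "video_frames",
}

def dispatch(loaded_plugins, video_file):
    """Pair each loaded plug-in with the data stream it requires.
    In the patent each resulting task would run on its own thread
    inside the playback application's process."""
    tasks = []
    for plugin in loaded_plugins:
        data_type = PLUGIN_INPUT_TYPE[plugin]
        tasks.append((plugin, video_file[data_type]))
    return tasks
```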
In S104, the face recognition plug-in is called to extract the face images contained in each of the video image frames.
In this embodiment, during playback the video playback application can output the video image frames to the graphics processing unit (GPU) for the display output flow while simultaneously importing them into the face recognition plug-in. The plug-in parses each input video image frame and extracts the face images it contains using a built-in face recognition algorithm. Since multiple subjects may have been filmed, a single video image frame may contain multiple face images.
Specifically, the face recognition plug-in may obtain face images as follows: slide built-in face templates of multiple sizes across the video image frame, and compute the matching degree between each framed region and the face template. If the matching degree between the two is detected to exceed a preset matching threshold, the currently framed region is deemed to contain a face and is identified as a face image.
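The sliding-window template match can be sketched on a grayscale frame represented as a nested list; the inverse mean-absolute-difference score below is a stand-in for whatever matching degree the plug-in actually computes:

```python
def find_faces(frame, template, threshold):
    """Slide `template` over `frame` and return top-left corners of
    windows whose match score exceeds `threshold`."""
    th, tw = len(template), len(template[0])
    hits = []
    for y in range(len(frame) - th + 1):
        for x in range(len(frame[0]) - tw + 1):
            # Mean absolute difference between window and template.
            diff = sum(
                abs(frame[y + i][x + j] - template[i][j])
                for i in range(th) for j in range(tw)
            ) / (th * tw)
            score = 1.0 / (1.0 + diff)  # higher means better match
            if score > threshold:
                hits.append((x, y))
    return hits
```

In practice the plug-in would repeat this with templates of several sizes, as the embodiment describes.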
Optionally, in this embodiment, the terminal device may perform face recognition first and execute the video playback operation only after determining the face images contained in the video image frames. In that case, the device can mark the recognized face images within the video image frames, for example by drawing a rectangular box around each face region, and output the marked frames so the user can quickly locate the face images in the video file. Refer to FIG. 2, which shows a schematic diagram of playing a video file according to an embodiment of this application.
In S105, a face image library of the video file is established according to the entity user corresponding to each of the face images.
In this embodiment, after obtaining the face images from each video image frame, the terminal device can determine the entity user to whom each face image belongs, classify the face images according to their entity users, and establish the face image library of the video file.
Specifically, in this embodiment, the terminal device can compute the similarity between any two face images: for example, convert each face image into a face feature vector, compute the Euclidean distance between the two feature vectors, and take the reciprocal of that distance as their similarity. Face images with high mutual similarity are identified as belonging to the same entity user, while face images belonging to different entity users are distinguished, thereby classifying the face images; the face region images belonging to one entity user are tagged with a single user identifier.
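The reciprocal-distance similarity described here can be sketched directly; the feature vectors are assumed to come from some upstream face encoder:

```python
import math

def similarity(vec_a, vec_b):
    """Reciprocal of the Euclidean distance between two face feature
    vectors, as in the embodiment; identical vectors yield infinity."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))
    return math.inf if dist == 0 else 1.0 / dist
```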
As can be seen from the above, the face image recognition method provided by the embodiments of this application receives a user-initiated video playback instruction, starts the video playback application, and loads the video playback plug-in indicated by the instruction so as to extend the application's functionality. If the loaded video playback plug-in is detected to be the face recognition plug-in, the video playback application can parse the video file, extract each video image frame, and import each frame into the face recognition plug-in, which recognizes the face images contained in each video image frame; a face image library associated with the video file is then built from all the face images, thereby achieving face recognition on dynamic video files. Compared with existing face image recognition techniques, this application does not require the user to manually capture video image frames and hand them to other applications for face recognition. Instead, by loading a face recognition plug-in into the video playback application, the face images contained in each video image frame are recognized automatically while the video file is being played, which improves the efficiency of face image recognition and reduces user operations.
Furthermore, since the face image recognition proceeds in step with playback of the video file, the user does not need to run face recognition after watching the video, which reduces the time consumed by the face recognition operation.
FIG. 3 shows a flowchart of a specific implementation of a face image recognition method provided by the second embodiment of this application. Referring to FIG. 3, relative to the embodiment of FIG. 1, the method of this embodiment further includes, before the starting of the video playback application and the loading of the video playback plug-in into it based on the plug-in identifier, steps S301 to S303, detailed as follows:
Further, before the starting of the video playback application and the loading of the video playback plug-in into the video playback application based on the plug-in identifier, the method further includes:
In S301, a plug-in data package of the face recognition plug-in is acquired.
In this embodiment, the terminal device can obtain the plug-in data package of the face recognition plug-in from a removable storage device or via network download. Optionally, the terminal device may check the integrity of the obtained package, for example by extracting the package's CRC checksum to determine whether the package is complete.
Preferably, in this embodiment, the terminal device can run the plug-in data package, import a preset test image into the running process, and obtain the output. If the position of the face image marked in the output matches the preset standard coordinates, the package is identified as a complete data package and S302 is executed; otherwise, if the package cannot run or fails to recognize the face image contained in the test image, it is identified as an abnormal data package, and the plug-in data package of the face recognition plug-in is re-acquired.
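This test-image validation step can be sketched as follows; `run_plugin` and the coordinate tolerance are placeholders for the real plug-in invocation, which the patent does not specify:

```python
def validate_package(run_plugin, test_image, standard_coords, tol=2):
    """Run the plug-in package on a known test image and accept it only
    if the reported face position matches the preset standard coordinates."""
    try:
        x, y = run_plugin(test_image)  # may raise if the package is broken
    except Exception:
        return False  # cannot run at all: abnormal package
    return (abs(x - standard_coords[0]) <= tol
            and abs(y - standard_coords[1]) <= tol)
```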
In S302, a version verification request is sent to a server, and a validity verification result fed back by the server based on the request is received; the version verification request contains the version identifier of the plug-in data package.
In this embodiment, the terminal device can extract the version identifier carried in the plug-in data package, generate a version verification request carrying that identifier, and send the request to the server corresponding to the video playback application. Because the face recognition plug-in is to be loaded into the video playback application, the application and the plug-in must be compatible. If the server detects that the face recognition plug-in and the video playback application are mutually compatible, it can return a verification-success result to the terminal device; conversely, if the server detects that the plug-in is incompatible with the video playback application, it returns a verification-failure result.
Optionally, if the validity verification result indicates failure, the terminal device can extract a download link carried in the result and re-acquire the plug-in data package of the face recognition plug-in through that link. When the server detects that the current face recognition plug-in is incompatible with the video playback application, it can provide a download link to a compatible face recognition plug-in, so that the terminal device can obtain a valid face recognition plug-in through the link.
In S303, if the validity verification result indicates success, the installation location of the video playback application is queried, and the call declaration file in the plug-in data package is added to the file directory associated with that installation location, so as to add the face recognition plug-in to the callable-plug-in list of the video playback application.
In this embodiment, if the terminal device determines that the plug-in data package is valid, i.e., the validity verification succeeded, it extracts the call declaration file contained in the package, queries the installation location of the video playback application, and adds the extracted call declaration file to the file directory corresponding to that location. Since the video playback application detects, during startup, the declaration files contained in the directory at its installation location and generates the callable-plug-in list from them, after these operations the application can call the face recognition plug-in in subsequent playback operations.
For example, the video playback application may be the VLC media player, whose plug-in location under its installation path is //plugins; the call declaration file of the face recognition plug-in is added to the directory corresponding to that location, i.e., the libfacereader_plugin.dll file of the FaceReader plug-in is placed in the plugins directory under the VLC player's installation directory.
In this embodiment of the application, validity checking of the plug-in data package ensures that the face recognition plug-in and the video playback application are mutually compatible, and adding the face recognition plug-in to an existing video playback application improves the convenience of the face recognition operation.
FIG. 4 shows a flowchart of a specific implementation of S105 of a face image recognition method provided by the third embodiment of this application. Referring to FIG. 4, relative to the embodiment of FIG. 1, S105 of this embodiment includes S1051 to S1054, detailed as follows:
Further, the establishing of the face image library of the video file according to the entity user corresponding to each of the face images includes:
In S1051, the similarity between the face images in any two of the video image frames is calculated.
In this embodiment, after extracting the face images contained in each video image frame, the terminal device can perform the similarity calculation on face images selected from any two video image frames. For example, suppose a first video image frame contains face images A, B, and C, and a second video image frame contains face images A', B', and C'; the terminal device can extract face image A from the first frame and face image C' from the second frame, and compute the similarity between face image A and face image C'. For the similarity calculation, refer to the description of S105: convert each face image into a face feature vector, compute the Euclidean distance between the feature vectors from the two video image frames, and take the reciprocal of that distance as the similarity between the two face images.
Preferably, when computing similarity, the terminal device selects face images from adjacent video image frames for comparison, calculates the differences between the center coordinates of the face images, identifies as target images those face images whose center-coordinate difference is smaller than a preset distance threshold, and computes the similarity only between target images. Because the two video image frames are adjacent, a face moves only a short distance between them; selecting the face images with small movement as target images and computing similarity only between them avoids a large number of useless face-similarity calculations and speeds up construction of the face library.
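The center-distance prefilter over adjacent frames can be sketched like this; each face is reduced to its center coordinate, and indices of the surviving pairs are returned for the later similarity pass:

```python
def candidate_pairs(faces_a, faces_b, dist_threshold):
    """Pair faces from two adjacent frames whose centers moved less
    than `dist_threshold`, skipping similarity work for the rest."""
    pairs = []
    for i, (xa, ya) in enumerate(faces_a):
        for j, (xb, yb) in enumerate(faces_b):
            # Squared center distance avoids a sqrt in the hot loop.
            if (xa - xb) ** 2 + (ya - yb) ** 2 < dist_threshold ** 2:
                pairs.append((i, j))
    return pairs
```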
In S1052, if the similarity is greater than a preset association threshold, the face images located in the two different video image frames are identified as associated images, and an association relationship between the two face images is established.
In this embodiment, the terminal device is configured with an association threshold. If the similarity between two face images is detected to be less than the preset association threshold, the two face images are judged to belong to different entity users, and no association relationship between them needs to be established. Conversely, if the similarity is detected to be greater than the preset association threshold, the two face images are judged to belong to the same entity user; they are identified as associated images of each other, and an association relationship between them is established so that all face images belonging to the same entity user can be determined from these relationships.
In S1053, based on the association relationships among all the face images, the face images are divided into multiple user face groups, and a user identifier is configured for each of the user face groups; all face images within a user face group are associated images of one another.
In this embodiment, after determining the association relationships among all face images, the terminal device can partition the face images according to those relationships, placing all mutually associated images into one user face group. Since two associated face images belong to the same entity user, this achieves grouping of the face images by entity user. A user identifier is then configured for each resulting face group; the identifier may be a user sequence number, which can be determined by the order of appearance of the earliest face image in each user face group.
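Partitioning mutually associated images into user face groups is a connected-components problem; a union-find sketch (face identifiers here are simply indices 0..n-1, an illustrative choice):

```python
def group_faces(n_faces, associations):
    """Split faces 0..n_faces-1 into groups of mutually associated images.
    `associations` lists (i, j) pairs whose similarity exceeded the
    association threshold. Group numbers follow first appearance, which
    mirrors numbering by the earliest face image in each group."""
    parent = list(range(n_faces))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in associations:
        parent[find(i)] = find(j)

    group_ids, labels = {}, []
    for i in range(n_faces):
        labels.append(group_ids.setdefault(find(i), len(group_ids)))
    return labels
```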
Optionally, in this embodiment, the terminal device may store standard face images of candidate users. The terminal device can match the face images in each user face group against the standard face images, identify from the matching result the user face group associated with a candidate user, and use that candidate user's identifier as the identifier of the user face group.
In S1054, the face image library is established according to the user face groups and the user identifiers.
In this embodiment, after identifying all user face groups and configuring a user identifier for each of them, the terminal device can build the face image library of the video file.
In this embodiment of the application, associated images are identified by computing the similarity between face images; based on the association relationships among them, the images are divided into user face groups belonging to the same entity users, which classifies the face images and improves the efficiency of face image management.
FIG. 5 shows a flowchart of a specific implementation of S1051 of a face image recognition method provided by the fourth embodiment of this application. Referring to FIG. 5, relative to the embodiment of FIG. 4, S1051 of this embodiment includes S501 to S505, detailed as follows:
Further, the calculating of the similarity between the face images in any two of the video image frames includes:
In S501, based on a preset face key feature list, the feature coordinates of each face key feature in the face image are marked.
In this embodiment, the terminal device can configure the face key feature list according to the facial features to be located. For example, the list can contain the four facial features of eyes, ears, mouth, and nose, and can also include eyebrows, forehead, and so on; the specific facial features included can be configured according to the required recognition accuracy. Based on the face key feature list, the terminal device can mark each face key feature in the face image and obtain the feature coordinates from the coordinates of each key feature within the face image.
In S502, a feature coordinate sequence of the face image is constructed according to the feature coordinates of all the face key features in the face key feature list.
In this embodiment, the terminal device can be configured with a sequence template that specifies the position associated with each face key feature within the template. The feature coordinates of the key features in the face image are filled into their associated positions in the template in order, thereby generating the feature coordinate sequence corresponding to the face image.
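Filling the sequence template can be sketched as follows; the feature names and template order are examples, not mandated by the patent:

```python
# Example sequence template: a fixed ordering of face key features.
SEQUENCE_TEMPLATE = ["left_eye", "right_eye", "nose", "mouth"]

def feature_sequence(marked_features):
    """Flatten the marked feature coordinates into one sequence,
    following the template order so that sequences from different
    face images are directly comparable."""
    seq = []
    for name in SEQUENCE_TEMPLATE:
        x, y = marked_features[name]
        seq.extend([x, y])
    return seq
```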
In S503, the feature distance value between the feature coordinate sequences of the face images of any two of the video image frames is calculated.
In this embodiment, after determining the feature coordinate sequences of the face images, the terminal device can compute the feature distance value between two feature coordinate sequences using a coordinate distance formula such as the Euclidean distance formula.
In S504, the number of interval image frames between any two of the video image frames is identified.
In this embodiment, the terminal device can also identify the frame sequence numbers of the two video image frames and determine the number of interval image frames between them from the difference of the sequence numbers. For example, if one video image frame has sequence number 65 and the other has sequence number 68, the number of interval image frames is 68 − 65 = 3.
In S505, the feature distance value and the number of interval image frames are imported into a preset similarity calculation model to obtain the similarity between the face images in the two video image frames. The similarity calculation model is specifically:
(similarity calculation formula, published in the original as image PCTCN2020105861-appb-000001)
where Similarity is the similarity; ActFrame is the number of interval image frames; FigDist is the feature distance value; BaseDist is a reference distance value; BaseFrame is the shooting frame rate of the video file; and StandardDist is a preset adjustment coefficient.
In this embodiment, because a larger number of interval image frames between two video image frames allows a longer face movement distance for the same entity user, the similarity between two face images can be normalized by the number of interval image frames, reducing the influence of the frame-count difference on the similarity calculation. The degree of difference between the feature distances is then judged from the difference between the feature distance value and the reference distance value, so that the similarity between the two face images is computed from the face feature coordinates.
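The exact formula is published only as an image (PCTCN2020105861-appb-000001), so the sketch below is one plausible reading of the surrounding description: normalize the feature distance by the interval frame count against the shooting frame rate, then compare the result with the reference distance. The functional form and the role of StandardDist as a smoothing/adjustment term are assumptions, not the patent's actual formula:

```python
def similarity_model(fig_dist, act_frame, base_dist, base_frame, standard_dist):
    """Hypothetical similarity model consistent with the description:
    more elapsed frames tolerate a larger feature distance."""
    # Normalize the feature displacement to a per-frame-rate scale
    # (assumption: this is the normalization the text refers to).
    per_frame_dist = fig_dist * base_frame / max(act_frame, 1)
    # Similarity decays as the normalized distance departs from the
    # reference distance; standard_dist acts as an adjustment coefficient.
    return standard_dist / (standard_dist + abs(per_frame_dist - base_dist))
```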
In this embodiment of the application, the face feature coordinates are identified and a face feature sequence is constructed; the similarity between face images is then computed from the number of interval image frames between the two video image frames and the feature distance value between the face feature sequences, which improves the accuracy of the similarity calculation.
FIG. 6 shows a flowchart of a specific implementation of S105 of a face image recognition method provided by the fifth embodiment of this application. Referring to FIG. 6, relative to any of the embodiments of FIGS. 1 to 5, S105 of this embodiment includes S601 to S603, detailed as follows:
Further, the establishing of the face image library of the video file according to the entity user corresponding to each of the face images includes:
In S601, the expression type of the face image is determined, and that expression type is identified as a reference expression.
In this embodiment, the terminal device can determine, through an expression recognition algorithm, the expression type shown in the face image at the moment of capture. For example, the expression type may be smiling, laughing, crying, sad, and so on; the expression type of the face image is identified as the reference expression.
In S602, a derivative image of the face image is output according to an expression conversion algorithm and the reference expression; the expression type of the derivative image is different from the expression type of the face image.
In this embodiment, the terminal device can adjust the expression conversion algorithm according to the reference expression, determining how to convert from the reference expression to other expressions, for example from a smiling expression to a crying expression, or from a smiling expression to a laughing expression. The terminal device can import the face image into the expression conversion algorithm adjusted for the reference expression and output derivative images corresponding to different expression types.
In S603, the face image library is generated according to the face image and the derivative images.
In this embodiment, after outputting the face images and the derivative images with different expressions obtained from those face images, the terminal device can establish the face image library, thereby expanding the library's content.
In this embodiment of the application, multiple derivative images with different expression types are obtained from a face image, which improves the richness of the face image library.
图7示出了本申请第六实施例提供的一种人脸图像的识别方法S104的具体实现流程图。参见图7,相对于图1至图5任一所述实施例,本实施例提供的一种人脸图像的识别方法中S104包括:S1041~S1043,具体详述如下:FIG. 7 shows a specific implementation flowchart of a face image recognition method S104 provided by the sixth embodiment of the present application. Referring to Fig. 7, with respect to any one of the embodiments described in Figs. 1 to 5, S104 in a face image recognition method provided in this embodiment includes: S1041 to S1043, which are detailed as follows:
进一步地,所述调用所述人脸识别插件提取各个所述视频图像帧包含的人脸图像,包括:Further, the invoking the face recognition plug-in to extract the face images contained in each of the video image frames includes:
在S1041中,基于视频图像帧中的RGB通道对应的图像数据,生成所述视频图像帧的图像矩阵。In S1041, based on the image data corresponding to the RGB channels in the video image frame, an image matrix of the video image frame is generated.
在本实施例中,终端设备在获取了视频图像帧后,在导入到人脸识别插件之前,可以对视频图像帧进行预处理,具体的操作为,根据视频图像帧在RGB三个通道上的图像数据进行数据融合,生成关于视频图像帧的图像矩阵。In this embodiment, after acquiring the video image frame and before importing it into the face recognition plug-in, the terminal device can preprocess the video image frame. Specifically, the image data of the video image frame on the three RGB channels is fused to generate an image matrix of the video image frame.
在S1042中,根据所述图像矩阵的矩阵尺寸,配置所述视频图像帧对应的卷积核,并通过所述卷积核对所述图像矩阵进行卷积操作,得到标准矩阵。In S1042, configure a convolution kernel corresponding to the video image frame according to the matrix size of the image matrix, and perform a convolution operation on the image matrix through the convolution kernel to obtain a standard matrix.
在本实施例中,终端设备在生成了视频图像帧的图像矩阵后,可以识别该图像矩阵的矩阵尺寸,并基于矩阵尺寸以及标准矩阵的矩阵尺寸,确定卷积操作所使用的卷积核,并通过该卷积核对图像矩阵进行卷积操作,从而将图像矩阵的尺寸调整为标准尺寸,得到标准矩阵。In this embodiment, after the terminal device generates the image matrix of the video image frame, it can identify the matrix size of the image matrix, determine the convolution kernel used for the convolution operation based on that size and the size of the standard matrix, and perform a convolution operation on the image matrix through the convolution kernel, thereby adjusting the image matrix to the standard size and obtaining the standard matrix.
在S1043中,将所述标准矩阵导入所述人脸识别插件的人脸识别算法,输出所述人脸图像。In S1043, the standard matrix is imported into the face recognition algorithm of the face recognition plug-in, and the face image is output.
在本实施例中,将标准矩阵导入到人脸识别插件,通过人脸识别插件内的人脸识别算法对标准矩阵进行解析,输出人脸图像。In this embodiment, the standard matrix is imported into the face recognition plug-in, the standard matrix is analyzed by the face recognition algorithm in the face recognition plug-in, and the face image is output.
在本申请实施例中,通过对视频图像帧进行预处理,得到标准矩阵,将标准矩阵导入到人脸识别插件,输出人脸图像,从而提高了人脸识别插件对于视频图像帧的处理效率以及人脸识别的准确率。In the embodiment of the present application, the standard matrix is obtained by preprocessing the video image frame, the standard matrix is imported into the face recognition plug-in, and the face image is output, thereby improving the processing efficiency of the face recognition plug-in on video image frames and the accuracy of face recognition.
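A minimal sketch of the S1041 to S1043 preprocessing above, under two stated assumptions that the application does not fix: channel fusion is a per-pixel average over R, G, B, and the convolution kernel is an averaging kernel whose size is derived so that a "valid" convolution shrinks the image matrix exactly to the standard size:

```python
import numpy as np

def fuse_rgb_channels(rgb_frame):
    """S1041: fuse the R, G, B channel data of a frame of shape (H, W, 3)
    into one image matrix; a per-pixel channel average is one plausible
    fusion rule (assumption, not specified by the application)."""
    return rgb_frame.mean(axis=2)

def to_standard_matrix(image_matrix, standard_shape):
    """S1042: configure the convolution kernel from the matrix size so that
    a 'valid' convolution reduces the image matrix to the standard size:
    output_dim = input_dim - kernel_dim + 1 along each axis."""
    h, w = image_matrix.shape
    sh, sw = standard_shape
    kh, kw = h - sh + 1, w - sw + 1              # kernel size from both sizes
    kernel = np.full((kh, kw), 1.0 / (kh * kw))  # averaging kernel (assumed)
    out = np.empty((sh, sw))
    for i in range(sh):
        for j in range(sw):
            out[i, j] = (image_matrix[i:i + kh, j:j + kw] * kernel).sum()
    return out
```

The resulting standard matrix would then be fed to the face recognition algorithm (S1043); the recognition step itself is not sketched here.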
图8示出了本申请第七实施例提供的一种人脸图像的识别方法S102的具体实现流程图。参见图8,相对于图1至图5任一所述实施例,本实施例提供的一种人脸图像的识别方法中S102包括:S1021~S1022,具体详述如下:FIG. 8 shows a specific implementation flowchart of a face image recognition method S102 provided by the seventh embodiment of the present application. Referring to Fig. 8, with respect to any one of the embodiments described in Figs. 1 to 5, S102 in a face image recognition method provided in this embodiment includes: S1021 to S1022, and the details are as follows:
在S1021中,根据预设的播放插件寻址表,查询所述插件标识对应的所述视频播放插件的安装地址。In S1021, query the installation address of the video playback plug-in corresponding to the plug-in identifier according to a preset playback plug-in addressing table.
在本实施例中,终端设备启动了视频播放应用,需要加载视频播放插件时,需要确定视频播放插件的安装位置,以运行对应的插件文件。因此,可以根据播放插件寻址表,查询插件标识关联的存储地址,该存储地址即为视频播放插件的安装地址。In this embodiment, when the terminal device has started the video playback application and needs to load the video playback plug-in, it needs to determine the installation location of the video playback plug-in in order to run the corresponding plug-in file. Therefore, the storage address associated with the plug-in identifier can be queried according to the playback plug-in addressing table, and this storage address is the installation address of the video playback plug-in.
在S1022中,从所述安装地址获取所述视频播放插件的插件文件,通过所述视频播放应用运行所述插件文件,以将所述视频播放插件加载至所述视频播放应用。In S1022, the plug-in file of the video playback plug-in is acquired from the installation address, and the plug-in file is run through the video playback application to load the video playback plug-in to the video playback application.
在本实施例中,终端设备从安装地址提取视频播放插件的插件文件,通过视频播放应用运行插件文件,即在视频播放应用对应的进程下创建新的并发线程,通过并发线程执行插件文件,实现将视频播放插件加载至视频播放应用。In this embodiment, the terminal device extracts the plug-in file of the video playback plug-in from the installation address and runs the plug-in file through the video playback application, that is, creates a new concurrent thread under the process corresponding to the video playback application and executes the plug-in file through the concurrent thread, thereby loading the video playback plug-in into the video playback application.
在本申请实施例中,通过播放插件寻址表确定视频播放插件的安装地址,实现将视频播放插件加载至视频播放应用,提高了加载操作的效率。In the embodiment of the present application, the installation address of the video playback plug-in is determined through the playback plug-in addressing table, so that the video playback plug-in can be loaded into the video playback application, which improves the efficiency of the loading operation.
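The S1021 to S1022 flow can be sketched as follows. The addressing-table contents, the example path, and `run_plugin_file` are illustrative stand-ins; the sketch only shows resolving the plug-in identifier to an installation address and executing the plug-in file on a new concurrent thread:

```python
import threading

# Hypothetical preset playback plug-in addressing table: plug-in identifier
# mapped to the installation address of the plug-in file (example path).
PLUGIN_ADDRESS_TABLE = {
    "face_recognition": "/plugins/face_recognition/plugin.so",
}

def lookup_install_address(plugin_id):
    # S1021: query the preset addressing table for the installation address.
    return PLUGIN_ADDRESS_TABLE[plugin_id]

def load_plugin(plugin_id, run_plugin_file):
    # S1022: fetch the plug-in file from the installation address and execute
    # it on a new concurrent thread under the player's process.
    address = lookup_install_address(plugin_id)
    t = threading.Thread(target=run_plugin_file, args=(address,))
    t.start()
    t.join()
```

Running the plug-in file on a separate thread keeps the playback application's main thread free, which matches the concurrent-thread loading described above.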
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution. The execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.
图9示出了本申请一实施例提供的一种人脸图像的识别设备的结构框图,该人脸图像的识别设备包括的各单元用于执行图1对应的实施例中的各步骤。具体请参阅图1与图1所对应的实施例中的相关描述。为了便于说明,仅示出了与本实施例相关的部分。FIG. 9 shows a structural block diagram of a face image recognition device provided by an embodiment of the present application, and each unit included in the face image recognition device is used to execute each step in the embodiment corresponding to FIG. 1. For details, please refer to FIG. 1 and the related description in the embodiment corresponding to FIG. 1. For ease of description, only the parts related to this embodiment are shown.
参见图9,所述人脸图像的识别设备包括:Referring to Figure 9, the facial image recognition device includes:
视频播放指令接收单元91,用于接收视频播放指令;所述视频播放指令携带有播放视频文件时所需调用的视频播放插件的插件标识;The video play instruction receiving unit 91 is configured to receive a video play instruction; the video play instruction carries the plug-in identifier of the video play plug-in that needs to be called when the video file is played;
视频播放应用启动单元92,用于启动视频播放应用,并基于所述插件标识加载所述视频播放插件至所述视频播放应用;The video playback application starting unit 92 is configured to start a video playback application, and load the video playback plug-in to the video playback application based on the plug-in identifier;
视频图像帧提取单元93,用于若所述插件标识与人脸识别插件的标识匹配,则通过加载插件后的所述视频播放应用提取所述视频文件的各个视频图像帧;The video image frame extraction unit 93 is configured to extract each video image frame of the video file through the video playback application after the plug-in is loaded if the plug-in ID matches the ID of the face recognition plug-in;
人脸图像识别单元94,用于调用所述人脸识别插件提取各个所述视频图像帧包含的人脸图像;The face image recognition unit 94 is configured to call the face recognition plug-in to extract the face images contained in each of the video image frames;
人脸图像库建立单元95,用于根据各个所述人脸图像对应的实体用户,建立所述视频文件的人脸图像库。The face image database establishment unit 95 is configured to establish a face image database of the video file according to the entity user corresponding to each of the face images.
可选地,所述人脸图像的识别设备还包括:Optionally, the facial image recognition device further includes:
插件数据包获取单元,用于获取所述人脸识别插件的插件数据包;A plug-in data package acquisition unit for acquiring the plug-in data package of the face recognition plug-in;
合法校验结果接收单元,用于向服务器发送版本校验请求,并接收所述服务器基于所述版本校验请求反馈的合法校验结果;所述版本校验请求包含所述插件数据包的版本标识;The legal verification result receiving unit is configured to send a version verification request to the server, and receive the legal verification result fed back by the server based on the version verification request; the version verification request includes the version identifier of the plug-in data package;
调用声明文件添加单元,用于若所述合法校验结果为校验成功,则查询所述视频播放应用的安装位置,并将所述插件数据包内的调用声明文件添加到所述安装位置关联的文件目录内,以添加所述人脸识别插件至所述视频播放应用的可调用插件列表。The call declaration file adding unit is configured to, if the legal verification result is a successful verification, query the installation location of the video playback application, and add the call declaration file in the plug-in data package to the file directory associated with the installation location, so as to add the face recognition plug-in to the callable plug-in list of the video playback application.
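A hedged sketch of this installation flow: ask the server to verify the package version, and only on success register the call declaration file in the application's callable plug-in list. `check_version` stands in for the server round trip, and the file directory is modeled as a plain list, since neither the protocol nor the directory layout is specified here:

```python
# Sketch of the plug-in installation flow (version check, then registration).
# check_version is a hypothetical stand-in for the server-side legality check.

def install_plugin(package, plugin_list, check_version):
    # Send the version identifier to the server and act on the result.
    result = check_version(package["version_id"])
    if result != "ok":
        return False
    # Add the call declaration file to the directory associated with the
    # installation location, registering the plug-in as callable.
    plugin_list.append(package["declaration_file"])
    return True
```

A usage example: a package whose version the server accepts is registered, while a rejected version leaves the plug-in list untouched.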
可选地,所述人脸图像库建立单元95包括:Optionally, the face image database establishment unit 95 includes:
相似度计算单元,用于计算任意两个所述视频图像帧内的所述人脸图像之间的相似度;A similarity calculation unit, configured to calculate the similarity between the face images in any two video image frames;
关联关系建立单元,用于若所述相似度大于预设的关联阈值,则识别位于两个不同的所述视频图像帧内的所述人脸图像为关联图像,建立两个所述人脸图像之间的关联关系;The association relationship establishment unit is configured to, if the similarity is greater than a preset association threshold, identify the face images located in two different video image frames as associated images, and establish an association relationship between the two face images;
用户人脸组划分单元,用于基于所有所述人脸图像的所述关联关系,划分为多个用户人脸组,并为各个所述用户人脸组配置用户标识;所述用户人脸组内的所有人脸图像互为所述关联图像;The user face group dividing unit is configured to divide all the face images into multiple user face groups based on the association relationships, and configure a user identifier for each of the user face groups; all the face images within a user face group are the associated images of one another;
第一人脸图像库建立单元,用于根据所述用户人脸组以及所述用户标识,建立所述人脸图像库。The first face image database establishment unit is configured to establish the face image database according to the user face group and the user identifier.
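The grouping step above amounts to computing connected components over the association relationships: any two face images whose similarity exceeds the threshold are linked, and linked images end up in the same user face group. One common way to sketch this is union-find; the similarity function and the `user_N` identifiers here are stand-ins:

```python
# Sketch: merge associated face images into user face groups via union-find.
# similarity(a, b) is a stand-in for the similarity calculation described
# in the text; user identifiers are illustrative.

def group_faces(face_ids, similarity, threshold):
    parent = {f: f for f in face_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    # Establish an association for every pair above the threshold.
    for i, a in enumerate(face_ids):
        for b in face_ids[i + 1:]:
            if similarity(a, b) > threshold:
                parent[find(a)] = find(b)

    # Collect the groups and configure a user identifier for each.
    groups = {}
    for f in face_ids:
        groups.setdefault(find(f), []).append(f)
    return {f"user_{n}": members
            for n, members in enumerate(groups.values())}
```

Within each returned group, every face image is (directly or transitively) an associated image of every other, matching the description of the user face groups.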
可选地,所述相似度计算单元包括:Optionally, the similarity calculation unit includes:
特征坐标标记单元,用于基于预设的人脸关键特征列表,标记出所述人脸图像中关于各个人脸关键特征的特征坐标;The feature coordinate marking unit is configured to mark the feature coordinates of each key feature of the face in the face image based on a preset list of key features of the face;
特征坐标序列生成单元,用于根据所述人脸关键特征列表内的所有人脸关键特征的特征坐标,构建所述人脸图像的特征坐标序列;The feature coordinate sequence generating unit is configured to construct the feature coordinate sequence of the face image according to the feature coordinates of the key features of all faces in the face key feature list;
特征距离值计算单元,用于计算任意两个所述视频图像帧的所述人脸图像的特征坐标序列之间的特征距离值;A feature distance value calculation unit, configured to calculate feature distance values between the feature coordinate sequences of the face images of any two video image frames;
间隔图像帧数识别单元,用于识别任意两个所述视频图像帧之间的间隔图像帧数;Interval image frame number identification unit for identifying the number of interval image frames between any two video image frames;
相似度转换单元,用于将所述特征距离值以及所述间隔图像帧数导入预设的相似度计算模型,得到两个所述视频图像帧内的所述人脸图像之间的相似度;所述相似度计算模型具体为:A similarity conversion unit, configured to import the characteristic distance value and the number of interval image frames into a preset similarity calculation model to obtain the similarity between the face images in the two video image frames; The similarity calculation model is specifically:
Figure PCTCN2020105861-appb-000002
其中,Similarity为所述相似度;ActFrame为所述间隔图像帧数;FigDist为所述特征距离值;BaseDist为基准距离值;BaseFrame为所述视频文件的拍摄帧率;StandardDist为预设调整系数。Wherein, Similarity is the similarity; ActFrame is the number of interval image frames; FigDist is the characteristic distance value; BaseDist is the reference distance value; BaseFrame is the shooting frame rate of the video file; StandardDist is the preset adjustment coefficient.
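The exact similarity formula is given only as a figure, so it is not reproduced here. The feature-distance input to that formula can, however, be sketched: each face image is reduced to a feature coordinate sequence over a fixed key-feature list, and the distance between two sequences is computed pointwise. The key-feature names and the mean-Euclidean-distance choice below are assumptions for illustration:

```python
import math

# Sketch of the feature coordinate sequence and feature distance value.
# KEY_FEATURES and the mean Euclidean distance are illustrative assumptions;
# the application does not fix the feature list or the distance measure.

KEY_FEATURES = ["left_eye", "right_eye", "nose_tip",
                "mouth_left", "mouth_right"]

def feature_sequence(landmarks):
    # Build the feature coordinate sequence in the fixed key-feature order.
    return [landmarks[name] for name in KEY_FEATURES]

def feature_distance(seq_a, seq_b):
    # Mean Euclidean distance over corresponding key-feature coordinates.
    dists = [math.dist(p, q) for p, q in zip(seq_a, seq_b)]
    return sum(dists) / len(dists)
```

This FigDist-style value, together with the interval frame count, would then feed the preset similarity calculation model shown in the figure.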
可选地,所述人脸图像库建立单元95包括:Optionally, the face image database establishment unit 95 includes:
基准表情识别单元,用于确定所述人脸图像的表情类型,并识别所述表情类型为基准表情;A reference expression recognition unit, configured to determine the expression type of the face image, and recognize the expression type as a reference expression;
衍生图像输出单元,用于根据表情转换算法以及所述基准表情,输出所述人脸图像的衍生图像;所述衍生图像的表情类型与所述人脸图像的表情类型不同;A derivative image output unit, configured to output a derivative image of the face image according to the expression conversion algorithm and the reference expression; the expression type of the derivative image is different from the expression type of the face image;
第二人脸图像库建立单元,用于根据所述人脸图像以及所述衍生图像,生成所述人脸图像库。The second face image database establishment unit is configured to generate the face image database according to the face image and the derivative image.
可选地,所述人脸图像识别单元94包括:Optionally, the face image recognition unit 94 includes:
图像矩阵生成单元,用于基于视频图像帧中的RGB通道对应的图像数据,生成所述视频图像帧的图像矩阵;An image matrix generating unit, configured to generate an image matrix of the video image frame based on the image data corresponding to the RGB channels in the video image frame;
标准矩阵生成单元,用于根据所述图像矩阵的矩阵尺寸,配置所述视频图像帧对应的卷积核,并通过所述卷积核对所述图像矩阵进行卷积操作,得到标准矩阵;A standard matrix generating unit, configured to configure a convolution kernel corresponding to the video image frame according to the matrix size of the image matrix, and perform a convolution operation on the image matrix through the convolution kernel to obtain a standard matrix;
标准矩阵导入单元,用于将所述标准矩阵导入所述人脸识别插件的人脸识别算法,输出所述人脸图像。The standard matrix importing unit is configured to import the standard matrix into the face recognition algorithm of the face recognition plug-in, and output the face image.
可选地,所述视频播放应用启动单元92包括:Optionally, the video playback application starting unit 92 includes:
安装地址获取单元,用于根据预设的播放插件寻址表,查询所述插件标识对应的所述视频播放插件的安装地址;The installation address obtaining unit is configured to query the installation address of the video playback plug-in corresponding to the plug-in identifier according to a preset playback plug-in addressing table;
视频播放插件加载单元,用于从所述安装地址获取所述视频播放插件的插件文件,通 过所述视频播放应用运行所述插件文件,以将所述视频播放插件加载至所述视频播放应用。The video playback plug-in loading unit is configured to obtain the plug-in file of the video playback plug-in from the installation address, and run the plug-in file through the video playback application to load the video playback plug-in to the video playback application.
因此,本申请实施例提供的人脸图像的识别设备同样可以无需用户手动截取视频图像帧,并交由其他应用进行人脸图像识别,而是可以通过在视频播放应用中加载人脸识别插件,在播放视频文件的同时,自动识别每个视频图像帧所包含的人脸图像,提高了人脸图像的识别效率,减少了用户的操作。另一方面,由于人脸图像的识别过程与视频文件的播放是同步进行的,用户无需在视频文件观看后,再执行人脸图像识别,从而减少了人脸识别操作的耗时。Therefore, the face image recognition device provided by the embodiment of the present application likewise does not require the user to manually capture video image frames and hand them over to other applications for face image recognition. Instead, by loading the face recognition plug-in in the video playback application, the face image contained in each video image frame is automatically recognized while the video file is being played, which improves the recognition efficiency of face images and reduces the user's operations. On the other hand, since the recognition process of the face images is synchronized with the playback of the video file, the user does not need to perform face image recognition again after watching the video file, thereby reducing the time consumed by the face recognition operation.
图10是本申请另一实施例提供的一种终端设备的示意图。如图10所示,该实施例的终端设备10包括:处理器100、存储器101以及存储在所述存储器101中并可在所述处理器100上运行的计算机可读指令102,例如人脸图像的识别程序。所述处理器100执行所述计算机可读指令102时实现上述各个人脸图像的识别方法实施例中的步骤,例如图1所示的S101至S105。或者,所述处理器100执行所述计算机可读指令102时实现上述各装置实施例中各单元的功能,例如图9所示模块91至95的功能。FIG. 10 is a schematic diagram of a terminal device provided by another embodiment of the present application. As shown in FIG. 10, the terminal device 10 of this embodiment includes: a processor 100, a memory 101, and computer-readable instructions 102 stored in the memory 101 and executable on the processor 100, such as a face image recognition program. When the processor 100 executes the computer-readable instructions 102, the steps in the above face image recognition method embodiments are implemented, for example, S101 to S105 shown in FIG. 1. Alternatively, when the processor 100 executes the computer-readable instructions 102, the functions of the units in the foregoing device embodiments are realized, for example, the functions of the modules 91 to 95 shown in FIG. 9.
示例性的,所述计算机可读指令102可以被分割成一个或多个单元,所述一个或者多个单元被存储在所述存储器101中,并由所述处理器100执行,以完成本申请。所述一个或多个单元可以是能够完成特定功能的一系列计算机可读指令段,该指令段用于描述所述计算机可读指令102在所述终端设备10中的执行过程。例如,所述计算机可读指令102可以被分割成视频播放指令接收单元、视频播放应用启动单元、视频图像帧提取单元、人脸图像识别单元以及人脸图像库建立单元,各单元具体功能如上所述。Exemplarily, the computer-readable instructions 102 may be divided into one or more units, and the one or more units are stored in the memory 101 and executed by the processor 100 to complete the present application. The one or more units may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 102 in the terminal device 10. For example, the computer-readable instructions 102 may be divided into a video playback instruction receiving unit, a video playback application starting unit, a video image frame extraction unit, a face image recognition unit, and a face image library establishment unit; the specific functions of each unit are as described above.
所述终端设备10可以是桌上型计算机、笔记本、掌上电脑及云端服务器等计算设备。所述终端设备可包括,但不仅限于,处理器100、存储器101。本领域技术人员可以理解,图10仅仅是终端设备10的示例,并不构成对终端设备10的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件,例如所述终端设备还可以包括输入输出设备、网络接入设备、总线等。The terminal device 10 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server. The terminal device may include, but is not limited to, a processor 100 and a memory 101. Those skilled in the art can understand that FIG. 10 is only an example of the terminal device 10, and does not constitute a limitation on the terminal device 10. It may include more or fewer components than shown in the figure, or a combination of certain components, or different components. For example, the terminal device may also include input and output devices, network access devices, buses, and so on.
所称处理器100可以是中央处理单元(Central Processing Unit,CPU),还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。The so-called processor 100 may be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Ready-made programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
所述存储器101可以是所述终端设备10的内部存储单元,例如终端设备10的硬盘或内存。所述存储器101也可以是所述终端设备10的外部存储设备,例如所述终端设备10上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。进一步地,所述存储器101还可以既包括所述终端设备10的内部存储单元也包括外部存储设备。所述存储器101用于存储所述计算机可读指令以及所述终端设备所需的其他程序和数据。所述存储器101还可以用于暂时地存储已经输出或者将要输出的数据。另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。The memory 101 may be an internal storage unit of the terminal device 10, such as a hard disk or a memory of the terminal device 10. The memory 101 may also be an external storage device of the terminal device 10, such as a plug-in hard disk equipped on the terminal device 10, a smart memory card (Smart Media Card, SMC), and a Secure Digital (SD) Card, Flash Card, etc. Further, the memory 101 may also include both an internal storage unit of the terminal device 10 and an external storage device. The memory 101 is used to store the computer-readable instructions and other programs and data required by the terminal device. The memory 101 can also be used to temporarily store data that has been output or will be output. In addition, the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
本申请还提供一种计算机可读存储介质,该计算机可读存储介质可以为非易失性计算机可读存储介质,也可以为易失性计算机可读存储介质。计算机可读存储介质存储有计算机指令,当所述计算机指令在计算机上运行时,使得计算机执行所述人脸图像的识别方法的步骤。The present application also provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium. The computer-readable storage medium stores computer instructions, and when the computer instructions are run on a computer, the computer is caused to execute the steps of the method for recognizing the face image.
以上所述实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围,均应包含在本申请的保护范围之内。The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them; although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications can still be made to the technical solutions recorded in the foregoing embodiments, or some of the technical features can be equivalently replaced; these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present application, and should all be included within the scope of protection of this application.

Claims (20)

  1. 一种人脸图像的识别方法,其中,包括:A face image recognition method, which includes:
    接收视频播放指令;所述视频播放指令携带有播放视频文件时所需调用的视频播放插件的插件标识;Receiving a video play instruction; the video play instruction carries the plug-in identifier of the video play plug-in that needs to be called when the video file is played;
    启动视频播放应用,并基于所述插件标识加载所述视频播放插件至所述视频播放应用;Start a video playback application, and load the video playback plug-in to the video playback application based on the plug-in identifier;
    若所述插件标识与人脸识别插件的标识匹配,则通过加载插件后的所述视频播放应用提取所述视频文件的各个视频图像帧;If the plug-in ID matches the ID of the face recognition plug-in, extract each video image frame of the video file through the video playback application after the plug-in is loaded;
    调用所述人脸识别插件提取各个所述视频图像帧包含的人脸图像;Calling the face recognition plug-in to extract the face images contained in each of the video image frames;
    根据各个所述人脸图像对应的实体用户,建立所述视频文件的人脸图像库。According to the entity user corresponding to each of the facial images, a facial image library of the video file is established.
  2. 根据权利要求1所述的识别方法,其中,在所述启动视频播放应用,并基于所述插件标识加载所述视频播放插件至所述视频播放应用之前,还包括:The recognition method according to claim 1, wherein, before the starting of the video playback application and the loading of the video playback plug-in to the video playback application based on the plug-in identifier, the method further comprises:
    获取所述人脸识别插件的插件数据包;Acquiring the plug-in data package of the face recognition plug-in;
    向服务器发送版本校验请求,并接收所述服务器基于所述版本校验请求反馈的合法校验结果;所述版本校验请求包含所述插件数据包的版本标识;Sending a version verification request to the server, and receiving a legal verification result fed back by the server based on the version verification request; the version verification request includes the version identification of the plug-in data package;
    若所述合法校验结果为校验成功,则查询所述视频播放应用的安装位置,并将所述插件数据包内的调用声明文件添加到所述安装位置关联的文件目录内,以添加所述人脸识别插件至所述视频播放应用的可调用插件列表。If the legal verification result is a successful verification, querying the installation location of the video playback application, and adding the call declaration file in the plug-in data package to the file directory associated with the installation location, so as to add the face recognition plug-in to the callable plug-in list of the video playback application.
  3. 根据权利要求1所述的识别方法,其中,所述根据各个所述人脸图像对应的实体用户,建立所述视频文件的人脸图像库,包括:The recognition method according to claim 1, wherein said establishing a face image library of said video file according to the entity user corresponding to each said face image comprises:
    计算任意两个所述视频图像帧内的所述人脸图像之间的相似度;Calculating the similarity between the face images in any two video image frames;
    若所述相似度大于预设的关联阈值,则识别位于两个不同的所述视频图像帧内的所述人脸图像为关联图像,建立两个所述人脸图像之间的关联关系;If the similarity is greater than a preset association threshold, identifying the face images located in two different video image frames as associated images, and establishing an association relationship between the two face images;
    基于所有所述人脸图像的所述关联关系,划分为多个用户人脸组,并为各个所述用户人脸组配置用户标识;所述用户人脸组内的所有人脸图像互为所述关联图像;Dividing all the face images into multiple user face groups based on the association relationships, and configuring a user identifier for each of the user face groups; all the face images within a user face group are the associated images of one another;
    根据所述用户人脸组以及所述用户标识,建立所述人脸图像库。The face image database is established according to the user face group and the user identifier.
  4. 根据权利要求3所述的识别方法,其中,所述计算任意两个所述视频图像帧内的所述人脸图像之间的相似度,包括:The recognition method according to claim 3, wherein the calculating the similarity between the face images in any two of the video image frames comprises:
    基于预设的人脸关键特征列表,标记出所述人脸图像中关于各个人脸关键特征的特征坐标;Based on a preset list of key face features, mark the feature coordinates of each key feature of the face in the face image;
    根据所述人脸关键特征列表内的所有人脸关键特征的特征坐标,构建所述人脸图像的特征坐标序列;Constructing the feature coordinate sequence of the face image according to the feature coordinates of the key features of all faces in the face key feature list;
    计算任意两个所述视频图像帧的所述人脸图像的特征坐标序列之间的特征距离值;Calculating the feature distance value between the feature coordinate sequences of the face image of any two video image frames;
    识别任意两个所述视频图像帧之间的间隔图像帧数;Identifying the number of interval image frames between any two video image frames;
    将所述特征距离值以及所述间隔图像帧数导入预设的相似度计算模型,得到两个所述视频图像帧内的所述人脸图像之间的相似度;所述相似度计算模型具体为:Importing the characteristic distance value and the number of interval image frames into a preset similarity calculation model to obtain the similarity between the face images in the two video image frames; the similarity calculation model is specifically:
    Figure PCTCN2020105861-appb-100001
    其中,Similarity为所述相似度;ActFrame为所述间隔图像帧数;FigDist为所述特征距离值;BaseDist为基准距离值;BaseFrame为所述视频文件的拍摄帧率;StandardDist为预设调整系数。Wherein, Similarity is the similarity; ActFrame is the number of interval image frames; FigDist is the characteristic distance value; BaseDist is the reference distance value; BaseFrame is the shooting frame rate of the video file; StandardDist is the preset adjustment coefficient.
  5. 根据权利要求1-4任一项所述的识别方法,其中,所述根据各个所述人脸图像对应的实体用户,建立所述视频文件的人脸图像库,包括:The recognition method according to any one of claims 1 to 4, wherein the establishing the face image library of the video file according to the entity user corresponding to each of the face images comprises:
    确定所述人脸图像的表情类型,并识别所述表情类型为基准表情;Determining the expression type of the face image, and identifying the expression type as a reference expression;
    根据表情转换算法以及所述基准表情,输出所述人脸图像的衍生图像;所述衍生图像的表情类型与所述人脸图像的表情类型不同;Outputting a derivative image of the face image according to the expression conversion algorithm and the reference expression; the expression type of the derivative image is different from the expression type of the face image;
    根据所述人脸图像以及所述衍生图像,生成所述人脸图像库。According to the face image and the derivative image, the face image library is generated.
  6. 根据权利要求1-4任一项所述的识别方法,其中,所述调用所述人脸识别插件提取各个所述视频图像帧包含的人脸图像,包括:The recognition method according to any one of claims 1 to 4, wherein the invoking the face recognition plug-in to extract the face images contained in each of the video image frames comprises:
    基于视频图像帧中的RGB通道对应的图像数据,生成所述视频图像帧的图像矩阵;Generating an image matrix of the video image frame based on the image data corresponding to the RGB channels in the video image frame;
    根据所述图像矩阵的矩阵尺寸,配置所述视频图像帧对应的卷积核,并通过所述卷积核对所述图像矩阵进行卷积操作,得到标准矩阵;Configure a convolution kernel corresponding to the video image frame according to the matrix size of the image matrix, and perform a convolution operation on the image matrix through the convolution kernel to obtain a standard matrix;
    将所述标准矩阵导入所述人脸识别插件的人脸识别算法,输出所述人脸图像。Import the standard matrix into the face recognition algorithm of the face recognition plug-in, and output the face image.
  7. 根据权利要求1-4任一项所述的识别方法,其中,所述启动视频播放应用,并基于所述插件标识加载所述视频播放插件至所述视频播放应用,包括:The identification method according to any one of claims 1 to 4, wherein the starting a video playing application and loading the video playing plug-in to the video playing application based on the plug-in identifier comprises:
    根据预设的播放插件寻址表,查询所述插件标识对应的所述视频播放插件的安装地址;Query the installation address of the video playback plug-in corresponding to the plug-in identifier according to the preset playback plug-in addressing table;
    从所述安装地址获取所述视频播放插件的插件文件,通过所述视频播放应用运行所述插件文件,以将所述视频播放插件加载至所述视频播放应用。Obtain the plug-in file of the video playback plug-in from the installation address, and run the plug-in file through the video playback application to load the video playback plug-in to the video playback application.
  8. 一种人脸图像的识别设备,其中,包括:A face image recognition device, which includes:
    视频播放指令接收单元,用于接收视频播放指令;所述视频播放指令携带有播放视频文件时所需调用的视频播放插件的插件标识;The video play instruction receiving unit is configured to receive the video play instruction; the video play instruction carries the plug-in identifier of the video play plug-in that needs to be called when the video file is played;
    视频播放应用启动单元,用于启动视频播放应用,并基于所述插件标识加载所述视频播放插件至所述视频播放应用;A video playback application starting unit, configured to start a video playback application, and load the video playback plug-in to the video playback application based on the plug-in identifier;
    视频图像帧提取单元,用于若所述插件标识与人脸识别插件的标识匹配,则通过加载插件后的所述视频播放应用提取所述视频文件的各个视频图像帧;The video image frame extraction unit is configured to extract each video image frame of the video file through the video playback application after the plug-in is loaded if the plug-in ID matches the ID of the face recognition plug-in;
    人脸图像识别单元,用于调用所述人脸识别插件提取各个所述视频图像帧包含的人脸图像;The face image recognition unit is configured to call the face recognition plug-in to extract the face images contained in each of the video image frames;
    人脸图像库建立单元,用于根据各个所述人脸图像对应的实体用户,建立所述视频文件的人脸图像库。The face image database establishment unit is used to establish the face image database of the video file according to the entity user corresponding to each of the face images.
  9. 根据权利要求8所述的识别设备,其中,所述人脸图像的识别设备还包括:The recognition device according to claim 8, wherein the recognition device of the face image further comprises:
    插件数据包获取单元,用于获取所述人脸识别插件的插件数据包;A plug-in data package acquisition unit for acquiring the plug-in data package of the face recognition plug-in;
    合法校验结果接收单元,用于向服务器发送版本校验请求,并接收所述服务器基于所述版本校验请求反馈的合法校验结果;所述版本校验请求包含所述插件数据包的版本标识;The legal verification result receiving unit is configured to send a version verification request to the server, and receive the legal verification result fed back by the server based on the version verification request; the version verification request includes the version identifier of the plug-in data package;
    调用声明文件添加单元,用于若所述合法校验结果为校验成功,则查询所述视频播放应用的安装位置,并将所述插件数据包内的调用声明文件添加到所述安装位置关联的文件目录内,以添加所述人脸识别插件至所述视频播放应用的可调用插件列表。The call declaration file adding unit is configured to, if the legal verification result is a successful verification, query the installation location of the video playback application, and add the call declaration file in the plug-in data package to the file directory associated with the installation location, so as to add the face recognition plug-in to the callable plug-in list of the video playback application.
  10. The recognition device according to claim 8, wherein the face image library establishment unit comprises:
    a similarity calculation unit, configured to calculate the similarity between the face images in any two of the video image frames;
    an association relationship establishment unit, configured to, if the similarity is greater than a preset association threshold, identify the face images located in the two different video image frames as associated images and establish an association relationship between the two face images;
    a user face group division unit, configured to divide all of the face images into a plurality of user face groups based on their association relationships and to configure a user identifier for each of the user face groups, wherein all face images within a user face group are associated images of one another; and
    a first face image library establishment unit, configured to establish the face image library according to the user face groups and the user identifiers.
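The grouping described in claims 10 and 13 — any two faces whose similarity exceeds the threshold are associated images, and association propagates so that each user face group contains only mutually associated images — amounts to partitioning the faces into connected components. A minimal sketch using union-find (the data structure is a choice made for this example; the claims do not prescribe one):

```python
def group_faces(faces, similarity, threshold):
    """Partition face images into user face groups: faces whose pairwise
    similarity exceeds the threshold end up in the same group,
    transitively, and each group gets a user identifier."""
    parent = list(range(len(faces)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Establish an association for every pair above the threshold.
    for i in range(len(faces)):
        for j in range(i + 1, len(faces)):
            if similarity(faces[i], faces[j]) > threshold:
                union(i, j)

    # Collect components and assign a user identifier to each group.
    groups = {}
    for i in range(len(faces)):
        groups.setdefault(find(i), []).append(faces[i])
    return {f"user_{k}": members for k, members in enumerate(groups.values())}
```

For example, with a similarity function that links faces less than 5 units apart, `group_faces([0, 1, 10, 11, 50], ...)` yields three groups: {0, 1}, {10, 11}, and {50}.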
  11. A terminal device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, performs the following steps:
    receiving a video play instruction, wherein the video play instruction carries a plug-in identifier of the video playback plug-in to be called when playing a video file;
    starting a video playback application, and loading the video playback plug-in into the video playback application based on the plug-in identifier;
    if the plug-in identifier matches the identifier of a face recognition plug-in, extracting each video image frame of the video file through the video playback application after the plug-in is loaded;
    calling the face recognition plug-in to extract the face images contained in each of the video image frames; and
    establishing a face image library of the video file according to the entity user corresponding to each of the face images.
  12. The terminal device according to claim 11, wherein before the starting of the video playback application and the loading of the video playback plug-in into the video playback application based on the plug-in identifier, the processor, when executing the computer-readable instructions, further performs the following steps:
    acquiring a plug-in data package of the face recognition plug-in;
    sending a version verification request to a server, and receiving a legality verification result fed back by the server based on the version verification request, wherein the version verification request contains a version identifier of the plug-in data package; and
    if the legality verification result indicates a successful verification, querying the installation location of the video playback application, and adding the calling declaration file in the plug-in data package to the file directory associated with the installation location, so as to add the face recognition plug-in to the callable plug-in list of the video playback application.
  13. The terminal device according to claim 11, wherein the establishing of the face image library of the video file according to the entity user corresponding to each of the face images comprises:
    calculating the similarity between the face images in any two of the video image frames;
    if the similarity is greater than a preset association threshold, identifying the face images located in the two different video image frames as associated images, and establishing an association relationship between the two face images;
    dividing all of the face images into a plurality of user face groups based on their association relationships, and configuring a user identifier for each of the user face groups, wherein all face images within a user face group are associated images of one another; and
    establishing the face image library according to the user face groups and the user identifiers.
  14. The terminal device according to claim 13, wherein the calculating of the similarity between the face images in any two of the video image frames comprises:
    marking, based on a preset face key feature list, the feature coordinates of each face key feature in the face image;
    constructing a feature coordinate sequence of the face image according to the feature coordinates of all face key features in the face key feature list;
    calculating a feature distance value between the feature coordinate sequences of the face images in any two of the video image frames;
    identifying the number of interval image frames between any two of the video image frames; and
    importing the feature distance value and the number of interval image frames into a preset similarity calculation model to obtain the similarity between the face images in the two video image frames, wherein the similarity calculation model is specifically:
    Figure PCTCN2020105861-appb-100002
    wherein Similarity is the similarity; ActFrame is the number of interval image frames; FigDist is the feature distance value; BaseDist is a reference distance value; BaseFrame is the shooting frame rate of the video file; and StandardDist is a preset adjustment coefficient.
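The exact similarity model is in the referenced figure, which is not reproduced in this text, so only its variables are known. A hedged stand-in that respects the stated roles of those variables — a larger FigDist lowers similarity, while a larger ActFrame relative to the shooting frame rate BaseFrame tolerates more movement between frames — might look like the sketch below; the functional form here is an assumption for illustration, not the patent's formula.

```python
import math

def feature_distance(seq_a, seq_b):
    """Mean Euclidean distance between paired key-feature coordinates
    of two feature coordinate sequences (FigDist in the claim)."""
    assert len(seq_a) == len(seq_b)
    return sum(math.dist(p, q) for p, q in zip(seq_a, seq_b)) / len(seq_a)

def similarity(fig_dist, act_frame, base_dist=1.0, base_frame=25,
               standard_dist=1.0):
    """Illustrative stand-in for the similarity calculation model.
    Faces farther apart in time (more interval frames) are allowed a
    larger feature distance before similarity drops."""
    # Tolerance grows with the frame gap, scaled by the frame rate and
    # the preset adjustment coefficient.
    tolerance = standard_dist * (1 + act_frame / base_frame)
    return max(0.0, 1.0 - fig_dist / (base_dist * tolerance))
```

With this form, identical coordinate sequences score 1.0, and the same feature distance scores higher when the two frames are further apart, matching the intuition that a face may move more across a longer interval.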
  15. The terminal device according to any one of claims 11 to 14, wherein the establishing of the face image library of the video file according to the entity user corresponding to each of the face images comprises:
    determining the expression type of the face image, and identifying the expression type as a reference expression;
    outputting a derivative image of the face image according to an expression conversion algorithm and the reference expression, wherein the expression type of the derivative image is different from the expression type of the face image; and
    generating the face image library according to the face image and the derivative image.
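The enrichment step of claims 15 and 20 — classify each face's expression as its reference expression, then synthesize derivative images whose expressions differ from it — can be sketched as below. `classify_expression` and `convert_expression` are placeholders for the unspecified expression conversion algorithm, and the expression-type list is illustrative only.

```python
EXPRESSION_TYPES = ["neutral", "smile", "surprise"]  # illustrative set

def build_face_library(face_images, classify_expression, convert_expression):
    """Build a face image library containing each original face plus
    derivative images covering the other expression types."""
    library = []
    for face in face_images:
        base = classify_expression(face)   # the reference expression
        library.append((face, base))
        for target in EXPRESSION_TYPES:
            if target != base:             # derivative must differ in type
                library.append((convert_expression(face, target), target))
    return library
```

Storing derivatives alongside the reference face means a later lookup can match the same user even when the query image shows a different expression.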
  16. A computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    receiving a video play instruction, wherein the video play instruction carries a plug-in identifier of the video playback plug-in to be called when playing a video file;
    starting a video playback application, and loading the video playback plug-in into the video playback application based on the plug-in identifier;
    if the plug-in identifier matches the identifier of a face recognition plug-in, extracting each video image frame of the video file through the video playback application after the plug-in is loaded;
    calling the face recognition plug-in to extract the face images contained in each of the video image frames; and
    establishing a face image library of the video file according to the entity user corresponding to each of the face images.
  17. The computer-readable storage medium according to claim 16, wherein before the starting of the video playback application and the loading of the video playback plug-in into the video playback application based on the plug-in identifier, the computer-readable instructions, when executed by the processor, further implement the following steps:
    acquiring a plug-in data package of the face recognition plug-in;
    sending a version verification request to a server, and receiving a legality verification result fed back by the server based on the version verification request, wherein the version verification request contains a version identifier of the plug-in data package; and
    if the legality verification result indicates a successful verification, querying the installation location of the video playback application, and adding the calling declaration file in the plug-in data package to the file directory associated with the installation location, so as to add the face recognition plug-in to the callable plug-in list of the video playback application.
  18. The computer-readable storage medium according to claim 16, wherein the establishing of the face image library of the video file according to the entity user corresponding to each of the face images comprises:
    calculating the similarity between the face images in any two of the video image frames;
    if the similarity is greater than a preset association threshold, identifying the face images located in the two different video image frames as associated images, and establishing an association relationship between the two face images;
    dividing all of the face images into a plurality of user face groups based on their association relationships, and configuring a user identifier for each of the user face groups, wherein all face images within a user face group are associated images of one another; and
    establishing the face image library according to the user face groups and the user identifiers.
  19. The computer-readable storage medium according to claim 18, wherein the calculating of the similarity between the face images in any two of the video image frames comprises:
    marking, based on a preset face key feature list, the feature coordinates of each face key feature in the face image;
    constructing a feature coordinate sequence of the face image according to the feature coordinates of all face key features in the face key feature list;
    calculating a feature distance value between the feature coordinate sequences of the face images in any two of the video image frames;
    identifying the number of interval image frames between any two of the video image frames; and
    importing the feature distance value and the number of interval image frames into a preset similarity calculation model to obtain the similarity between the face images in the two video image frames, wherein the similarity calculation model is specifically:
    Figure PCTCN2020105861-appb-100003
    wherein Similarity is the similarity; ActFrame is the number of interval image frames; FigDist is the feature distance value; BaseDist is a reference distance value; BaseFrame is the shooting frame rate of the video file; and StandardDist is a preset adjustment coefficient.
  20. The computer-readable storage medium according to any one of claims 16 to 19, wherein the establishing of the face image library of the video file according to the entity user corresponding to each of the face images comprises:
    determining the expression type of the face image, and identifying the expression type as a reference expression;
    outputting a derivative image of the face image according to an expression conversion algorithm and the reference expression, wherein the expression type of the derivative image is different from the expression type of the face image; and
    generating the face image library according to the face image and the derivative image.
PCT/CN2020/105861 2020-02-11 2020-07-30 Face image recognition method and apparatus WO2021159672A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010087125.8 2020-02-11
CN202010087125.8A CN111290800A (en) 2020-02-11 2020-02-11 Face image recognition method and device

Publications (1)

Publication Number Publication Date
WO2021159672A1 true WO2021159672A1 (en) 2021-08-19

Family

ID=71029984

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105861 WO2021159672A1 (en) 2020-02-11 2020-07-30 Face image recognition method and apparatus

Country Status (2)

Country Link
CN (1) CN111290800A (en)
WO (1) WO2021159672A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114095781A (en) * 2021-11-02 2022-02-25 北京鲸鲮信息系统技术有限公司 Multimedia data processing method and device, electronic equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111290800A (en) * 2020-02-11 2020-06-16 深圳壹账通智能科技有限公司 Face image recognition method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567720A (en) * 2011-12-26 2012-07-11 广州市千钧网络科技有限公司 Face identification method and face identification device for Flash online video
CN107123079A (en) * 2016-02-24 2017-09-01 掌赢信息科技(上海)有限公司 One kind expression moving method and electronic equipment
CN108875531A (en) * 2018-01-18 2018-11-23 北京迈格威科技有限公司 Method for detecting human face, device, system and computer storage medium
CN110472516A (en) * 2019-07-23 2019-11-19 腾讯科技(深圳)有限公司 A kind of construction method, device, equipment and the system of character image identifying system
US20200005065A1 (en) * 2018-06-28 2020-01-02 Google Llc Object classification for image recognition processing
CN110990604A (en) * 2019-11-28 2020-04-10 浙江大华技术股份有限公司 Image base generation method, face recognition method and intelligent access control system
CN111290800A (en) * 2020-02-11 2020-06-16 深圳壹账通智能科技有限公司 Face image recognition method and device



Also Published As

Publication number Publication date
CN111290800A (en) 2020-06-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20919084

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as the address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 091222)

122 Ep: pct application non-entry in european phase

Ref document number: 20919084

Country of ref document: EP

Kind code of ref document: A1