WO2020019591A1 - Method and device used for generating information - Google Patents

Method and device used for generating information

Info

Publication number
WO2020019591A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
face image
recognition model
extracted
information
Prior art date
Application number
PCT/CN2018/116182
Other languages
French (fr)
Chinese (zh)
Inventor
Chen Riwei (陈日伟)
Original Assignee
Beijing ByteDance Network Technology Co., Ltd. (北京字节跳动网络技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co., Ltd.
Publication of WO2020019591A1 publication Critical patent/WO2020019591A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions

Definitions

  • Embodiments of the present application relate to the field of computer technology, and in particular, to a method and an apparatus for generating information.
  • The key points of a face refer to points in the face that have obvious semantic discrimination, such as the points corresponding to the nose and the points corresponding to the eyes.
  • Face key point detection usually refers to detecting such key points in a face image.
  • Based on the detected key points, functions such as adding special effects, 3D modeling of the face, and beautified photography can be realized.
  • the embodiments of the present application provide a method and device for generating information.
  • an embodiment of the present application provides a method for generating information.
  • The method includes: acquiring an image to be identified, where the image to be identified includes a face image; extracting a face image from the image to be identified, and inputting the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image.
  • A candidate recognition model matching the obtained recognition result is selected from a candidate recognition model set as the second recognition model.
  • the candidate recognition model in the candidate recognition model set is a pre-trained model for identifying different classes of faces to generate key point information.
  • the face image is input to the selected second recognition model to obtain key point information corresponding to the extracted face image, wherein the key point information is used to characterize the position of the key point in the face image in the face image.
  • In some embodiments, inputting the extracted face image into the pre-trained first recognition model to obtain the recognition result corresponding to the extracted face image includes: inputting the extracted face image into the pre-trained first recognition model to obtain the recognition result and reference key point information corresponding to the extracted face image, where the reference key point information is used to characterize the positions, within the face image, of reference key points in the face image; and inputting the extracted face image into the selected second recognition model to obtain the key point information corresponding to the extracted face image includes: inputting the extracted face image and the obtained reference key point information into the selected second recognition model to obtain the key point information corresponding to the extracted face image.
  • In some embodiments, extracting a face image from the image to be identified includes: inputting the image to be identified into a pre-trained third recognition model to obtain position information for characterizing the position, within the image to be identified, of the face image in the image to be identified; and, based on the obtained position information, extracting the face image from the image to be identified.
  • In some embodiments, inputting the extracted face image into the selected second recognition model to obtain the key point information corresponding to the extracted face image includes: inputting the extracted face image into the selected second recognition model to obtain the key point information and matching information corresponding to the extracted face image, where the matching information includes a matching index used to characterize the degree of matching between the category of the face corresponding to the input face image and the category of the face corresponding to the second recognition model.
  • obtaining the image to be identified includes: selecting an image to be identified from an image sequence corresponding to the target video, where the target video is a video obtained by photographing a face.
  • In some embodiments, the method further includes: selecting, from the image sequence, an image that is located after the image to be recognized and adjacent to the image to be recognized as a candidate image to be recognized; extracting a face image from the candidate image to be recognized as a candidate face image, and determining the face image extracted from the image to be recognized as a reference face image; determining whether the matching index in the matching information corresponding to the determined reference face image meets a preset condition; and, in response to determining that it does, inputting the extracted candidate face image into the second recognition model into which the determined reference face image was input, to obtain key point information and matching information corresponding to the extracted candidate face image.
  • an embodiment of the present application provides an apparatus for generating information.
  • The apparatus includes: an image acquiring unit configured to acquire an image to be identified, where the image to be identified includes a face image; a first input unit configured to extract a face image from the image to be identified and input the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image; a model selection unit configured to select, from a candidate recognition model set, a candidate recognition model matching the obtained recognition result as the second recognition model, where the candidate recognition models in the candidate recognition model set are pre-trained models for identifying different categories of faces to generate key point information; and a second input unit configured to input the extracted face image into the selected second recognition model to obtain key point information corresponding to the extracted face image, where the key point information is used to characterize the positions, within the face image, of key points in the face image.
  • In some embodiments, the first input unit is further configured to: input the extracted face image into the pre-trained first recognition model to obtain the recognition result and reference key point information corresponding to the extracted face image, where the reference key point information is used to characterize the positions, within the face image, of reference key points in the face image; and the second input unit is further configured to: input the extracted face image and the obtained reference key point information into the selected second recognition model to obtain the key point information corresponding to the extracted face image.
  • In some embodiments, the first input unit includes: a first input module configured to input the image to be identified into a pre-trained third recognition model to obtain position information for characterizing the position, within the image to be identified, of the face image in the image to be identified; and an image extraction module configured to extract the face image from the image to be identified based on the obtained position information.
  • In some embodiments, the second input unit is further configured to input the extracted face image into the selected second recognition model to obtain key point information and matching information corresponding to the extracted face image, where the matching information includes a matching index used to characterize the degree of matching between the category of the face corresponding to the input face image and the category of the face corresponding to the second recognition model.
  • the image acquisition unit is further configured to select an image to be identified from an image sequence corresponding to the target video, where the target video is a video obtained by photographing a face.
  • In some embodiments, the apparatus further includes: an image selection unit configured to select, from the image sequence, an image located after the image to be identified and adjacent to the image to be identified as a candidate image to be identified; an image determination unit configured to extract a face image from the candidate image to be identified as a candidate face image, and determine the face image extracted from the image to be identified as a reference face image; a condition determining unit configured to determine whether the matching index in the matching information corresponding to the determined reference face image meets a preset condition; and a third input unit configured to, in response to determining that it does, input the extracted candidate face image into the second recognition model into which the determined reference face image was input, to obtain key point information and matching information corresponding to the extracted candidate face image.
  • In a further aspect, an embodiment of the present application provides an electronic device including: one or more processors; and a storage device storing one or more programs thereon, where, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method of any one of the foregoing methods for generating information.
  • an embodiment of the present application provides a computer-readable medium having stored thereon a computer program that, when executed by a processor, implements the method of any one of the foregoing methods for generating information.
  • The method and device for generating information provided by the embodiments of the present application acquire an image to be identified, extract a face image from the image to be identified, and input the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image; a candidate recognition model matching the obtained recognition result is then selected from the candidate recognition model set as the second recognition model.
  • Because pre-trained candidate recognition models for identifying different categories of faces can be used to recognize face images and generate key point information, face images of different categories can be recognized, improving the comprehensiveness of information generation; moreover, performing recognition with a candidate recognition model that matches the category corresponding to the face image can improve the accuracy of the generated information.
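The overall flow described above can be sketched as follows. Every function, model name, and return value here is a toy stand-in assumed for illustration; the patent does not specify these interfaces.

```python
# Toy stand-ins for the three models in the described pipeline;
# all names and signatures are illustrative assumptions.

def third_model(image):
    """Stand-in face detector: returns position info as an (x, y, w, h) box."""
    return (0, 0, 2, 2)

def first_model(face):
    """Stand-in classifier: returns a recognition result (face category)."""
    return "cat"

# Candidate recognition model set: one key point model per face category.
CANDIDATES = {
    "cat": lambda face: [(0, 1), (1, 1)],
    "dog": lambda face: [(0, 0), (1, 0)],
}

def generate_information(image):
    x, y, w, h = third_model(image)                  # locate the face image
    face = [row[x:x + w] for row in image[y:y + h]]  # extract the face image
    result = first_model(face)                       # recognition result
    second_model = CANDIDATES[result]                # select the matching model
    return second_model(face)                        # key point information

image = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]            # dummy 3x3 "image"
print(generate_information(image))                   # [(0, 1), (1, 1)]
```

The point of the structure is that only the final step changes per face category: the detector and classifier are shared, while the key point model is looked up from the candidate set.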
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present application can be applied;
  • FIG. 2 is a flowchart of an embodiment of a method for generating information according to the present application
  • FIG. 3 is a schematic diagram of an application scenario of a method for generating information according to an embodiment of the present application
  • FIG. 4 is a flowchart of still another embodiment of a method for generating information according to the present application.
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating information according to the present application.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
  • FIG. 1 illustrates an exemplary system architecture 100 of an embodiment of a method for generating information or an apparatus for generating information to which the present application can be applied.
  • the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105.
  • the network 104 is a medium for providing a communication link between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various types of connections, such as wired, wireless communication links, or fiber optic cables, and so on.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like.
  • Various communication client applications can be installed on the terminal devices 101, 102, and 103, such as image processing applications, photo beautification software, web browser applications, search applications, and social platform software.
  • the terminal devices 101, 102, and 103 may be hardware or software.
  • When the terminal devices 101, 102, and 103 are hardware, they can be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and so on.
  • When the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They can be implemented as multiple pieces of software or software modules (for example, software modules used to provide distributed services) or as a single piece of software or software module. This is not specifically limited here.
  • the server 105 may be a server that provides various services, for example, an image processing server that processes an image to be identified sent by the terminal devices 101, 102, and 103.
  • the image processing server may perform analysis and processing on the received data such as the image to be identified, and obtain a processing result (for example, key point information).
  • the server may be hardware or software.
  • the server can be implemented as a distributed server cluster consisting of multiple servers or as a single server.
  • the server can be implemented as multiple software or software modules (for example, multiple software or software modules used to provide distributed services), or it can be implemented as a single software or software module. It is not specifically limited here.
  • the numbers of terminal devices, networks, and servers in FIG. 1 are merely exemplary. According to implementation needs, there can be any number of terminal devices, networks, and servers.
  • the above-mentioned system architecture may not include a network, but only a terminal device or a server.
  • a flowchart 200 of one embodiment of a method for generating information according to the present application is shown.
  • the method for generating information includes the following steps:
  • Step 201: Obtain an image to be identified.
  • In this embodiment, the execution subject of the method for generating information (for example, the server shown in FIG. 1) may obtain an image to be identified.
  • the image to be identified may include a face image.
  • the face image included in the image to be identified may include an animal face image, and may also include a human face image.
  • the animal face corresponding to the animal face image may be various types of animal faces, such as a dog face, a cat face, and the like.
  • The execution subject may obtain an image to be identified stored locally in advance, or may obtain an image to be identified sent by an electronic device (such as a terminal device shown in FIG. 1) communicatively connected to it.
  • the execution subject may select an image to be identified from an image sequence corresponding to a target video, where the target video is a video that can be obtained by shooting a face. Specifically, the execution subject may first obtain an image sequence corresponding to the target video from a local or electronic device connected thereto, and then select an image from the image sequence as an image to be identified.
  • the above-mentioned executing subject may select the image to be identified from the above-mentioned image sequence in various ways, for example, a random selection manner may be adopted; or, the first-ranked image may be selected from the image sequence.
  • a video is essentially an image sequence arranged in chronological order, so any video can correspond to an image sequence.
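Given that correspondence between a video and its image sequence, the selection step can be sketched as below; the two strategies mirror the example selection manners mentioned above (random selection, or taking the first-ranked image), and the frame values are stand-ins.

```python
import random

def select_image_to_identify(image_sequence, strategy="first"):
    """Pick one frame from the target video's image sequence.

    Mirrors the two example selection manners in the text:
    random selection, or selecting the first-ranked image."""
    if not image_sequence:
        raise ValueError("empty image sequence")
    if strategy == "random":
        return random.choice(image_sequence)
    return image_sequence[0]

frames = ["frame0", "frame1", "frame2"]  # stand-ins for decoded frames
print(select_image_to_identify(frames))                      # frame0
print(select_image_to_identify(frames, "random") in frames)  # True
```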
  • Step 202: A face image is extracted from the image to be recognized, and the extracted face image is input into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image.
  • the execution subject may first extract a face image from the to-be-recognized image, and then input the extracted face image into a pre-trained first recognition model to obtain the extracted The recognition result corresponding to the face image.
  • the recognition result may include, but is not limited to, at least one of the following: text, numbers, symbols, images, and audio.
  • the recognition result may be used to characterize the category of the face corresponding to the face image.
  • the execution subject may extract a face image from the image to be identified in various ways.
  • a threshold segmentation method in an image segmentation technique may be used to segment a face image in an image to be identified from other image regions, and then extract a face image from the image to be identified.
  • the image segmentation technology is a well-known technology that is widely studied and applied at present, and will not be repeated here.
  • the above-mentioned execution subject may also extract a face image from the image to be identified by the following steps:
  • Step 2021: Input the image to be recognized into a pre-trained third recognition model to obtain position information for characterizing the position, within the image to be recognized, of the face image in the image to be recognized.
  • the location information may include, but is not limited to, at least one of the following: text, numbers, symbols, and images.
  • the position information may be a quadrilateral image in which a face image is frame-selected in the image to be identified.
  • the third recognition model may be used to characterize the correspondence between the image to be identified including the face image and the position information used to characterize the position of the face image in the image to be identified.
  • the third recognition model may be a model obtained by training an initial model (such as a Convolutional Neural Network (CNN), a residual network (ResNet), etc.) based on training samples and using a machine learning method.
  • The third recognition model can be obtained by training as follows: first, a training sample set is obtained, where a training sample may include a sample image to be recognized that includes a sample face image, and sample position information pre-labeled for the sample face image in that sample image, where the sample position information can be used to characterize the position, within the sample image to be recognized, of the sample face image.
  • Then, training samples can be selected from the training sample set and the following training steps performed: input the sample image to be recognized of the selected training sample into the initial model to obtain position information corresponding to the sample face image; take the sample position information corresponding to the input sample image as the expected output of the initial model and, based on the obtained position information and the sample position information, adjust the parameters of the initial model; determine whether there are unselected training samples in the training sample set; and, in response to there being no unselected training samples, determine the adjusted initial model as the third recognition model.
  • Optionally, the training may further include the following step: in response to determining that there are unselected training samples, reselect a training sample from the unselected training samples, use the most recently adjusted initial model as the new initial model, and continue to perform the training steps described above.
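The loop described in the training steps above can be sketched with a deliberately tiny stand-in model: a single offset parameter plays the role of the initial model's parameters, and it is adjusted from the gap between the predicted position and the labeled sample position. The model form, learning rate, and data are all illustrative assumptions, not the patent's training procedure.

```python
def train_position_model(samples, lr=0.1, epochs=50):
    """Toy stand-in for the described training loop: fit a single offset
    so that (feature + offset) approximates the labeled position."""
    offset = 0.0  # the "initial model" parameter
    for _ in range(epochs):
        for feature, labeled_pos in samples:   # select each training sample
            predicted = feature + offset       # model output (position info)
            error = predicted - labeled_pos    # compare with expected output
            offset -= lr * error               # adjust the model parameter
    return offset                              # the trained "model"

# Each sample: (image summary feature, pre-labeled face position);
# here every label is feature + 2, so the offset should converge to 2.
samples = [(1.0, 3.0), (2.0, 4.0), (5.0, 7.0)]
offset = train_position_model(samples)
print(round(offset, 2))  # 2.0
```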
  • the execution subject of the steps used to generate the model may be the same as or different from the execution subject of the method used to generate the information. If they are the same, the execution subject of the step for generating the model can store the trained model locally after the model is trained. If they are different, the execution body of the step for generating the model may send the trained model to the execution body of the method for generating information after training to obtain the model.
  • Step 2022: A face image is extracted from the image to be identified based on the obtained position information.
  • the above-mentioned executing subject may extract a face image from the image to be identified in various ways.
  • a to-be-recognized image may be cropped based on the obtained position information to obtain a face image.
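The cropping step can be illustrated as follows, assuming (as a simplification) that the position information is an (x, y, w, h) bounding box and that the image is represented as nested lists of pixels.

```python
def crop_face(image, position):
    """Crop the face region from the image to be identified, using
    position information assumed here to be an (x, y, w, h) box.
    `image` is a list of pixel rows, standing in for an array."""
    x, y, w, h = position
    return [row[x:x + w] for row in image[y:y + h]]

# A 6x6 "image" whose pixels record their own (row, col) coordinates.
image = [[(r, c) for c in range(6)] for r in range(6)]
face = crop_face(image, (1, 2, 3, 2))  # box at x=1, y=2, width 3, height 2
print(len(face), len(face[0]))         # 2 3
```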
  • the execution subject may generate a recognition result corresponding to the face image.
  • the image to be recognized may include at least one face image.
  • For a face image extracted from the image to be recognized, the execution subject may input the face image into the pre-trained first recognition model to obtain the recognition result corresponding to the face image.
  • the first recognition model may be used to characterize a correspondence between a face image and a recognition result corresponding to the face image.
  • the first recognition model may be a model obtained by training an initial model (such as a convolutional neural network, a residual network, and the like) based on training samples and using a machine learning method.
  • the execution subject used to obtain the first recognition model may be trained to obtain the first recognition model by using a training method similar to the training method of the third recognition model, and specific training steps are not described herein again.
  • the training samples in the corresponding training sample set may include sample face images and sample recognition results pre-labeled for the sample face images, where the sample recognition results may be used for characterization The category of the face corresponding to the sample face image.
  • Step 203: Select a candidate recognition model that matches the obtained recognition result from the candidate recognition model set as the second recognition model.
  • the execution entity may select a candidate recognition model that matches the obtained recognition result from the candidate recognition model set as the second recognition model.
  • the candidate recognition models in the candidate recognition model set may be pre-trained models for identifying different classes of faces to generate key point information.
  • The execution body may use various methods to select, from the candidate recognition model set, a candidate recognition model that matches the obtained recognition result as the second recognition model. For example, a technician may preset in the execution body a correspondence relationship (for example, a correspondence table) between recognition results and the candidate recognition models in the candidate recognition model set; the execution body may then look up the correspondence relationship using the obtained recognition result to determine the candidate recognition model that matches the obtained recognition result as the second recognition model.
  • a technician can preset category information corresponding to the candidate recognition model, where the category information can be used to characterize the types of faces that the candidate recognition model can recognize.
  • the category information may include but is not limited to at least one of the following: numbers, text, symbols, pictures.
  • The above-mentioned execution body may match the obtained recognition result with the category information corresponding to the candidate recognition models in the candidate recognition model set (for example, by performing a similarity calculation) to determine a candidate recognition model that matches the obtained recognition result (for example, a candidate recognition model whose similarity calculation result is greater than or equal to a preset threshold) as the second recognition model.
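Both selection manners just described, the preset correspondence table and similarity matching against category information with a preset threshold, can be sketched as below. The category names, model names, and the use of a string-similarity measure are illustrative assumptions.

```python
from difflib import SequenceMatcher

# (a) Preset correspondence table: recognition result -> candidate model.
CORRESPONDENCE = {"cat face": "cat_model", "dog face": "dog_model"}

def select_by_table(recognition_result):
    """Look up the matching candidate model; None if no entry matches."""
    return CORRESPONDENCE.get(recognition_result)

# (b) Similarity matching against each candidate's category information,
# keeping the best candidate whose score clears a preset threshold.
CATEGORY_INFO = {"cat_model": "cat face", "dog_model": "dog face"}

def select_by_similarity(recognition_result, threshold=0.8):
    best_model, best_score = None, 0.0
    for model, info in CATEGORY_INFO.items():
        score = SequenceMatcher(None, recognition_result, info).ratio()
        if score >= threshold and score > best_score:
            best_model, best_score = model, score
    return best_model

print(select_by_table("dog face"))       # dog_model
print(select_by_similarity("cat face"))  # cat_model
```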
  • the candidate recognition model may be used to characterize a correspondence between a face image and keypoint information corresponding to the face image.
  • the keypoint information can be used to characterize the position of keypoints in the face image in the face image.
  • the key points of the face image may be points with obvious semantic discrimination, such as points used to characterize the nose, points used to characterize the eyes, and so on.
  • the candidate recognition model may be a model obtained by training an initial model (such as a convolutional neural network, a residual network, etc.) based on training samples and using a machine learning method.
  • the execution subject used for training to obtain the candidate recognition model may be trained to obtain the candidate recognition model by using a training method similar to the training method of the third recognition model, and specific training steps are not described herein again.
  • For a candidate recognition model, the training samples in the corresponding training sample set may include sample face images and sample key point information pre-labeled for the sample face images, where the categories of faces corresponding to the sample face images may be the same (for example, all cat faces or all dog faces).
  • the sample keypoint information can be used to characterize the position of keypoints in the sample face image in the sample face image.
  • the execution subject may extract at least one face image, and for each face image in the extracted face image, the execution subject may obtain a recognition result, and further, for the obtained For each recognition result in the recognition result, based on step 203, the execution subject may select a candidate recognition model as the second recognition model corresponding to the recognition result.
  • Step 204: Input the extracted face image into the selected second recognition model to obtain key point information corresponding to the extracted face image.
  • the execution subject may input the extracted face image into the selected second recognition model to obtain key point information corresponding to the extracted face image. It should be noted that, for the face image in the extracted face image, the execution subject may input the face image into a second recognition model corresponding to the face image to obtain key point information corresponding to the face image. It can be understood that the correspondence between the face image and the second recognition model can be determined by the correspondence between the recognition result corresponding to the face image and the second recognition model.
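The per-face correspondence just described (each extracted face image is routed to the second recognition model selected for its own recognition result) can be sketched as follows; the classifier, the models, and the face values are stand-in callables assumed for illustration.

```python
def keypoints_for_faces(face_images, classify, candidate_models):
    """For each extracted face image, use its recognition result to pick
    its own second recognition model, then run that model on the face."""
    results = []
    for face in face_images:
        category = classify(face)            # first recognition model
        model = candidate_models[category]   # matching second model
        results.append(model(face))          # key point information
    return results

models = {"cat": lambda f: "cat-keypoints", "dog": lambda f: "dog-keypoints"}
classify = lambda f: "cat" if f == "face-a" else "dog"
faces = ["face-a", "face-b"]
print(keypoints_for_faces(faces, classify, models))
```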
  • In some optional implementations of this embodiment, the above-mentioned executing subject may also determine, by backward inference, the key point information corresponding to the image to be identified, where the key point information corresponding to the image to be identified may be used to characterize the positions, within the image to be identified, of key points in the image to be identified (that is, the key points included in the face image within the image to be identified).
  • In some embodiments, the candidate recognition model may be used to characterize the correspondence between a face image and the key point information and matching information corresponding to the face image, where the matching information may include, but is not limited to, at least one of the following: numbers, text, symbols, images, and audio.
  • the matching information may include a matching index used to characterize the degree of matching between the category of the face corresponding to the input face image and the category of the face corresponding to the second recognition model.
  • the size of the matching index and the level of matching may have a corresponding relationship.
  • the corresponding relationship may be that the larger the matching index, the higher the matching degree; or the smaller the matching index, the higher the matching degree.
  • For such a candidate recognition model, the training samples in the corresponding training sample set may include a sample face image, together with sample key point information and sample matching information pre-labeled for the sample face image, where the sample key point information can be used to characterize the positions, within the sample face image, of key points in the sample face image.
  • the sample matching information may include a sample matching index.
  • the sample matching index can be used to characterize the degree of sample matching between the category of the face corresponding to the input sample face image and the category of the face predetermined for the candidate recognition model.
  • the correspondence between the sample matching index and the degree of sample matching can be set in advance by a technician. For example, it can be set that the larger the sample matching index, the higher the degree of sample matching.
  • In this implementation, the execution subject may input the extracted face image into the second recognition model to obtain key point information and matching information corresponding to the extracted face image. Furthermore, through the above-mentioned second recognition model, the degree of matching between the input face image and the second recognition model can be determined and matching information generated, so that subsequent operations based on the matching information (such as re-selecting the second recognition model) can further improve the accuracy of information processing.
  • In some embodiments, when the image to be identified is an image selected from the image sequence corresponding to the target video, after the extracted face image is input into the selected second recognition model to obtain the corresponding key point information and matching information, the execution subject may further perform the following information generation steps:
  • an image located after the image to be identified and adjacent to the image to be identified is selected from the image sequence as a candidate to-be-recognized image.
  • a face image is extracted from the candidate to-be-recognized image as a candidate face image, and the face image extracted from the to-be-recognized image is determined as a reference face image.
  • the execution subject may use the above-mentioned method for extracting a face image for an image to be identified to extract a face image from a candidate image to be identified as a candidate face image, and details are not described herein again.
  • the preset condition may be used to limit the degree of matching between the category of the face corresponding to the reference face image and the category of the face corresponding to the second recognition model into which the reference face image was input.
  • a technician can set a matching threshold in advance.
  • when the correspondence between the matching index and the degree of matching is that the larger the matching index, the higher the degree of matching, the preset condition may be that the matching index is greater than or equal to the matching threshold; when the correspondence is that the smaller the matching index, the higher the degree of matching, the preset condition may be that the matching index is less than or equal to the matching threshold.
  • the extracted candidate face image is input into the second recognition model into which the determined reference face image was input, to obtain the keypoint information and matching information corresponding to the extracted candidate face image.
  • for an image in the sequence, keypoint information can be generated by using the second recognition model and the matching information corresponding to the preceding image, following the method described in this implementation manner.
  • the specific steps can be referred to the above information generation steps, which will not be repeated here.
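  The frame-to-frame reuse logic above — keep using the current second recognition model for the next frame while its matching index still satisfies the preset condition, otherwise re-select — can be sketched as follows. This is a hypothetical illustration, not the patent's implementation: `process_sequence`, `MATCH_THRESHOLD`, and the toy classifier/model stand-ins are all assumed names, and the assumed correspondence is "larger index means better match".

```python
# Hedged sketch of frame-to-frame model reuse. All names are illustrative
# assumptions; the "models" below are toy stand-ins, not trained networks.

MATCH_THRESHOLD = 0.5  # assumed: larger matching index means a better match


def process_sequence(frames, classify, select_model):
    """Reuse the previous second recognition model while its matching index
    meets the preset condition; otherwise re-select for the next frame."""
    results = []
    model = None
    for face in frames:
        if model is None:
            # First frame, or the previous matching index failed the condition:
            # classify with the first model and pick a matching second model.
            model = select_model(classify(face))
        keypoints, match_index = model(face)
        results.append((keypoints, match_index))
        if match_index < MATCH_THRESHOLD:
            model = None  # force re-selection on the next frame
    return results


def classify(face):
    # Stand-in for the first recognition model (category classifier).
    return face["category"]


def select_model(category):
    # Stand-in for selecting a second recognition model from the candidate set.
    def model(face):
        match_index = 1.0 if face["category"] == category else 0.0
        return [(0, 0)], match_index  # dummy keypoints plus matching index
    return model


frames = [{"category": "cat"}, {"category": "cat"},
          {"category": "dog"}, {"category": "dog"}]
out = process_sequence(frames, classify, select_model)
```

  Here the third frame is still fed to the "cat" model selected earlier; its low matching index then triggers re-selection, so the fourth frame is handled by a "dog" model, matching the behavior described for candidate to-be-recognized images.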
  • FIG. 3 is a schematic diagram of an application scenario of the method for generating information according to this embodiment.
  • the server 301 first obtains an image to be identified 303 sent by the terminal device 302, where the image to be identified 303 includes face images 3031 and 3032. Then, the server 301 extracts the face image 3031 and the face image 3032 from the to-be-recognized image 303, inputs the face image 3031 and the face image 3032 into the pre-trained first recognition model 304 respectively, and obtains the recognition result "cat" 3051 corresponding to the face image 3031 and the recognition result "dog" 3052 corresponding to the face image 3032.
  • the server 301 may obtain a candidate recognition model set 306, where the candidate recognition model set 306 includes candidate recognition models 3061, 3062, and 3063.
  • the technician presets the correspondence between the candidate recognition models and the recognition results as follows: the candidate recognition model 3061 corresponds to the recognition result "cat"; the candidate recognition model 3062 corresponds to the recognition result "dog"; the candidate recognition model 3063 corresponds to the recognition result "person".
  • the server 301 may select, from the candidate recognition model set 306, the candidate recognition model 3061 that matches the obtained recognition result "cat" 3051 as the second recognition model 3071 corresponding to the face image 3031, and select, from the candidate recognition model set 306, the candidate recognition model 3062 that matches the obtained recognition result "dog" 3052 as the second recognition model 3072 corresponding to the face image 3032.
  • the server 301 may input the face image 3031 into the second recognition model 3071 to obtain the keypoint information 3081 corresponding to the face image 3031, and input the face image 3032 into the second recognition model 3072 to obtain the keypoint information 3082 corresponding to the face image 3032.
  • the key point information can be used to characterize the position of key points in the face image in the face image.
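  The scenario of FIG. 3 — classify each face with the first recognition model, select the matching candidate model as the second recognition model, then run it to obtain keypoint information — can be condensed into a toy sketch. Everything here is an assumed stand-in (dictionary lookup instead of trained networks, fabricated keypoint coordinates), meant only to show the control flow:

```python
# Toy stand-ins for the candidate recognition model set of FIG. 3.
# Coordinates and model internals are fabricated for illustration.
candidate_models = {
    "cat": lambda img: [(10, 12), (30, 12)],
    "dog": lambda img: [(8, 15), (28, 15)],
    "person": lambda img: [(12, 20), (32, 20)],
}


def first_recognition_model(face_image):
    # Stand-in classifier; a real first recognition model would be trained.
    return face_image["label"]


def generate_keypoints(face_image):
    result = first_recognition_model(face_image)   # e.g. "cat"
    second_model = candidate_models[result]        # model matching the result
    return second_model(face_image)                # keypoint positions


kp = generate_keypoints({"label": "cat"})
```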
  • the method provided by the foregoing embodiments of the present application acquires an image to be identified, extracts a face image from the image to be identified, and inputs the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image; then selects, from the candidate recognition model set, a candidate recognition model that matches the obtained recognition result as a second recognition model; and finally inputs the extracted face image into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image, where the keypoint information is used to characterize the position of the keypoints in the face image.
  • in this way, face images can be recognized using pre-trained candidate recognition models for identifying different categories of faces to generate keypoint information, so that different categories of face images can be identified, which improves the comprehensiveness of information generation; and because the candidate recognition model matching the category corresponding to the face image is used for recognition, the accuracy of information generation can be improved.
  • a flowchart 400 of yet another embodiment of a method for generating information is shown.
  • the process 400 of the method for generating information includes the following steps:
  • Step 401 Obtain an image to be identified.
  • an execution subject of the method for generating information (for example, the server shown in FIG. 1) may obtain the image to be identified.
  • the image to be identified may include a face image.
  • the face image included in the image to be identified may include an animal face image, and may also include a human face image.
  • the animal face corresponding to the animal face image may be various types of animal faces, such as a dog face, a cat face, and the like.
  • step 401 may be implemented in a manner similar to step 201 in the foregoing embodiment.
  • the relevant description of step 201 is also applicable to step 401 of this embodiment, and details are not described herein again.
  • Step 402 A face image is extracted from the image to be recognized, and the extracted face image is input into a pre-trained first recognition model to obtain a recognition result and reference keypoint information corresponding to the extracted face image.
  • the execution subject may first extract a face image from the to-be-recognized image, and then input the extracted face image into a pre-trained first recognition model to obtain the recognition result and reference keypoint information corresponding to the extracted face image.
  • the recognition result may include, but is not limited to, at least one of the following: text, numbers, symbols, images, and audio.
  • the recognition result can be used to characterize the category of the face corresponding to the face image.
  • the reference keypoint information may likewise include, but is not limited to, at least one of the following: text, numbers, symbols, and images.
  • the reference keypoint information can be used to characterize the position of the reference keypoints in the face image.
  • the reference key point may be a point used to determine a key point in the face image, for example, a point where the nose tip of the nose is located, a point where the corner of the mouth is located, and the like.
  • the above-mentioned execution subject may use the face image extraction method in the embodiment corresponding to FIG. 2 to extract the face image from the image to be identified, and details are not described herein again.
  • the first recognition model in this embodiment may be used to characterize the correspondence between the recognition result corresponding to the face image and the face image and the reference keypoint information.
  • the training samples in the corresponding training sample set may include sample face images, sample recognition results and sample reference keypoint information pre-labeled for the sample face images.
  • the reference keypoint information of the sample can be used to characterize the position of the reference keypoint in the sample face image in the sample face image.
  • Step 403 Select a candidate recognition model that matches the obtained recognition result from the candidate recognition model set as the second recognition model.
  • the execution entity may select a candidate recognition model that matches the obtained recognition result from the candidate recognition model set as the second recognition model.
  • the candidate recognition model in the candidate recognition model set may be a pre-trained model for recognizing faces of different classes to generate key point information.
  • the candidate recognition model can be used to characterize the correspondence between the face image and the reference keypoint information of the face image and the keypoint information of the face image.
  • the candidate recognition model may be a model obtained by training an initial model (such as a convolutional neural network, a residual network, etc.) based on training samples and using a machine learning method.
  • the candidate recognition model can be obtained by training as follows. First, a training sample set is obtained, where a training sample may include a sample face image and sample keypoint information pre-labeled for the sample face image. Then, training samples can be selected from the training sample set, and the following model training steps are performed: extracting the sample reference keypoint information corresponding to the sample face image of the selected training sample, where the sample reference keypoint information can be used to characterize the position of the reference keypoints in the sample face image; inputting the sample face image of the selected training sample and the extracted sample reference keypoint information into the initial model to obtain keypoint information; taking the sample keypoint information corresponding to the input sample face image as the expected output of the initial model, and adjusting the parameters of the initial model based on the obtained keypoint information and the sample keypoint information; determining whether there are unselected training samples in the training sample set; and, in response to determining that there are no unselected training samples, determining the adjusted initial model as the candidate recognition model.
  • the execution subject used to perform the above model training step can extract sample reference keypoint information corresponding to the sample face image in various ways.
  • for example, the sample face image can be input into the first recognition model in step 402 of this embodiment to obtain the sample reference keypoint information corresponding to the sample face image; alternatively, the sample face image can be output, and sample reference keypoint information labeled by a user for the sample face image can be received.
  • the method may further include the following steps: in response to determining that there are unselected training samples, reselecting training samples from the unselected training samples, and using the most recently adjusted initial model as the new initial model, and continuing Perform the model training steps described above.
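  The training procedure described above (select samples, extract sample reference keypoint information, run the initial model, adjust parameters toward the labeled sample keypoint information, repeat until no samples remain unselected) might be outlined as below. The single-parameter "model" and the update rule are deliberately simplified assumptions standing in for a neural network and its optimizer:

```python
# Hedged sketch of the candidate-model training loop. The one-parameter
# "model" and the update rule are toy assumptions, not the patent's method.

def make_model(offset):
    # The "initial model" adds a learned offset to the reference keypoints.
    def model(image, reference):
        return [(x + offset, y + offset) for (x, y) in reference]
    model.offset = offset
    return model


def extract_reference(image):
    # Stand-in for obtaining sample reference keypoint information
    # (e.g. from the first recognition model, or labeled by a user).
    return image["reference"]


def adjust(model, predicted, expected, lr=0.5):
    # Move the single parameter toward reducing the average x error.
    err = sum(ex - px for (px, _), (ex, _) in zip(predicted, expected)) / len(predicted)
    return make_model(model.offset + lr * err)


def train_candidate_model(samples, initial_model):
    model = initial_model
    for image, sample_keypoints in samples:           # until none are unselected
        reference = extract_reference(image)          # sample reference keypoints
        predicted = model(image, reference)           # forward pass
        model = adjust(model, predicted, sample_keypoints)  # parameter update
    return model  # the adjusted model becomes the candidate recognition model


samples = [({"reference": [(0, 0), (2, 2)]}, [(1, 1), (3, 3)])] * 10
trained = train_candidate_model(samples, make_model(0.0))
```

  With these toy labels (every sample keypoint is the reference point shifted by one), the learned offset converges toward 1.0 over the ten samples.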
  • the selection of the second recognition model in this embodiment may be implemented in a manner similar to the method for selecting the second recognition model in the embodiment corresponding to FIG. 2, and is not described herein again.
  • Step 404 Input the extracted face image and the obtained reference keypoint information into the selected second recognition model to obtain keypoint information corresponding to the extracted face image.
  • the execution subject may input the extracted face image and the obtained reference keypoint information into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image.
  • it may be noted that, for each face image among the extracted face images, the execution subject may input the face image and the reference keypoint information corresponding to the face image into the second recognition model corresponding to that face image, to obtain the keypoint information corresponding to the face image. It can be understood that the correspondence between the face image and the second recognition model can be determined by the correspondence between the recognition result corresponding to the face image and the second recognition model.
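  Step 404's two-stage inference — the first recognition model supplies a recognition result plus reference keypoint information, and the second recognition model refines those reference points into final keypoint information — might look like this toy sketch. The functions and coordinates are assumptions for illustration only:

```python
def first_model(face_image):
    # Stand-in: returns a recognition result plus coarse reference keypoints
    # (e.g. nose tip, mouth corners) for the face image.
    return face_image["label"], [(10, 10), (20, 20)]


def second_model(face_image, reference_keypoints):
    # Stand-in refinement: a real second recognition model would be a trained
    # network using the reference points as a prior; here we merely shift them.
    return [(x + 1, y + 1) for (x, y) in reference_keypoints]


face = {"label": "cat"}
result, reference = first_model(face)
keypoints = second_model(face, reference)  # keypoint info for the face image
```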
  • compared with the embodiment corresponding to FIG. 2, the process 400 of the method for generating information in this embodiment highlights the step of generating reference keypoint information corresponding to the face image and the step of generating keypoint information corresponding to the face image based on the generated reference keypoint information. Therefore, the solution described in this embodiment can use the reference keypoint information as a reference to generate more accurate keypoint information, which further improves the accuracy of information generation.
  • this application provides an embodiment of an apparatus for generating information.
  • the apparatus embodiment corresponds to the method embodiment shown in FIG. 2.
  • the device can be specifically applied to various electronic devices.
  • the apparatus 500 for generating information in this embodiment includes an image acquisition unit 501, a first input unit 502, a model selection unit 503, and a second input unit 504.
  • the image acquisition unit 501 is configured to acquire an image to be identified, where the image to be identified includes a face image;
  • the first input unit 502 is configured to extract a face image from the image to be identified and input the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image;
  • the model selection unit 503 is configured to select, from the candidate recognition model set, a candidate recognition model that matches the obtained recognition result as the second recognition model, where the candidate recognition models in the candidate recognition model set are pre-trained models for identifying different categories of faces to generate keypoint information;
  • the second input unit 504 is configured to input the extracted face image into the selected second recognition model to obtain keypoint information corresponding to the extracted face image, where the keypoint information is used to characterize the position of the keypoints in the face image.
  • the image acquisition unit 501 of the apparatus 500 for generating information may acquire an image to be identified in a wired connection manner or a wireless connection manner.
  • the image to be identified may include a face image.
  • the face image included in the image to be identified may include an animal face image, and may also include a human face image.
  • the animal face corresponding to the animal face image may be various types of animal faces, such as a dog face, a cat face, and the like.
  • the first input unit 502 may first extract a face image from the to-be-recognized image, and then input the extracted face image into a pre-trained first recognition model to obtain Recognition result corresponding to the extracted face image.
  • the recognition result may include, but is not limited to, at least one of the following: text, numbers, symbols, images, and audio.
  • the recognition result can be used to characterize the category of the face corresponding to the face image.
  • the image to be recognized may include at least one face image.
  • the first input unit 502 may input the face image into a pre-trained first recognition model to obtain a recognition result corresponding to the face image.
  • the first recognition model may be used to characterize a correspondence between a face image and a recognition result corresponding to the face image.
  • the model selection unit 503 may select a candidate recognition model that matches the obtained recognition result from the candidate recognition model set as the second recognition model.
  • the candidate recognition model in the candidate recognition model set may be a pre-trained model for recognizing faces of different classes to generate key point information.
  • the candidate recognition model may be used to characterize a correspondence between a face image and keypoint information corresponding to the face image.
  • the keypoint information can be used to characterize the position of keypoints in the face image in the face image.
  • the key points of the face image may be points with obvious semantic discrimination, such as points used to characterize the nose, points used to characterize the eyes, and so on.
  • the second input unit 504 may input the extracted face image into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image.
  • it may be noted that, for each face image among the extracted face images, the second input unit 504 may input the face image into the second recognition model corresponding to that face image to obtain the keypoint information corresponding to the face image. It can be understood that the correspondence between the face image and the second recognition model can be determined by the correspondence between the recognition result corresponding to the face image and the second recognition model.
  • the first input unit 502 may be further configured to: input the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image And reference keypoint information, where the reference keypoint information may be used to characterize the position of the reference keypoint in the face image in the face image; and the second input unit 504 may be further configured to: compare the extracted face image with the The obtained reference keypoint information is input into the selected second recognition model, and keypoint information corresponding to the extracted face image is obtained.
  • the first input unit 502 may include a first input module (not shown in the figure) configured to input an image to be identified into a pre-trained third recognition model to obtain Position information for characterizing the position of a face image in the image to be identified in the image to be identified; an image extraction module (not shown in the figure) configured to extract a face image from the image to be identified based on the obtained position information .
  • the second input unit 504 may be further configured to: input the extracted face image into the selected second recognition model to obtain the key points corresponding to the extracted face image Information and matching information, where the matching information may include a matching index used to characterize the degree of matching between the category of the face corresponding to the input face image and the category of the face corresponding to the second recognition model.
  • the image acquisition unit 501 may be further configured to: select an image to be identified from an image sequence corresponding to a target video, where the target video may be a video obtained by photographing a face.
  • the apparatus 500 may further include: an image selecting unit (not shown in the figure), configured to select, from the image sequence, an image located after the image to be identified and adjacent to the image to be identified as a candidate to-be-recognized image; an image determination unit (not shown in the figure), configured to extract a face image from the candidate to-be-recognized image as a candidate face image, and determine the face image extracted from the to-be-recognized image as a reference face image; a condition determination unit (not shown in the figure), configured to determine whether the matching index in the matching information corresponding to the determined reference face image meets a preset condition; and a third input unit (not shown in the figure), configured to, in response to determining that it does, input the extracted candidate face image into the second recognition model into which the determined reference face image was input, to obtain keypoint information and matching information corresponding to the extracted candidate face image.
  • in the apparatus 500 provided by the foregoing embodiment of the present application, the image acquisition unit 501 acquires an image to be identified; the first input unit 502 extracts a face image from the image to be identified and inputs the extracted face image into a pre-trained first recognition model to obtain a recognition result; the model selection unit 503 selects, from the candidate recognition model set, a candidate recognition model that matches the obtained recognition result as the second recognition model; and the second input unit 504 inputs the extracted face image into the selected second recognition model to obtain keypoint information corresponding to the extracted face image, where the keypoint information is used to characterize the position of the keypoints in the face image.
  • in this way, face images can be recognized using pre-trained candidate recognition models for identifying different categories of faces to generate keypoint information, so that different categories of face images can be identified, which improves the comprehensiveness of information generation; and because the candidate recognition model matching the category corresponding to the face image is used for recognition, the accuracy of information generation can be improved.
  • FIG. 6 illustrates a schematic structural diagram of a computer system 600 suitable for implementing an electronic device (such as the terminal device / server shown in FIG. 1) in the embodiment of the present application.
  • the electronic device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
  • the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603.
  • in the RAM 603, various programs and data required for the operation of the system 600 are also stored.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input / output (I / O) interface 605 is also connected to the bus 604.
  • the following components are connected to the I / O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, and the like.
  • the communication section 609 performs communication processing via a network such as the Internet.
  • a drive 610 is also connected to the I / O interface 605 as necessary.
  • a removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for performing a method shown in a flowchart.
  • the computer program may be downloaded and installed from a network through the communication portion 609, and / or installed from a removable medium 611.
  • the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium or any combination of the foregoing.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable medium may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • each block in the flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more functions to implement a specified logical function Executable instructions.
  • the functions noted in the blocks may also occur in a different order than those marked in the drawings. For example, two successively represented boxes may actually be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or can be implemented with a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present application may be implemented by software or hardware.
  • the described unit may also be provided in a processor, for example, it may be described as: a processor includes an image acquisition unit, a first input unit, a model selection unit, and a second input unit.
  • a processor includes an image acquisition unit, a first input unit, a model selection unit, and a second input unit.
  • the names of these units do not constitute a limitation on the unit itself in some cases, for example, the image acquisition unit may also be described as a “unit for acquiring an image to be identified”.
  • the present application also provides a computer-readable medium, which may be included in the electronic device described in the foregoing embodiments; or may exist alone without being assembled into the electronic device in.
  • the computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device: obtains an image to be identified, where the image to be identified includes a face image; extracts a face image from the image to be identified, and inputs the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image; selects, from the candidate recognition model set, a candidate recognition model that matches the obtained recognition result as the second recognition model, where the candidate recognition models in the candidate recognition model set are pre-trained models for identifying different categories of faces to generate keypoint information; and inputs the extracted face image into the selected second recognition model to obtain keypoint information corresponding to the extracted face image, where the keypoint information is used to characterize the position of the keypoints in the face image.

Abstract

Disclosed by embodiments of the present application are a method and device used for generating information. A specific embodiment of the method comprises: acquiring an image to be identified; extracting a facial image from the image to be identified, and inputting the extracted facial image into a pre-trained first identification model to obtain an identification result corresponding to the extracted facial image, wherein the identification result is used for characterizing the face type corresponding to the facial image; choosing a candidate identification model that matches the obtained identification result from a candidate identification model set to serve as a second identification model, wherein the candidate identification models in the candidate identification model set are pre-trained models used for identifying different types of faces so as to generate key point information; and inputting the extracted facial image into the second identification model to obtain the key point information corresponding to the extracted facial image, wherein the key point information is used for characterizing the position in the facial image of a key point in the facial image. This embodiment improves the comprehensiveness and accuracy of information generation.

Description

用于生成信息的方法和装置Method and device for generating information
本专利申请要求于2018年7月27日提交的、申请号为201810846313.7、申请人为北京字节跳动网络技术有限公司、发明名称为“用于生成信息的方法和装置”的中国专利申请的优先权,该申请的全文以引用的方式并入本申请中。This patent application claims the priority of a Chinese patent application filed on July 27, 2018, with application number 201810846313.7, the applicant being Beijing BYTE Network Technology Co., Ltd., and the invention name "Method and Device for Generating Information" The entire application is incorporated herein by reference.
技术领域Technical field
本申请实施例涉及计算机技术领域,尤其涉及用于生成信息的方法和装置。Embodiments of the present application relate to the field of computer technology, and in particular, to a method and an apparatus for generating information.
背景技术Background technique
脸的关键点指的是脸中具有明显语义区分度的点,例如鼻子所对应的点、眼睛所对应的点等。The key points of the face refer to the points in the face that have obvious semantic discrimination, such as the points corresponding to the nose and the points corresponding to the eyes.
目前，对于脸的关键点的检测通常指的是针对人脸关键点的检测。通过对人脸关键点进行检测，可以实现特效添加、人脸三维建模、美颜拍照等功能。At present, detection of face key points usually refers to detection of key points on human faces. By detecting the key points of a human face, functions such as adding special effects, three-dimensional face modeling, and beauty-filter photography can be realized.
发明内容Summary of the Invention
本申请实施例提出了用于生成信息的方法和装置。The embodiments of the present application provide a method and device for generating information.
第一方面，本申请实施例提供了一种用于生成信息的方法，该方法包括：获取待识别图像，其中，待识别图像包括脸图像；从待识别图像中提取脸图像，以及将所提取的脸图像输入预先训练的第一识别模型，获得所提取的脸图像所对应的识别结果，其中，识别结果用于表征脸图像所对应的脸的类别；从候选识别模型集合中选取与所获得的识别结果相匹配的候选识别模型作为第二识别模型，其中，候选识别模型集合中的候选识别模型为预先训练的、用于识别不同类别的脸以生成关键点信息的模型；将所提取的脸图像输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息，其中，关键点信息用于表征脸图像中的关键点在脸图像中的位置。In a first aspect, an embodiment of the present application provides a method for generating information. The method includes: acquiring an image to be recognized, where the image to be recognized includes a face image; extracting the face image from the image to be recognized, and inputting the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image; selecting, from a set of candidate recognition models, a candidate recognition model that matches the obtained recognition result as a second recognition model, where the candidate recognition models in the set are pre-trained models for recognizing faces of different categories to generate key point information; and inputting the extracted face image into the selected second recognition model to obtain the key point information corresponding to the extracted face image, where the key point information is used to characterize the positions of key points within the face image.
在一些实施例中，将所提取的脸图像输入预先训练的第一识别模型，获得所提取的脸图像所对应的识别结果，包括：将所提取的脸图像输入预先训练的第一识别模型，获得所提取的脸图像所对应的识别结果和基准关键点信息，其中，基准关键点信息用于表征脸图像中的基准关键点在脸图像中的位置；以及将所提取的脸图像输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息，包括：将所提取的脸图像和所获得的基准关键点信息输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息。In some embodiments, inputting the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image includes: inputting the extracted face image into the pre-trained first recognition model to obtain the recognition result and reference key point information corresponding to the extracted face image, where the reference key point information is used to characterize the positions of reference key points within the face image; and inputting the extracted face image into the selected second recognition model to obtain the key point information corresponding to the extracted face image includes: inputting the extracted face image and the obtained reference key point information into the selected second recognition model to obtain the key point information corresponding to the extracted face image.
在一些实施例中，从待识别图像中提取脸图像，包括：将待识别图像输入预先训练的第三识别模型，获得用于表征待识别图像中的脸图像在待识别图像中的位置的位置信息；基于所获得的位置信息，从待识别图像中提取脸图像。In some embodiments, extracting the face image from the image to be recognized includes: inputting the image to be recognized into a pre-trained third recognition model to obtain position information used to characterize the position of the face image within the image to be recognized; and extracting the face image from the image to be recognized based on the obtained position information.
在一些实施例中，将所提取的脸图像输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息，包括：将所提取的脸图像输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息和匹配信息，其中，匹配信息包括用于表征所输入的脸图像所对应的脸的类别与第二识别模型所对应的脸的类别的匹配程度的匹配指数。In some embodiments, inputting the extracted face image into the selected second recognition model to obtain the key point information corresponding to the extracted face image includes: inputting the extracted face image into the selected second recognition model to obtain key point information and matching information corresponding to the extracted face image, where the matching information includes a matching index used to characterize the degree of matching between the category of the face corresponding to the input face image and the category of the face corresponding to the second recognition model.
在一些实施例中,获取待识别图像,包括:从目标视频所对应的图像序列中选取图像作为待识别图像,其中,目标视频为对脸进行拍摄所获得的视频。In some embodiments, obtaining the image to be identified includes: selecting an image to be identified from an image sequence corresponding to the target video, where the target video is a video obtained by photographing a face.
在一些实施例中，在将所提取的脸图像输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息和匹配信息之后，该方法还包括：从图像序列中选取位于待识别图像之后且与待识别图像相邻的图像作为候选待识别图像；从候选待识别图像中提取脸图像作为候选脸图像，以及将所提取的、待识别图像中的脸图像确定为基准脸图像；确定所确定的基准脸图像所对应的匹配信息中的匹配指数是否符合预设条件；响应于确定是，将所提取的候选脸图像输入所确定的基准脸图像所输入的第二识别模型，获得所提取的候选脸图像所对应的关键点信息和匹配信息。In some embodiments, after inputting the extracted face image into the selected second recognition model to obtain the key point information and matching information corresponding to the extracted face image, the method further includes: selecting, from the image sequence, an image that follows and is adjacent to the image to be recognized as a candidate image to be recognized; extracting a face image from the candidate image to be recognized as a candidate face image, and determining the extracted face image in the image to be recognized as a reference face image; determining whether the matching index in the matching information corresponding to the determined reference face image meets a preset condition; and in response to determining that it does, inputting the extracted candidate face image into the second recognition model to which the determined reference face image was input, to obtain key point information and matching information corresponding to the extracted candidate face image.
第二方面，本申请实施例提供了一种用于生成信息的装置，该装置包括：图像获取单元，被配置成获取待识别图像，其中，待识别图像包括脸图像；第一输入单元，被配置成从待识别图像中提取脸图像，以及将所提取的脸图像输入预先训练的第一识别模型，获得所提取的脸图像所对应的识别结果，其中，识别结果用于表征脸图像所对应的脸的类别；模型选取单元，被配置成从候选识别模型集合中选取与所获得的识别结果相匹配的候选识别模型作为第二识别模型，其中，候选识别模型集合中的候选识别模型为预先训练的、用于识别不同类别的脸以生成关键点信息的模型；第二输入单元，被配置成将所提取的脸图像输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息，其中，关键点信息用于表征脸图像中的关键点在脸图像中的位置。In a second aspect, an embodiment of the present application provides an apparatus for generating information. The apparatus includes: an image acquisition unit configured to acquire an image to be recognized, where the image to be recognized includes a face image; a first input unit configured to extract the face image from the image to be recognized, and input the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image; a model selection unit configured to select, from a set of candidate recognition models, a candidate recognition model matching the obtained recognition result as a second recognition model, where the candidate recognition models in the set are pre-trained models for recognizing faces of different categories to generate key point information; and a second input unit configured to input the extracted face image into the selected second recognition model to obtain key point information corresponding to the extracted face image, where the key point information is used to characterize the positions of key points within the face image.
在一些实施例中，第一输入单元进一步被配置成：将所提取的脸图像输入预先训练的第一识别模型，获得所提取的脸图像所对应的识别结果和基准关键点信息，其中，基准关键点信息用于表征脸图像中的基准关键点在脸图像中的位置；以及第二输入单元进一步被配置成：将所提取的脸图像和所获得的基准关键点信息输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息。In some embodiments, the first input unit is further configured to: input the extracted face image into the pre-trained first recognition model to obtain the recognition result and reference key point information corresponding to the extracted face image, where the reference key point information is used to characterize the positions of reference key points within the face image; and the second input unit is further configured to: input the extracted face image and the obtained reference key point information into the selected second recognition model to obtain the key point information corresponding to the extracted face image.
在一些实施例中，第一输入单元包括：第一输入模块，被配置成将待识别图像输入预先训练的第三识别模型，获得用于表征待识别图像中的脸图像在待识别图像中的位置的位置信息；图像提取模块，被配置成基于所获得的位置信息，从待识别图像中提取脸图像。In some embodiments, the first input unit includes: a first input module configured to input the image to be recognized into a pre-trained third recognition model to obtain position information used to characterize the position of the face image within the image to be recognized; and an image extraction module configured to extract the face image from the image to be recognized based on the obtained position information.
在一些实施例中，第二输入单元进一步被配置成：将所提取的脸图像输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息和匹配信息，其中，匹配信息包括用于表征所输入的脸图像所对应的脸的类别与第二识别模型所对应的脸的类别的匹配程度的匹配指数。In some embodiments, the second input unit is further configured to input the extracted face image into the selected second recognition model to obtain key point information and matching information corresponding to the extracted face image, where the matching information includes a matching index used to characterize the degree of matching between the category of the face corresponding to the input face image and the category of the face corresponding to the second recognition model.
在一些实施例中,图像获取单元进一步被配置成:从目标视频所 对应的图像序列中选取图像作为待识别图像,其中,目标视频为对脸进行拍摄所获得的视频。In some embodiments, the image acquisition unit is further configured to select an image to be identified from an image sequence corresponding to the target video, where the target video is a video obtained by photographing a face.
在一些实施例中，该装置还包括：图像选取单元，被配置成从图像序列中选取位于待识别图像之后且与待识别图像相邻的图像作为候选待识别图像；图像确定单元，被配置成从候选待识别图像中提取脸图像作为候选脸图像，以及将所提取的、待识别图像中的脸图像确定为基准脸图像；条件确定单元，被配置成确定所确定的基准脸图像所对应的匹配信息中的匹配指数是否符合预设条件；第三输入单元，被配置成响应于确定是，将所提取的候选脸图像输入所确定的基准脸图像所输入的第二识别模型，获得所提取的候选脸图像所对应的关键点信息和匹配信息。In some embodiments, the apparatus further includes: an image selection unit configured to select, from the image sequence, an image that follows and is adjacent to the image to be recognized as a candidate image to be recognized; an image determination unit configured to extract a face image from the candidate image to be recognized as a candidate face image, and determine the extracted face image in the image to be recognized as a reference face image; a condition determination unit configured to determine whether the matching index in the matching information corresponding to the determined reference face image meets a preset condition; and a third input unit configured to, in response to determining that it does, input the extracted candidate face image into the second recognition model to which the determined reference face image was input, to obtain key point information and matching information corresponding to the extracted candidate face image.
第三方面，本申请实施例提供了一种电子设备，包括：一个或多个处理器；存储装置，其上存储有一个或多个程序，当一个或多个程序被一个或多个处理器执行，使得一个或多个处理器实现上述用于生成信息的方法中任一实施例的方法。In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage device storing one or more programs thereon, which, when executed by the one or more processors, cause the one or more processors to implement the method of any of the above embodiments of the method for generating information.
第四方面,本申请实施例提供了一种计算机可读介质,其上存储有计算机程序,该程序被处理器执行时实现上述用于生成信息的方法中任一实施例的方法。In a fourth aspect, an embodiment of the present application provides a computer-readable medium having stored thereon a computer program that, when executed by a processor, implements the method of any one of the foregoing methods for generating information.
本申请实施例提供的用于生成信息的方法和装置，通过获取待识别图像，然后从待识别图像中提取脸图像，以及将所提取的脸图像输入预先训练的第一识别模型，获得所提取的脸图像所对应的识别结果，其中，识别结果用于表征脸图像所对应的脸的类别，接着从候选识别模型集合中选取与所获得的识别结果相匹配的候选识别模型作为第二识别模型，最后将所提取的脸图像输入所选取的第二识别模型，获得所提取的脸图像所对应的关键点信息，其中，关键点信息用于表征脸图像中的关键点在脸图像中的位置，从而可以利用预先训练的、用于识别不同类别的脸的候选识别模型对脸图像进行识别，以生成关键点信息，进而可以识别不同类别的脸图像，提高了信息生成的全面性，并且，利用与脸图像所对应的类别相匹配的候选识别模型进行识别，可以提供信息生成的准确性。The method and apparatus for generating information provided by the embodiments of the present application acquire an image to be recognized, extract a face image from it, and input the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image; then select, from a set of candidate recognition models, a candidate recognition model matching the obtained recognition result as a second recognition model; and finally input the extracted face image into the selected second recognition model to obtain key point information corresponding to the extracted face image, where the key point information is used to characterize the positions of key points within the face image. Face images can thus be recognized with pre-trained candidate recognition models for different categories of faces to generate key point information, so faces of different categories can be handled, improving the comprehensiveness of information generation; moreover, performing recognition with a candidate recognition model matching the category of the face image improves the accuracy of information generation.
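The flow summarized above (extract a face image, obtain its category from the first recognition model, pick the matching candidate model as the second recognition model, then generate key point information) can be sketched in a few lines. All model objects, category labels, and key point values below are illustrative stand-ins, not part of the disclosure:

```python
def generate_keypoint_info(image, extract_face, first_model, candidate_models):
    """Minimal sketch of the disclosed pipeline; every callable is a stand-in."""
    face = extract_face(image)                 # extract the face image
    category = first_model(face)               # recognition result: face category
    second_model = candidate_models[category]  # select the matching candidate model
    return second_model(face)                  # key point information

# Toy stand-ins that only illustrate the control flow:
extract = lambda img: img["face"]
classify = lambda face: "cat_face" if face.startswith("cat") else "human_face"
models = {
    "cat_face": lambda face: [("left_eye", 10, 12), ("nose", 20, 25)],
    "human_face": lambda face: [("left_eye", 30, 40), ("nose", 50, 55)],
}
result = generate_keypoint_info({"face": "cat_01"}, extract, classify, models)
print(result)  # → [('left_eye', 10, 12), ('nose', 20, 25)]
```

The design point of the pipeline is that the category-specific second model, rather than one universal model, produces the key points, which is what the disclosure credits for the accuracy gain.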
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
通过阅读参照以下附图所作的对非限制性实施例所作的详细描述,本申请的其它特征、目的和优点将会变得更明显:Other features, objects, and advantages of the present application will become more apparent by reading the detailed description of the non-limiting embodiments with reference to the following drawings:
图1是本申请的一个实施例可以应用于其中的示例性系统架构图;FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present application can be applied; FIG.
图2是根据本申请的用于生成信息的方法的一个实施例的流程图;2 is a flowchart of an embodiment of a method for generating information according to the present application;
图3是根据本申请实施例的用于生成信息的方法的一个应用场景的示意图;3 is a schematic diagram of an application scenario of a method for generating information according to an embodiment of the present application;
图4是根据本申请的用于生成信息的方法的又一个实施例的流程图;4 is a flowchart of still another embodiment of a method for generating information according to the present application;
图5是根据本申请的用于生成信息的装置的一个实施例的结构示意图;5 is a schematic structural diagram of an embodiment of an apparatus for generating information according to the present application;
图6是适于用来实现本申请实施例的电子设备的计算机系统的结构示意图。FIG. 6 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
具体实施方式Detailed Description
下面结合附图和实施例对本申请作进一步的详细说明。可以理解的是,此处所描述的具体实施例仅仅用于解释相关发明,而非对该发明的限定。另外还需要说明的是,为了便于描述,附图中仅示出了与有关发明相关的部分。The following describes the present application in detail with reference to the accompanying drawings and embodiments. It can be understood that the specific embodiments described herein are only used to explain the related invention, rather than limiting the invention. It should also be noted that, for the convenience of description, only the parts related to the related invention are shown in the drawings.
需要说明的是,在不冲突的情况下,本申请中的实施例及实施例中的特征可以相互组合。下面将参考附图并结合实施例来详细说明本申请。It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The application will be described in detail below with reference to the drawings and embodiments.
图1示出了可以应用本申请的用于生成信息的方法或用于生成信息的装置的实施例的示例性系统架构100。FIG. 1 illustrates an exemplary system architecture 100 of an embodiment of a method for generating information or an apparatus for generating information to which the present application can be applied.
如图1所示,系统架构100可以包括终端设备101、102、103,网络104和服务器105。网络104用以在终端设备101、102、103和服务器105之间提供通信链路的介质。网络104可以包括各种连接类 型,例如有线、无线通信链路或者光纤电缆等等。As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is a medium for providing a communication link between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various types of connections, such as wired, wireless communication links, or fiber optic cables, and so on.
用户可以使用终端设备101、102、103通过网络104与服务器105交互,以接收或发送消息等。终端设备101、102、103上可以安装有各种通讯客户端应用,例如图像处理类应用、美图软件、网页浏览器应用、搜索类应用、社交平台软件等。The user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications can be installed on the terminal devices 101, 102, and 103, such as image processing applications, Meitu software, web browser applications, search applications, and social platform software.
终端设备101、102、103可以是硬件，也可以是软件。当终端设备101、102、103为硬件时，可以是具有显示屏的各种电子设备，包括但不限于智能手机、平板电脑、电子书阅读器、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、膝上型便携计算机和台式计算机等等。当终端设备101、102、103为软件时，可以安装在上述所列举的电子设备中。其可以实现成多个软件或软件模块(例如用来提供分布式服务的多个软件或软件模块)，也可以实现成单个软件或软件模块。在此不做具体限定。The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and so on. When the terminal devices 101, 102, and 103 are software, they may be installed in the electronic devices listed above, implemented either as multiple pieces of software or software modules (for example, for providing distributed services) or as a single piece of software or software module. No specific limitation is made here.
服务器105可以是提供各种服务的服务器,例如对终端设备101、102、103发送的待识别图像进行处理的图像处理服务器。图像处理服务器可以对接收到的待识别图像等数据进行分析等处理,并获得处理结果(例如关键点信息)。The server 105 may be a server that provides various services, for example, an image processing server that processes an image to be identified sent by the terminal devices 101, 102, and 103. The image processing server may perform analysis and processing on the received data such as the image to be identified, and obtain a processing result (for example, key point information).
需要说明的是,服务器可以是硬件,也可以是软件。当服务器为硬件时,可以实现成多个服务器组成的分布式服务器集群,也可以实现成单个服务器。当服务器为软件时,可以实现成多个软件或软件模块(例如用来提供分布式服务的多个软件或软件模块),也可以实现成单个软件或软件模块。在此不做具体限定。It should be noted that the server may be hardware or software. When the server is hardware, it can be implemented as a distributed server cluster consisting of multiple servers or as a single server. When the server is software, it can be implemented as multiple software or software modules (for example, multiple software or software modules used to provide distributed services), or it can be implemented as a single software or software module. It is not specifically limited here.
应该理解,图1中的终端设备、网络和服务器的数目仅仅是示意性的。根据实现需要,可以具有任意数目的终端设备、网络和服务器。在待识别图像或者生成关键点信息的过程中所使用的数据不需要从远程获取的情况下,上述系统架构可以不包括网络,而只包括终端设备或服务器。It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely exemplary. According to implementation needs, there can be any number of terminal devices, networks, and servers. In the case that the data used in the process of identifying an image or generating keypoint information does not need to be obtained from a remote place, the above-mentioned system architecture may not include a network, but only a terminal device or a server.
继续参考图2,示出了根据本申请的用于生成信息的方法的一个实施例的流程200。该用于生成信息的方法,包括以下步骤:With continued reference to FIG. 2, a flowchart 200 of one embodiment of a method for generating information according to the present application is shown. The method for generating information includes the following steps:
步骤201,获取待识别图像。Step 201: Obtain an image to be identified.
在本实施例中,用于生成信息的方法的执行主体(例如图1所示的服务器)可以通过有线连接方式或者无线连接方式获取待识别图像。其中,待识别图像可以包括脸图像。具体的,待识别图像所包括的脸图像可以包括动物脸图像,也可以包括人脸图像。动物脸图像所对应的动物脸可以为各种类别的动物脸,例如狗脸、猫脸等。In this embodiment, an execution subject (for example, a server shown in FIG. 1) of the method for generating information may obtain an image to be identified through a wired connection method or a wireless connection method. The image to be identified may include a face image. Specifically, the face image included in the image to be identified may include an animal face image, and may also include a human face image. The animal face corresponding to the animal face image may be various types of animal faces, such as a dog face, a cat face, and the like.
需要说明的是,上述执行主体可以获取本地预先存储的待识别图像,也可以获取与之通信连接的电子设备(例如图1所示的终端设备)发送的待识别图像。It should be noted that the above-mentioned execution subject may obtain the image to be identified stored in advance locally, or may obtain the image to be identified sent by an electronic device (such as the terminal device shown in FIG. 1) communicatively connected thereto.
在本实施例的一些可选的实现方式中,上述执行主体可以从目标视频所对应的图像序列中选取图像作为待识别图像,其中,目标视频为可以为对脸进行拍摄所获得的视频。具体的,上述执行主体可以首先从本地或与之通信连接的电子设备获取目标视频所对应的图像序列,然后从图像序列中选取图像作为待识别图像。在这里,需要说明的是,上述执行主体可以采用各种方式从上述图像序列中选取待识别图像,例如,可以采用随机选取的方式;或者,可以从图像序列中选取排序在第一位的图像作为待识别图像。另外,可以理解的是,视频实质上是一个按照时间的先后顺序排列的图像序列,故任意一个视频都可以对应一个图像序列。In some optional implementation manners of this embodiment, the execution subject may select an image to be identified from an image sequence corresponding to a target video, where the target video is a video that can be obtained by shooting a face. Specifically, the execution subject may first obtain an image sequence corresponding to the target video from a local or electronic device connected thereto, and then select an image from the image sequence as an image to be identified. Here, it should be noted that the above-mentioned executing subject may select the image to be identified from the above-mentioned image sequence in various ways, for example, a random selection manner may be adopted; or, the first-ranked image may be selected from the image sequence. As the image to be identified. In addition, it can be understood that a video is essentially an image sequence arranged in chronological order, so any video can correspond to an image sequence.
步骤202,从待识别图像中提取脸图像,以及将所提取的脸图像输入预先训练的第一识别模型,获得所提取的脸图像所对应的识别结果。In step 202, a face image is extracted from the image to be recognized, and the extracted face image is input to a first trained first recognition model to obtain a recognition result corresponding to the extracted face image.
在本实施例中,基于步骤201中得到的待识别图像,上述执行主体可以首先从待识别图像中提取脸图像,而后将所提取的脸图像输入预先训练的第一识别模型,获得所提取的脸图像所对应的识别结果。其中,识别结果可以包括但不限于以下至少一项:文字、数字、符号、图像、音频,识别结果可以用于表征脸图像所对应的脸的类别。In this embodiment, based on the to-be-recognized image obtained in step 201, the execution subject may first extract a face image from the to-be-recognized image, and then input the extracted face image into a pre-trained first recognition model to obtain the extracted The recognition result corresponding to the face image. The recognition result may include, but is not limited to, at least one of the following: text, numbers, symbols, images, and audio. The recognition result may be used to characterize the category of the face corresponding to the face image.
具体的,上述执行主体可以通过各种方式从待识别图像中提取脸 图像。例如,可以采用图像分割技术中的阈值分割方法将待识别图像中的脸图像与其他图像区域分割开来,进而从待识别图像中提取出脸图像。需要说明的是,图像分割技术是目前广泛研究和应用的公知技术,此处不再赘述。Specifically, the execution subject may extract a face image from the image to be identified in various ways. For example, a threshold segmentation method in an image segmentation technique may be used to segment a face image in an image to be identified from other image regions, and then extract a face image from the image to be identified. It should be noted that the image segmentation technology is a well-known technology that is widely studied and applied at present, and will not be repeated here.
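As a rough illustration of the threshold-segmentation idea mentioned above, the toy sketch below treats pixels brighter than a fixed threshold as the face region and crops its bounding box. Plain Python lists stand in for a grayscale image; a real system would use an image-segmentation library and far more robust logic:

```python
def extract_by_threshold(gray, thresh):
    """Return the bounding-box crop of all pixels brighter than `thresh`.

    A toy stand-in for threshold segmentation: pixels above the threshold
    are treated as the face region, everything else as background.
    """
    coords = [(y, x) for y, row in enumerate(gray)
              for x, v in enumerate(row) if v > thresh]
    if not coords:
        return []
    top = min(y for y, _ in coords)
    bottom = max(y for y, _ in coords) + 1
    left = min(x for _, x in coords)
    right = max(x for _, x in coords) + 1
    return [row[left:right] for row in gray[top:bottom]]

img = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 9, 0],
    [0, 0, 0, 0],
]
print(extract_by_threshold(img, 5))  # → [[9, 8], [7, 9]]
```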
在本实施例的一些可选的实现方式中,上述执行主体也可以通过如下步骤从待识别图像中提取脸图像:In some optional implementations of this embodiment, the above-mentioned execution subject may also extract a face image from the image to be identified by the following steps:
步骤2021,将待识别图像输入预先训练的第三识别模型,获得用于表征待识别图像中的脸图像在待识别图像中的位置的位置信息。Step 2021: Input the to-be-recognized image into a pre-trained third recognition model, and obtain position information for characterizing a position of a face image in the to-be-recognized image in the to-be-recognized image.
其中,位置信息可以包括但不限于以下至少一项:文字、数字、符号、图像。作为示例,位置信息可以为在待识别图像中框选出脸图像的四边形图像。The location information may include, but is not limited to, at least one of the following: text, numbers, symbols, and images. As an example, the position information may be a quadrilateral image in which a face image is frame-selected in the image to be identified.
第三识别模型可以用于表征包括脸图像的待识别图像与用于表征脸图像在待识别图像中的位置的位置信息的对应关系。具体的,第三识别模型可以为基于训练样本,利用机器学习方法,对初始模型(例如卷积神经网络(Convolutional Neural Network,CNN)、残差网络(ResNet)等)进行训练后得到的模型。The third recognition model may be used to characterize the correspondence between the image to be identified including the face image and the position information used to characterize the position of the face image in the image to be identified. Specifically, the third recognition model may be a model obtained by training an initial model (such as a Convolutional Neural Network (CNN), a residual network (ResNet), etc.) based on training samples and using a machine learning method.
作为示例，上述第三识别模型可以通过如下步骤训练得到：首先，获取训练样本集，其中，训练样本可以包括包括样本脸图像的样本待识别图像，以及针对样本待识别图像中的样本脸图像预先标注的样本位置信息，其中，样本位置信息可以用于表征样本待识别图像中的样本脸图像在样本待识别图像中的位置。然后，可以从训练样本集中选取训练样本，并执行以下训练步骤：将所选取的训练样本的样本待识别图像输入初始模型，获得样本待识别图像中的样本脸图像所对应的位置信息；将所输入的样本待识别图像所对应的样本位置信息作为初始模型的期望输出，基于所获得的位置信息和样本位置信息，调整初始模型的参数；确定训练样本集中是否存在未被选取的训练样本；响应于不存在未被选取的训练样本，将调整后的初始模型确定为第三识别模型。As an example, the third recognition model may be trained as follows. First, a training sample set is obtained, where each training sample may include a sample image to be recognized that contains a sample face image, and sample position information pre-labeled for the sample face image in that image, the sample position information being used to characterize the position of the sample face image within the sample image to be recognized. Then, a training sample may be selected from the training sample set, and the following training steps performed: inputting the sample image to be recognized of the selected training sample into the initial model to obtain position information corresponding to the sample face image in it; taking the sample position information corresponding to the input image as the expected output of the initial model, and adjusting the parameters of the initial model based on the obtained position information and the sample position information; determining whether unselected training samples remain in the training sample set; and in response to determining that none remain, determining the adjusted initial model as the third recognition model.
在该示例中，还可以包括以下步骤：响应于确定存在未被选取的训练样本，从未被选取的训练样本中重新选取训练样本，以及将最近一次调整的初始模型作为新的初始模型，继续执行上述训练步骤。In this example, the following steps may also be included: in response to determining that unselected training samples remain, reselecting a training sample from the unselected training samples, taking the most recently adjusted initial model as a new initial model, and continuing to perform the above training steps.
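The training loop described in the example above (draw samples without replacement, compare the model output against the labeled sample position information, adjust parameters, repeat until no unselected samples remain) can be sketched as follows. The scalar "model" and the half-step update rule are illustrative assumptions only; a real implementation would run gradient updates on a neural network:

```python
import random

def train_third_model(params, predict, update, training_set, seed=0):
    """Sketch of the described loop: pop training samples at random until
    none remain unselected, adjusting the parameters after each sample.
    `predict` and `update` are hypothetical stand-ins for the initial
    model's forward pass and its parameter-adjustment step."""
    rng = random.Random(seed)
    remaining = list(training_set)
    while remaining:  # unselected training samples still exist
        sample_image, sample_position = remaining.pop(rng.randrange(len(remaining)))
        predicted = predict(params, sample_image)
        update(params, predicted, sample_position)
    return params  # the adjusted model becomes the third recognition model

# Toy model: predicts a constant position; update nudges it halfway to the label.
params = {"pos": 0.0}
predict = lambda p, img: p["pos"]
update = lambda p, pred, label: p.__setitem__("pos", p["pos"] + 0.5 * (label - pred))
trained = train_third_model(params, predict, update, [("a", 10.0), ("b", 10.0)])
print(trained["pos"])  # → 7.5
```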
需要说明的是,实践中,用于生成模型的步骤的执行主体可以与用于生成信息的方法的执行主体相同或者不同。如果相同,则用于生成模型的步骤的执行主体可以在训练得到模型后将训练好的模型存储在本地。如果不同,则用于生成模型的步骤的执行主体可以在训练得到模型后将训练好的模型发送给用于生成信息的方法的执行主体。It should be noted that, in practice, the execution subject of the steps used to generate the model may be the same as or different from the execution subject of the method used to generate the information. If they are the same, the execution subject of the step for generating the model can store the trained model locally after the model is trained. If they are different, the execution body of the step for generating the model may send the trained model to the execution body of the method for generating information after training to obtain the model.
步骤2022,基于所获得的位置信息,从待识别图像中提取脸图像。In step 2022, a face image is extracted from the image to be identified based on the obtained position information.
在这里,基于所获得的位置信息,上述执行主体可以采用各种方式从待识别图像中提取脸图像。例如,可以基于所获得的位置信息,对待识别图像进行裁剪,获得脸图像。Here, based on the obtained position information, the above-mentioned executing subject may extract a face image from the image to be identified in various ways. For example, a to-be-recognized image may be cropped based on the obtained position information to obtain a face image.
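Assuming the position information is encoded as an axis-aligned box `(top, left, bottom, right)` (this tuple layout is an assumption, not specified by the application), cropping the face image out of the image to be recognized reduces to simple slicing:

```python
def crop_face(image, box):
    # box = (top, left, bottom, right): an assumed encoding of the
    # position information produced by the third recognition model
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

frame = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(crop_face(frame, (0, 1, 2, 3)))  # → [[2, 3], [5, 6]]
```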
对于提取出的脸图像，上述执行主体可以生成脸图像所对应的识别结果。需要说明的是，在本实施例中，待识别图像可以包括至少一个脸图像，对于至少一个脸图像中的脸图像，上述执行主体可以将该脸图像输入预先训练的第一识别模型，获得该脸图像所对应的识别结果。其中，第一识别模型可以用于表征脸图像与脸图像所对应的识别结果的对应关系。For the extracted face image, the execution subject may generate a recognition result corresponding to it. It should be noted that, in this embodiment, the image to be recognized may include at least one face image, and for each face image among them, the execution subject may input that face image into the pre-trained first recognition model to obtain the corresponding recognition result, where the first recognition model may be used to characterize the correspondence between a face image and its recognition result.
具体的,第一识别模型可以为基于训练样本,利用机器学习方法,对初始模型(例如卷积神经网络、残差网络等)进行训练后得到的模型。Specifically, the first recognition model may be a model obtained by training an initial model (such as a convolutional neural network, a residual network, and the like) based on training samples and using a machine learning method.
可以理解的是，用于训练获得上述第一识别模型的执行主体可以采用与上述第三识别模型的训练方式相似的训练方式训练获得上述第一识别模型，具体的训练步骤此处不再赘述。需要特别说明的是，对于第一识别模型的训练，其所对应的训练样本集中的训练样本可以包括样本脸图像和针对样本脸图像预先标注的样本识别结果，其中，样本识别结果可以用于表征样本脸图像所对应的脸的类别。It can be understood that the execution subject that trains the first recognition model may train it in a manner similar to that of the third recognition model, so the specific training steps are not repeated here. It should be noted in particular that, for the training of the first recognition model, each training sample in the corresponding training sample set may include a sample face image and a sample recognition result pre-labeled for it, where the sample recognition result may be used to characterize the category of the face corresponding to the sample face image.
步骤203,从候选识别模型集合中选取与所获得的识别结果相匹配的候选识别模型作为第二识别模型。Step 203: Select a candidate recognition model that matches the obtained recognition result from the candidate recognition model set as the second recognition model.
在本实施例中,基于步骤202中得到的识别结果,上述执行主体可以从候选识别模型集合中选取与所获得的识别结果相匹配的候选识别模型作为第二识别模型。其中,候选识别模型集合中的候选识别模 型可以为预先训练的、用于识别不同类别的脸以生成关键点信息的模型。In this embodiment, based on the recognition result obtained in step 202, the execution entity may select a candidate recognition model that matches the obtained recognition result from the candidate recognition model set as the second recognition model. Among them, the candidate recognition models in the candidate recognition model set may be pre-trained models for identifying different classes of faces to generate key point information.
具体的,上述执行主体可以采用各种方法从候选识别模型集合中选取与所获得的识别结果相匹配的候选识别模型作为第二识别模型。例如,技术人员可以在上述执行主体中预先设置识别结果与候选识别模型集合中的候选识别模型的对应关系(例如对应关系表),进而,上述执行主体可以利用所获得的识别结果,查找上述对应关系,以确定出与所获得的识别结果相匹配的候选识别模型作为第二识别模型。Specifically, the execution body may use various methods to select a candidate recognition model that matches the obtained recognition result from the candidate recognition model set as the second recognition model. For example, a technician may preset a correspondence relationship between the recognition result and the candidate recognition model in the candidate recognition model set (for example, a correspondence table) in the execution body, and further, the execution body may use the obtained recognition result to find the correspondence. Relationship to determine a candidate recognition model that matches the obtained recognition result as a second recognition model.
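The correspondence-table approach described above can be as simple as a dictionary keyed by the recognition result; the category labels and model stand-ins below are hypothetical:

```python
# Hypothetical category labels; each value stands in for a pre-trained
# candidate recognition model specialized for that face category.
candidate_model_set = {
    "human_face": lambda face: "human keypoints",
    "cat_face": lambda face: "cat keypoints",
    "dog_face": lambda face: "dog keypoints",
}

def select_second_model(recognition_result):
    """Look up the candidate model matching the recognition result."""
    return candidate_model_set[recognition_result]

second_model = select_second_model("cat_face")
print(second_model("some face image"))  # → cat keypoints
```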
Alternatively, for each candidate recognition model in the candidate recognition model set, a technician may preset category information corresponding to that candidate recognition model, where the category information may be used to characterize the category of face that the candidate recognition model can recognize. The category information may include, but is not limited to, at least one of the following: numbers, text, symbols, pictures. The execution body may then match the obtained recognition result against the category information corresponding to the candidate recognition models (for example, by performing a similarity calculation), and determine a candidate recognition model that matches the obtained recognition result (for example, one whose similarity score is greater than or equal to a preset threshold) as the second recognition model.
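The similarity-based alternative might be sketched as follows. The use of `difflib` string similarity and the 0.8 threshold are assumptions made for illustration; the patent leaves the similarity calculation unspecified.

```python
from difflib import SequenceMatcher

def select_by_similarity(recognition_result, candidates, threshold=0.8):
    # candidates: iterable of (category_info, model) pairs preset by a
    # technician. Return the model whose category information is most
    # similar to the recognition result, provided the similarity score
    # is greater than or equal to the preset threshold; otherwise None.
    best_model, best_score = None, threshold
    for category_info, model in candidates:
        score = SequenceMatcher(None, recognition_result, category_info).ratio()
        if score >= best_score:
            best_model, best_score = model, score
    return best_model
```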
In this embodiment, a candidate recognition model may be used to characterize the correspondence between a face image and the keypoint information corresponding to the face image. The keypoint information may be used to characterize the positions of the keypoints of the face image within the face image. The keypoints of a face image may be points with clear semantic distinctiveness, for example, points characterizing the nose or points characterizing the eyes.
Specifically, a candidate recognition model may be a model obtained by training an initial model (for example, a convolutional neural network or a residual network) on training samples using a machine learning method.
It can be understood that the execution body used to train the candidate recognition models may obtain them using a training method similar to that of the third recognition model described above; the specific training steps are not repeated here. It should be particularly noted that, for the training of each candidate recognition model in the candidate recognition model set, the training samples in the corresponding training sample set may include sample face images and sample keypoint information pre-labeled for the sample face images, where the categories of the faces corresponding to the sample face images may be the same (for example, all cat faces, or all dog faces). The sample keypoint information may be used to characterize the positions of the keypoints in the sample face images.
It should also be noted that, based on step 202, the execution body may extract at least one face image, and for each extracted face image the execution body may obtain one recognition result; further, for each obtained recognition result, based on step 203, the execution body may select one candidate recognition model as the second recognition model corresponding to that recognition result.
Step 204: input the extracted face image into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image.
In this embodiment, based on the second recognition model selected in step 203, the execution body may input the extracted face image into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image. It should be noted that, for each extracted face image, the execution body may input that face image into the second recognition model corresponding to it to obtain the keypoint information corresponding to that face image. It can be understood that the correspondence between a face image and a second recognition model may be determined by the correspondence between the recognition result of the face image and the second recognition model.
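Steps 202–204 taken together can be sketched as the following per-face pipeline. The `extract_faces` helper and the model callables are hypothetical stand-ins for the trained models described above, not part of the disclosure.

```python
def generate_keypoint_info(image, extract_faces, first_model, candidate_models):
    # For each face image extracted from the image to be recognized:
    # classify it with the first recognition model, select the matching
    # second recognition model, and run that model to get keypoint info.
    results = []
    for face_image in extract_faces(image):
        recognition_result = first_model(face_image)          # step 202
        second_model = candidate_models[recognition_result]   # step 203
        results.append(second_model(face_image))              # step 204
    return results
```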
In practice, after generating the keypoint information corresponding to the face image, the execution body may further determine, by back-projection, the keypoint information corresponding to the image to be recognized, where the keypoint information corresponding to the image to be recognized may be used to characterize the positions, in the image to be recognized, of its keypoints (that is, the keypoints included in the face image within the image to be recognized).
In some optional implementations of this embodiment, a candidate recognition model (second recognition model) may be used to characterize the correspondence between a face image and both the keypoint information and matching information corresponding to the face image, where the matching information may include, but is not limited to, at least one of the following: numbers, text, symbols, images, audio. Specifically, the matching information may include a matching index used to characterize the degree of matching between the category of the face corresponding to the input face image and the category of face corresponding to the second recognition model.
Here, the magnitude of the matching index may correspond to the degree of matching. Specifically, the correspondence may be that the larger the matching index, the higher the degree of matching; or that the smaller the matching index, the higher the degree of matching.
It should be noted that, for a candidate recognition model (second recognition model) in this implementation, the training samples in the corresponding training sample set may include sample face images together with sample keypoint information and sample matching information pre-labeled for the sample face images, where the sample keypoint information may be used to characterize the positions of the keypoints in the sample face images. The sample matching information may include a sample matching index, which may be used to characterize the degree of matching between the category of the face corresponding to the input sample face image and the category of face predetermined for the candidate recognition model. Here, the correspondence between the sample matching index and the degree of matching may be preset by a technician; for example, it may be set so that the larger the sample matching index, the higher the degree of matching.
In this implementation, based on the second recognition model, the execution body may input the extracted face image into the second recognition model to obtain the keypoint information and matching information corresponding to the extracted face image. The second recognition model can thus determine the degree of matching between the input face image and the model and generate matching information, so that subsequent operations (for example, re-selecting the second recognition model) can be performed based on the matching information, further improving the accuracy of information processing.
In some optional implementations of this embodiment, when the image to be recognized is an image selected from the image sequence corresponding to a target video, after inputting the extracted face image into the selected second recognition model and obtaining the keypoint information and matching information corresponding to the extracted face image, the execution body may further perform the following information generation steps:
First, select, from the image sequence, the image that follows and is adjacent to the image to be recognized as a candidate image to be recognized.
Then, extract a face image from the candidate image to be recognized as a candidate face image, and determine the extracted face image in the image to be recognized as a reference face image.
Here, the execution body may extract a face image from the candidate image to be recognized as the candidate face image using the face image extraction method described above for the image to be recognized, which is not repeated here.
Next, determine whether the matching index in the matching information corresponding to the determined reference face image meets a preset condition.
The preset condition may be used to constrain the degree of matching between the category of the face corresponding to the reference face image and the category of face corresponding to the second recognition model into which the reference face image was input. Specifically, a technician may preset a matching threshold. When the correspondence between the matching index and the degree of matching is that a larger matching index means a higher degree of matching, the preset condition may be that the matching index is greater than or equal to the matching threshold; when the correspondence is that a smaller matching index means a higher degree of matching, the preset condition may be that the matching index is less than or equal to the matching threshold.
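The two forms of the preset condition can be expressed directly; the parameter names below are assumptions introduced for illustration.

```python
def meets_preset_condition(matching_index, matching_threshold, larger_means_better=True):
    # When a larger matching index means a higher degree of matching,
    # require the index to be greater than or equal to the threshold;
    # otherwise require it to be less than or equal to the threshold.
    if larger_means_better:
        return matching_index >= matching_threshold
    return matching_index <= matching_threshold
```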
Finally, in response to determining that the matching index in the matching information corresponding to the determined reference face image meets the preset condition, input the extracted candidate face image into the second recognition model into which the determined reference face image was input, to obtain the keypoint information and matching information corresponding to the extracted candidate face image.
It can be understood that, for any image in the image sequence that follows the image to be recognized, the matching information and second recognition model corresponding to the preceding image can be used, with the method described in this implementation, to generate the keypoint information and matching information corresponding to the face image in that image; for the specific steps, refer to the information generation steps above, which are not repeated here.
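The frame-to-frame propagation above can be sketched as follows. This is a hypothetical sketch assuming each second model returns a (keypoints, matching_index) pair, a larger index means a better match, and the helper callables stand in for the trained models.

```python
def process_sequence(frames, extract_face, first_model, candidate_models, threshold):
    # Reuse the previous frame's second recognition model while its
    # matching index still meets the preset condition; otherwise
    # re-select a model via the first recognition model.
    outputs, second_model = [], None
    for frame in frames:
        face_image = extract_face(frame)
        if second_model is None:
            second_model = candidate_models[first_model(face_image)]
        keypoints, matching_index = second_model(face_image)
        outputs.append(keypoints)
        if matching_index < threshold:
            second_model = None  # force re-selection on the next frame
    return outputs
```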
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the method for generating information according to this embodiment. In the application scenario of FIG. 3, the server 301 first obtains an image to be recognized 303 sent by the terminal device 302, where the image to be recognized 303 includes face images 3031 and 3032. Then, the server 301 extracts the face image 3031 and the face image 3032 from the image to be recognized 303, and inputs each of them into the pre-trained first recognition model 304, obtaining the recognition result "cat" 3051 corresponding to the face image 3031 and the recognition result "dog" 3052 corresponding to the face image 3032. Next, the server 301 may obtain a candidate recognition model set 306, which includes candidate recognition models 3061, 3062, and 3063. A technician has preset the correspondence between the candidate recognition models and recognition results as follows: the candidate recognition model 3061 corresponds to the recognition result "cat"; the candidate recognition model 3062 corresponds to the recognition result "dog"; the candidate recognition model 3063 corresponds to the recognition result "human". The server 301 may then select, from the candidate recognition model set 306, the candidate recognition model 3061 that matches the obtained recognition result "cat" 3051 as the second recognition model 3071 corresponding to the face image 3031, and the candidate recognition model 3062 that matches the obtained recognition result "dog" 3052 as the second recognition model 3072 corresponding to the face image 3032. Finally, the server 301 may input the face image 3031 into the second recognition model 3071 to obtain the keypoint information 3081 corresponding to the face image 3031, and input the face image 3032 into the second recognition model 3072 to obtain the keypoint information 3082 corresponding to the face image 3032, where the keypoint information may be used to characterize the positions of the keypoints in the face images.
The method provided by the above embodiments of the present application obtains an image to be recognized; extracts a face image from the image to be recognized and inputs the extracted face image into a pre-trained first recognition model to obtain the recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image; selects, from a candidate recognition model set, a candidate recognition model that matches the obtained recognition result as a second recognition model; and finally inputs the extracted face image into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image, where the keypoint information is used to characterize the positions of the keypoints in the face image. Face images can thus be recognized by pre-trained candidate recognition models for recognizing faces of different categories to generate keypoint information, so that face images of different categories can be recognized, improving the comprehensiveness of information generation; moreover, performing recognition with a candidate recognition model that matches the category of the face image improves the accuracy of information generation.
With further reference to FIG. 4, a flow 400 of yet another embodiment of the method for generating information is shown. The flow 400 of the method for generating information includes the following steps:
Step 401: obtain an image to be recognized.
In this embodiment, the execution body of the method for generating information (for example, the server shown in FIG. 1) may obtain the image to be recognized through a wired or wireless connection. The image to be recognized may include a face image. Specifically, the face image included in the image to be recognized may include an animal face image or a human face image. The animal face corresponding to an animal face image may be of various categories, for example, a dog face or a cat face.
It should be noted that step 401 may be implemented in a manner similar to step 201 in the foregoing embodiment. Accordingly, the description of step 201 above also applies to step 401 of this embodiment and is not repeated here.
Step 402: extract a face image from the image to be recognized, and input the extracted face image into a pre-trained first recognition model to obtain the recognition result and reference keypoint information corresponding to the extracted face image.
In this embodiment, based on the image to be recognized obtained in step 401, the execution body may first extract a face image from the image to be recognized, and then input the extracted face image into the pre-trained first recognition model to obtain the recognition result and reference keypoint information corresponding to the extracted face image. The recognition result may include, but is not limited to, at least one of the following: text, numbers, symbols, images, audio. The recognition result may be used to characterize the category of the face corresponding to the face image. The reference keypoint information may include, but is not limited to, at least one of the following: text, numbers, symbols, images. The reference keypoint information may be used to characterize the positions of the reference keypoints in the face image. A reference keypoint may be a point used to determine the keypoints in the face image, for example, the point at the tip of the nose or a point at a corner of the mouth.
In this embodiment, the execution body may extract the face image from the image to be recognized using the face image extraction method of the embodiment corresponding to FIG. 2, which is not repeated here.
It should be noted that the first recognition model in this embodiment may be used to characterize the correspondence between a face image and both the recognition result and the reference keypoint information corresponding to the face image. Accordingly, for the training of the first recognition model in this embodiment, the training samples in the corresponding training sample set may include sample face images together with sample recognition results and sample reference keypoint information pre-labeled for the sample face images, where the sample reference keypoint information may be used to characterize the positions of the reference keypoints in the sample face images.
Step 403: select, from the candidate recognition model set, a candidate recognition model that matches the obtained recognition result as a second recognition model.
In this embodiment, based on the recognition result obtained in step 402, the execution body may select, from the candidate recognition model set, a candidate recognition model that matches the obtained recognition result as the second recognition model.
The candidate recognition models in the candidate recognition model set may be pre-trained models for recognizing faces of different categories to generate keypoint information. A candidate recognition model may be used to characterize the correspondence between a face image together with its reference keypoint information and the keypoint information of the face image. Specifically, a candidate recognition model may be a model obtained by training an initial model (for example, a convolutional neural network or a residual network) on training samples using a machine learning method.
As an example, a candidate recognition model may be trained as follows. First, obtain a training sample set, where a training sample may include a sample face image and sample keypoint information pre-labeled for the sample face image. Then, select a training sample from the training sample set and perform the following model training steps: extract the sample reference keypoint information corresponding to the sample face image of the selected training sample, where the sample reference keypoint information may be used to characterize the positions of the reference keypoints in the sample face image; input the sample face image of the selected training sample and the extracted sample reference keypoint information into the initial model to obtain keypoint information; take the sample keypoint information corresponding to the input sample face image as the desired output of the initial model, and adjust the parameters of the initial model based on the obtained keypoint information and the sample keypoint information; determine whether any unselected training samples remain in the training sample set; and, in response to there being no unselected training samples, determine the adjusted initial model as the candidate recognition model.
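The model training steps above can be sketched with a deliberately toy model. The `ToyKeypointModel` (a single scalar weight trained by gradient steps on squared error) is purely an illustration of "adjust the parameters based on the obtained and sample keypoint information"; the patent's initial model would be a convolutional or residual network.

```python
class ToyKeypointModel:
    # Minimal stand-in for the initial model: predicts one scalar
    # "keypoint position" as w * reference_keypoint.
    def __init__(self, w=0.0, lr=0.1):
        self.w, self.lr = w, lr

    def forward(self, reference):
        return self.w * reference

    def adjust(self, reference, target):
        # One gradient step on squared error between the obtained
        # keypoint and the labelled sample keypoint.
        pred = self.forward(reference)
        self.w -= self.lr * 2 * (pred - target) * reference


def train_candidate_model(model, samples, extract_reference, epochs=200):
    # samples: list of (sample_face_image, sample_keypoint) pairs; the
    # reference keypoint info is extracted from the face image, as in
    # the model training steps described in the text.
    for _ in range(epochs):
        for face_image, sample_keypoint in samples:
            reference = extract_reference(face_image)
            model.adjust(reference, sample_keypoint)
    return model  # the adjusted model becomes the candidate model
```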
It should be noted here that the execution body performing the above model training steps may extract the sample reference keypoint information corresponding to a sample face image in various ways. For example, the sample face image may be input into the first recognition model of step 402 of this embodiment to obtain the sample reference keypoint information corresponding to the sample face image; alternatively, the sample face image may be output, and the sample reference keypoint information labeled by a user for the sample face image may be obtained.
In this example, the following step may also be included: in response to determining that unselected training samples remain, reselect a training sample from the unselected training samples, take the most recently adjusted initial model as the new initial model, and continue performing the above model training steps.
In addition, it should be noted that the selection of the second recognition model in this embodiment may be implemented in a manner similar to the selection method of the second recognition model in the embodiment corresponding to FIG. 2, which is not repeated here.
Step 404: input the extracted face image and the obtained reference keypoint information into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image.
In this embodiment, based on the second recognition model selected in step 403, the execution body may input the extracted face image and the obtained reference keypoint information into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image. It should be noted that, for each extracted face image, the execution body may input that face image and the reference keypoint information corresponding to it into the second recognition model corresponding to that face image to obtain the corresponding keypoint information. It can be understood that the correspondence between a face image and a second recognition model may be determined by the correspondence between the recognition result of the face image and the second recognition model.
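The flow of FIG. 4 differs from FIG. 2 in that the second model receives two inputs. A minimal sketch, with all callables as hypothetical stand-ins for the trained models:

```python
def generate_keypoints_with_reference(face_image, first_model, candidate_models):
    # The first recognition model now yields both the recognition result
    # and the reference keypoint information (step 402); the selected
    # second model takes the face image together with that reference
    # information (step 404).
    recognition_result, reference_keypoints = first_model(face_image)
    second_model = candidate_models[recognition_result]   # step 403
    return second_model(face_image, reference_keypoints)
```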
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the method for generating information in this embodiment highlights the steps of generating the reference keypoint information corresponding to the face image and of generating the keypoint information corresponding to the face image based on that reference keypoint information. The solution described in this embodiment can therefore use the reference keypoint information as a reference to generate more accurate keypoint information, further improving the accuracy of information generation.
With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for generating information. The apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus may be applied to various electronic devices.
As shown in FIG. 5, the apparatus 500 for generating information in this embodiment includes an image acquisition unit 501, a first input unit 502, a model selection unit 503, and a second input unit 504. The image acquisition unit 501 is configured to obtain an image to be recognized, where the image to be recognized includes a face image. The first input unit 502 is configured to extract the face image from the image to be recognized and input the extracted face image into a pre-trained first recognition model to obtain the recognition result corresponding to the extracted face image, where the recognition result is used to characterize the category of the face corresponding to the face image. The model selection unit 503 is configured to select, from a candidate recognition model set, a candidate recognition model that matches the obtained recognition result as a second recognition model, where the candidate recognition models in the candidate recognition model set are pre-trained models for recognizing faces of different categories to generate keypoint information. The second input unit 504 is configured to input the extracted face image into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image, where the keypoint information is used to characterize the positions of the keypoints in the face image.
In this embodiment, the image acquisition unit 501 of the apparatus 500 for generating information may obtain the image to be recognized through a wired or wireless connection. The image to be recognized may include a face image. Specifically, the face image included in the image to be recognized may include an animal face image or a human face image. The animal face corresponding to an animal face image may be of various categories, for example, a dog face or a cat face.
In this embodiment, based on the image to be recognized obtained by the image acquisition unit 501, the first input unit 502 may first extract the face image from the image to be recognized, and then input the extracted face image into the pre-trained first recognition model to obtain the recognition result corresponding to the extracted face image. The recognition result may include, but is not limited to, at least one of the following: text, numbers, symbols, images, audio. The recognition result may be used to characterize the category of the face corresponding to the face image.
It should be noted that, in this embodiment, the image to be recognized may include at least one face image. For each face image among the at least one face image, the first input unit 502 may input that face image into the pre-trained first recognition model to obtain the recognition result corresponding to that face image, where the first recognition model may be used to characterize the correspondence between a face image and the recognition result corresponding to the face image.
In this embodiment, based on the recognition result obtained by the first input unit 502, the model selection unit 503 may select, from a set of candidate recognition models, a candidate recognition model that matches the obtained recognition result as the second recognition model. The candidate recognition models in the set may be pre-trained models for recognizing faces of different categories to generate keypoint information.
In this embodiment, a candidate recognition model may be used to characterize the correspondence between a face image and the keypoint information corresponding to the face image. The keypoint information may be used to characterize the positions of the keypoints of the face image within the face image. The keypoints of a face image may be points with clear semantic distinctiveness, such as points characterizing the nose or points characterizing the eyes.
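The selection step performed by the model selection unit 503 can be sketched as a lookup from the recognition result to a category-specific keypoint model. The per-category stub models and the coordinate values below are hypothetical; a real candidate model would be a trained network.

```python
# Sketch of the candidate recognition model set: one pre-trained keypoint
# model per face category. The stubs return fixed keypoint information
# (pixel positions within the face image) purely for illustration.

def dog_keypoint_model(face_image):
    return {"nose": (52, 80), "left_eye": (30, 40), "right_eye": (74, 40)}

def cat_keypoint_model(face_image):
    return {"nose": (50, 70), "left_eye": (28, 38), "right_eye": (72, 38)}

CANDIDATE_MODELS = {"dog": dog_keypoint_model, "cat": cat_keypoint_model}

def select_second_model(recognition_result):
    """Pick the candidate model matching the obtained recognition result."""
    return CANDIDATE_MODELS[recognition_result]

second_model = select_second_model("dog")
print(second_model({})["nose"])
```

Keeping one specialized model per category is what lets each second model focus on the keypoint layout of a single kind of face (dog, cat, human, and so on).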
In this embodiment, based on the second recognition model selected by the model selection unit 503, the second input unit 504 may input the extracted face image into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image. It should be noted that, for each extracted face image, the second input unit 504 may input that face image into the second recognition model corresponding to that face image to obtain the keypoint information corresponding to that face image. It can be understood that the correspondence between a face image and a second recognition model may be determined by the correspondence between the recognition result of the face image and the second recognition model.
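Putting units 502 through 504 together, the overall inference path can be sketched as follows. Every function here is a hypothetical stub, since the patent specifies the data flow between the units but not the models themselves.

```python
# End-to-end sketch: extract the face image, let the first model produce
# a recognition result, select the matching second model, and obtain the
# keypoint information from it.

def extract_face_image(image):
    return image["face_region"]            # stand-in for real cropping

def first_recognition_model(face_image):
    return face_image["category_hint"]     # stand-in for a classifier

CANDIDATE_MODELS = {
    "cat": lambda face: {"nose": (50, 70)},
    "dog": lambda face: {"nose": (52, 80)},
}

def generate_keypoint_info(image):
    face = extract_face_image(image)
    category = first_recognition_model(face)    # recognition result
    second_model = CANDIDATE_MODELS[category]   # model selection
    return second_model(face)                   # keypoint information

print(generate_keypoint_info({"face_region": {"category_hint": "cat"}}))
```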
In some optional implementations of this embodiment, the first input unit 502 may be further configured to input the extracted face image into the pre-trained first recognition model to obtain the recognition result and reference keypoint information corresponding to the extracted face image, where the reference keypoint information may be used to characterize the positions of the reference keypoints of the face image within the face image; and the second input unit 504 may be further configured to input the extracted face image and the obtained reference keypoint information into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image.
In some optional implementations of this embodiment, the first input unit 502 may include: a first input module (not shown), configured to input the image to be recognized into a pre-trained third recognition model to obtain position information characterizing the position of the face image within the image to be recognized; and an image extraction module (not shown), configured to extract the face image from the image to be recognized based on the obtained position information.
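This optional extraction path can be sketched as a detector that returns the position information as a bounding box, followed by a crop. Nested lists stand in for image data, and the fixed box returned by the stub detector is invented for the example.

```python
# Hypothetical third recognition model plus image extraction module.

def third_recognition_model(image):
    """Return position information (x, y, width, height) of the face."""
    return (1, 1, 2, 2)   # a trained detector would compute this box

def extract_face_image(image, position):
    """Crop the face region described by `position` out of `image`."""
    x, y, w, h = position
    return [row[x:x + w] for row in image[y:y + h]]

image = [
    [0, 0, 0, 0],
    [0, 5, 6, 0],
    [0, 7, 8, 0],
    [0, 0, 0, 0],
]
face = extract_face_image(image, third_recognition_model(image))
print(face)
```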
In some optional implementations of this embodiment, the second input unit 504 may be further configured to input the extracted face image into the selected second recognition model to obtain the keypoint information and matching information corresponding to the extracted face image, where the matching information may include a matching index characterizing the degree of matching between the category of the face corresponding to the input face image and the category of the face corresponding to the second recognition model.
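One way to read this is that the second recognition model returns a (keypoint information, matching information) pair. The matching index below is a made-up confidence value in [0, 1]; the patent does not fix its representation, and the stub model is hypothetical.

```python
# Sketch of a category-specific second recognition model that also emits
# matching information alongside the keypoint information.

def dog_second_model(face_image):
    keypoint_info = {"nose": (52, 80), "left_eye": (30, 40)}
    # Matching index: degree of match between the input face's category
    # and this model's category ("dog"). Values are illustrative.
    match_index = 0.92 if face_image["category"] == "dog" else 0.15
    return keypoint_info, {"match_index": match_index}

keypoints, matching_info = dog_second_model({"category": "dog"})
print(matching_info["match_index"])
```

A low matching index signals that the face fed to this model probably belongs to a different category, which is exactly the signal the tracking optimization below relies on.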
In some optional implementations of this embodiment, the image acquisition unit 501 may be further configured to select an image from the image sequence corresponding to a target video as the image to be recognized, where the target video may be a video obtained by filming a face.
In some optional implementations of this embodiment, the apparatus 500 may further include: an image selection unit (not shown), configured to select, from the image sequence, the image that follows and is adjacent to the image to be recognized as a candidate image to be recognized; an image determination unit (not shown), configured to extract a face image from the candidate image to be recognized as a candidate face image, and to determine the face image extracted from the image to be recognized as a reference face image; a condition determination unit (not shown), configured to determine whether the matching index in the matching information corresponding to the determined reference face image satisfies a preset condition; and a third input unit (not shown), configured to, in response to determining that it does, input the extracted candidate face image into the second recognition model into which the determined reference face image was input, to obtain the keypoint information and matching information corresponding to the extracted candidate face image.
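The frame-to-frame shortcut described above can be sketched as a loop over adjacent video frames: when the reference face's matching index meets the preset condition, the next frame's face is fed straight to the same second model, skipping re-classification. The threshold value and the stub model are invented for the example.

```python
# Sketch of reusing a second recognition model across adjacent frames.

MATCH_THRESHOLD = 0.8  # hypothetical preset condition

def second_model(face_image):
    # Returns (keypoint_info, match_index); the index is stubbed from the
    # input so the control flow below is observable.
    return {"nose": (50, 70)}, face_image["match"]

def count_model_reuses(frames):
    """Count adjacent frames processed without re-running the first model."""
    reused = 0
    for reference, candidate in zip(frames, frames[1:]):
        _, match_index = second_model(reference)
        if match_index >= MATCH_THRESHOLD:
            second_model(candidate)   # reuse the same second model
            reused += 1
        # else: fall back to classify-then-select for `candidate` (omitted)
    return reused

frames = [{"match": 0.9}, {"match": 0.85}, {"match": 0.4}, {"match": 0.9}]
print(count_model_reuses(frames))
```

Skipping the first model on high-confidence frames is a latency optimization: in a video of a single face, most frames can reuse the already-selected category model.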
It can be understood that the units recorded in the apparatus 500 correspond to the respective steps of the method described with reference to FIG. 2. Accordingly, the operations, features, and beneficial effects described above for the method also apply to the apparatus 500 and the units included therein, and are not repeated here.
In the apparatus 500 provided by the above embodiment of the present application, the image acquisition unit 501 acquires an image to be recognized; the first input unit 502 then extracts a face image from the image to be recognized and inputs the extracted face image into a pre-trained first recognition model to obtain the recognition result corresponding to the extracted face image, where the recognition result characterizes the category of the face corresponding to the face image; the model selection unit 503 then selects, from a set of candidate recognition models, a candidate recognition model matching the obtained recognition result as the second recognition model; and finally the second input unit 504 inputs the extracted face image into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image, where the keypoint information characterizes the positions of the keypoints of the face image within the face image. In this way, face images can be recognized using pre-trained candidate recognition models for faces of different categories to generate keypoint information, so that faces of different categories can be recognized, improving the comprehensiveness of information generation; moreover, performing recognition with the candidate recognition model that matches the category of the face image improves the accuracy of the generated information.
Reference is now made to FIG. 6, which shows a schematic structural diagram of a computer system 600 suitable for implementing an electronic device (for example, the terminal device/server shown in FIG. 1) of the embodiments of the present application. The electronic device shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as needed.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-mentioned functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable medium may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code carried on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, and the like, or any suitable combination of the foregoing.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including an image acquisition unit, a first input unit, a model selection unit, and a second input unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the image acquisition unit may also be described as "a unit for acquiring an image to be recognized".
As another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire an image to be recognized, where the image to be recognized includes a face image; extract a face image from the image to be recognized, and input the extracted face image into a pre-trained first recognition model to obtain the recognition result corresponding to the extracted face image, where the recognition result characterizes the category of the face corresponding to the face image; select, from a set of candidate recognition models, a candidate recognition model matching the obtained recognition result as a second recognition model, where the candidate recognition models in the set are pre-trained models for recognizing faces of different categories to generate keypoint information; and input the extracted face image into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image, where the keypoint information characterizes the positions of the keypoints of the face image within the face image.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, but also covers other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, a technical solution formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (14)

  1. A method for generating information, comprising:
    acquiring an image to be recognized, wherein the image to be recognized includes a face image;
    extracting a face image from the image to be recognized, and inputting the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, wherein the recognition result is used to characterize a category of a face corresponding to the face image;
    selecting, from a set of candidate recognition models, a candidate recognition model matching the obtained recognition result as a second recognition model, wherein the candidate recognition models in the set are pre-trained models for recognizing faces of different categories to generate keypoint information;
    inputting the extracted face image into the selected second recognition model to obtain keypoint information corresponding to the extracted face image, wherein the keypoint information is used to characterize positions of keypoints of the face image within the face image.
  2. The method according to claim 1, wherein the inputting the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image comprises:
    inputting the extracted face image into the pre-trained first recognition model to obtain the recognition result and reference keypoint information corresponding to the extracted face image, wherein the reference keypoint information is used to characterize positions of reference keypoints of the face image within the face image; and
    the inputting the extracted face image into the selected second recognition model to obtain keypoint information corresponding to the extracted face image comprises:
    inputting the extracted face image and the obtained reference keypoint information into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image.
  3. The method according to claim 1 or 2, wherein the extracting a face image from the image to be recognized comprises:
    inputting the image to be recognized into a pre-trained third recognition model to obtain position information characterizing a position of the face image within the image to be recognized;
    extracting the face image from the image to be recognized based on the obtained position information.
  4. The method according to claim 3, wherein the inputting the extracted face image into the selected second recognition model to obtain keypoint information corresponding to the extracted face image comprises:
    inputting the extracted face image into the selected second recognition model to obtain the keypoint information and matching information corresponding to the extracted face image, wherein the matching information includes a matching index characterizing a degree of matching between the category of the face corresponding to the input face image and the category of the face corresponding to the second recognition model.
  5. The method according to claim 4, wherein the acquiring an image to be recognized comprises:
    selecting an image from an image sequence corresponding to a target video as the image to be recognized, wherein the target video is a video obtained by filming a face.
  6. The method according to claim 5, wherein after the inputting the extracted face image into the selected second recognition model to obtain the keypoint information and matching information corresponding to the extracted face image, the method further comprises:
    selecting, from the image sequence, an image that follows and is adjacent to the image to be recognized as a candidate image to be recognized;
    extracting a face image from the candidate image to be recognized as a candidate face image, and determining the face image extracted from the image to be recognized as a reference face image;
    determining whether the matching index in the matching information corresponding to the determined reference face image satisfies a preset condition;
    in response to determining that it does, inputting the extracted candidate face image into the second recognition model into which the determined reference face image was input, to obtain keypoint information and matching information corresponding to the extracted candidate face image.
  7. An apparatus for generating information, comprising:
    an image acquisition unit configured to acquire an image to be recognized, wherein the image to be recognized includes a face image;
    a first input unit configured to extract a face image from the image to be recognized, and to input the extracted face image into a pre-trained first recognition model to obtain a recognition result corresponding to the extracted face image, wherein the recognition result is used to characterize a category of a face corresponding to the face image;
    a model selection unit configured to select, from a set of candidate recognition models, a candidate recognition model matching the obtained recognition result as a second recognition model, wherein the candidate recognition models in the set are pre-trained models for recognizing faces of different categories to generate keypoint information;
    a second input unit configured to input the extracted face image into the selected second recognition model to obtain keypoint information corresponding to the extracted face image, wherein the keypoint information is used to characterize positions of keypoints of the face image within the face image.
  8. The apparatus according to claim 7, wherein the first input unit is further configured to:
    input the extracted face image into the pre-trained first recognition model to obtain the recognition result and reference keypoint information corresponding to the extracted face image, wherein the reference keypoint information is used to characterize positions of reference keypoints of the face image within the face image; and
    the second input unit is further configured to:
    input the extracted face image and the obtained reference keypoint information into the selected second recognition model to obtain the keypoint information corresponding to the extracted face image.
  9. The apparatus according to claim 7 or 8, wherein the first input unit comprises:
    a first input module configured to input the image to be recognized into a pre-trained third recognition model to obtain position information characterizing a position of the face image within the image to be recognized;
    an image extraction module configured to extract the face image from the image to be recognized based on the obtained position information.
  10. The apparatus according to claim 9, wherein the second input unit is further configured to:
    input the extracted face image into the selected second recognition model to obtain the keypoint information and matching information corresponding to the extracted face image, wherein the matching information includes a matching index characterizing a degree of matching between the category of the face corresponding to the input face image and the category of the face corresponding to the second recognition model.
  11. The apparatus according to claim 10, wherein the image acquisition unit is further configured to:
    select an image from an image sequence corresponding to a target video as the image to be recognized, wherein the target video is a video obtained by filming a face.
  12. The apparatus according to claim 11, wherein the apparatus further comprises:
    an image selection unit configured to select, from the image sequence, an image that follows and is adjacent to the image to be recognized as a candidate image to be recognized;
    an image determination unit configured to extract a face image from the candidate image to be recognized as a candidate face image, and to determine the face image extracted from the image to be recognized as a reference face image;
    a condition determination unit configured to determine whether the matching index in the matching information corresponding to the determined reference face image satisfies a preset condition;
    a third input unit configured to, in response to determining that it does, input the extracted candidate face image into the second recognition model into which the determined reference face image was input, to obtain keypoint information and matching information corresponding to the extracted candidate face image.
  13. An electronic device, comprising:
    one or more processors;
    a storage device on which one or more programs are stored,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-6.
  14. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-6.
PCT/CN2018/116182 2018-07-27 2018-11-19 Method and device used for generating information WO2020019591A1 (en)

Applications Claiming Priority (2)

CN201810846313.7A (CN109034069B), priority date 2018-07-27, filing date 2018-07-27: Method and apparatus for generating information
CN201810846313.7, priority date 2018-07-27

Publications (1)

Publication Number Publication Date
WO2020019591A1 2020-01-30

Family

ID=64647253


Country Status (2)

Country Link
CN (1) CN109034069B (en)
WO (1) WO2020019591A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740567A (en) * 2019-01-18 2019-05-10 北京旷视科技有限公司 Key point location model training method, localization method, device and equipment
CN109919244B (en) * 2019-03-18 2021-09-07 北京字节跳动网络技术有限公司 Method and apparatus for generating a scene recognition model
CN110347134A (en) * 2019-07-29 2019-10-18 南京图玩智能科技有限公司 A kind of AI intelligence aquaculture specimen discerning method and cultivating system
CN110688894A (en) * 2019-08-22 2020-01-14 平安科技(深圳)有限公司 Palm key point extraction method and device
CN115240230A (en) * 2022-09-19 2022-10-25 星宠王国(北京)科技有限公司 Canine face detection model training method and device, and detection method and device

Citations (5)

Publication number Priority date Publication date Assignee Title
US20070183665A1 (en) * 2006-02-06 2007-08-09 Mayumi Yuasa Face feature point detecting device and method
CN104715227A (en) * 2013-12-13 2015-06-17 北京三星通信技术研究有限公司 Method and device for locating key points of human face
CN105512627A (en) * 2015-12-03 2016-04-20 腾讯科技(深圳)有限公司 Key point positioning method and terminal
CN105760836A (en) * 2016-02-17 2016-07-13 厦门美图之家科技有限公司 Multi-angle face alignment method based on deep learning and system thereof and photographing terminal
CN106295591A (en) * 2016-08-17 2017-01-04 乐视控股(北京)有限公司 Gender identification method based on facial image and device

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
CN1266642C (en) * 2003-10-09 2006-07-26 重庆大学 Multi-category-based human face classifying and identifying method
WO2008139093A2 (en) * 2007-04-06 2008-11-20 France Telecom Determination of a model of image category
CN107103269A (en) * 2016-02-23 2017-08-29 芋头科技(杭州)有限公司 One kind expression feedback method and intelligent robot
CN107491726B (en) * 2017-07-04 2020-08-04 重庆邮电大学 Real-time expression recognition method based on multichannel parallel convolutional neural network
CN108197644A (en) * 2017-12-27 2018-06-22 深圳市大熊动漫文化有限公司 A kind of image-recognizing method and device

Cited By (8)

Publication number Priority date Publication date Assignee Title
CN111461352A (en) * 2020-04-17 2020-07-28 支付宝(杭州)信息技术有限公司 Model training method, business node identification method, device and electronic equipment
CN111461352B (en) * 2020-04-17 2023-05-09 蚂蚁胜信(上海)信息技术有限公司 Model training method, service node identification device and electronic equipment
CN112241709A (en) * 2020-10-21 2021-01-19 北京字跳网络技术有限公司 Image processing method, and training method and device of beard transformation network
CN112926479A (en) * 2021-03-08 2021-06-08 新疆爱华盈通信息技术有限公司 Cat face identification method and system, electronic device and storage medium
CN113221767A (en) * 2021-05-18 2021-08-06 北京百度网讯科技有限公司 Method for training living body face recognition model and method for recognizing living body face and related device
CN113221767B (en) * 2021-05-18 2023-08-04 北京百度网讯科技有限公司 Method for training living body face recognition model and recognizing living body face and related device
CN113808044A (en) * 2021-09-17 2021-12-17 北京百度网讯科技有限公司 Encryption mask determining method, device, equipment and storage medium
CN113808044B (en) * 2021-09-17 2022-11-01 北京百度网讯科技有限公司 Encryption mask determining method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN109034069B (en) 2021-04-09
CN109034069A (en) 2018-12-18

Similar Documents

Publication Publication Date Title
WO2020019591A1 (en) Method and device used for generating information
CN109726624B (en) Identity authentication method, terminal device and computer readable storage medium
CN107492379B (en) Voiceprint creating and registering method and device
KR102394756B1 (en) Method and apparatus for processing video
WO2020006961A1 (en) Image extraction method and device
CN108509915B (en) Method and device for generating face recognition model
WO2019242222A1 (en) Method and device for use in generating information
WO2020024484A1 (en) Method and device for outputting data
WO2020000876A1 (en) Model generating method and device
CN110740389B (en) Video positioning method, video positioning device, computer readable medium and electronic equipment
WO2020029466A1 (en) Image processing method and apparatus
US11126827B2 (en) Method and system for image identification
CN108549848B (en) Method and apparatus for outputting information
WO2020006964A1 (en) Image detection method and device
CN110363220B (en) Behavior class detection method and device, electronic equipment and computer readable medium
JP7394809B2 (en) Methods, devices, electronic devices, media and computer programs for processing video
WO2021083069A1 (en) Method and device for training face swapping model
WO2020029608A1 (en) Method and apparatus for detecting burr of electrode sheet
CN109582825B (en) Method and apparatus for generating information
KR20200109239A (en) Image processing method, device, server and storage medium
WO2022105118A1 (en) Image-based health status identification method and apparatus, device and storage medium
CN109947971B (en) Image retrieval method, image retrieval device, electronic equipment and storage medium
CN112507090A (en) Method, apparatus, device and storage medium for outputting information
CN108399401B (en) Method and device for detecting face image
WO2020007191A1 (en) Method and apparatus for living body recognition and detection, and medium and electronic device

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18927263

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 18.05.2021)

122 Ep: PCT application non-entry in European phase

Ref document number: 18927263

Country of ref document: EP

Kind code of ref document: A1