CN109829432A - Method and apparatus for generating information - Google Patents
Method and apparatus for generating information
- Publication number
- CN109829432A (application CN201910099415.1A)
- Authority
- CN
- China
- Prior art keywords
- video frame
- sample
- key point
- face key point information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
Embodiments of the disclosure disclose a method and apparatus for generating information. One specific embodiment of the method includes: extracting a target video frame and a reference video frame of the target video frame from a target face video, where the reference video frame is adjacent to the target video frame; determining face key point information corresponding to the reference video frame and, based on the determined face key point information and a preset image, generating a heatmap corresponding to the reference video frame, where the image region of the heatmap includes a set of values, each value in the set characterizing the probability that a face key point is located at that value's position; and inputting the target video frame, the reference video frame, and the generated heatmap into a pre-trained first recognition model to obtain face key point information corresponding to the target video frame. This embodiment helps reduce jitter of face key points across consecutive video frames and improves the stability of face key point localization.
Description
Technical field
Embodiments of the disclosure relate to the field of computer technology, and in particular to a method and apparatus for generating information.
Background
With the popularization of mobile video software, various video processing algorithms have come into wide use. Video face key point tracking, one of the basic video processing functions, is also widely applied.
Existing video face key point tracking is generally implemented on top of image-based face key point detection, i.e., the face key points for each frame are obtained from that frame's face image alone.
Summary of the invention
Embodiments of the disclosure propose a method and apparatus for generating information.
In a first aspect, an embodiment of the disclosure provides a method for generating information, the method comprising: extracting a target video frame and a reference video frame of the target video frame from a target face video, where the reference video frame is adjacent to the target video frame; determining face key point information corresponding to the reference video frame, and generating, based on the determined face key point information and a preset image, a heatmap corresponding to the reference video frame, where the preset image has the same shape and size as the reference video frame, the image region of the heatmap includes a set of values, and each value in the set characterizes the probability that a face key point is located at that value's position; and inputting the target video frame, the reference video frame, and the generated heatmap into a pre-trained first recognition model to obtain face key point information corresponding to the target video frame.
In some embodiments, determining the face key point information corresponding to the reference video frame comprises: inputting the reference video frame into a pre-trained second recognition model to obtain the face key point information corresponding to the reference video frame.
In some embodiments, the second recognition model is trained as follows: obtaining a training sample set, where each training sample includes a sample face image and sample face key point information annotated in advance for the sample face image; and, using a machine learning method, training with the sample face image of each training sample as input and the sample face key point information corresponding to the input sample face image as desired output, to obtain the second recognition model.
In some embodiments, the first recognition model is obtained through the following training steps: obtaining multiple sample video frame groups, where each sample video frame group includes two adjacent video frames extracted from a sample face video; for each sample video frame group among the multiple sample video frame groups, performing the following steps: determining a sample target video frame and a sample reference video frame from the sample video frame group; determining face key point information corresponding to the sample reference video frame in the group, and determining face key point information corresponding to the sample target video frame in the group as sample face key point information; generating a sample heatmap based on the face key point information corresponding to the sample reference video frame and the preset image; and composing a training sample from the sample video frame group, the generated sample heatmap, and the sample face key point information of the sample target video frame; then, using a machine learning method, training with the sample video frame group and sample heatmap included in each composed training sample as input and the sample face key point information of the corresponding sample target video frame as desired output, to obtain the first recognition model.
In some embodiments, determining the face key point information corresponding to the sample target video frame in the sample video frame group as sample face key point information comprises: determining initial face key point information corresponding to the sample target video frame in the group; and, based on weights assigned in advance to the face key point information of the sample reference video frame and to the initial face key point information of the sample target video frame respectively, performing weighted summation on the determined face key point information of the sample reference video frame and the determined initial face key point information of the sample target video frame, and taking the result as the sample face key point information of the sample target video frame in the group.
In some embodiments, generating the heatmap corresponding to the reference video frame based on the determined face key point information and the preset image comprises: generating, on the preset image and using a Gaussian function, the set of values corresponding to the face key point information of the reference video frame; and generating the heatmap corresponding to the reference video frame based on the preset image including the generated set of values.
In a second aspect, an embodiment of the disclosure provides an apparatus for generating information, the apparatus comprising: an extraction unit configured to extract a target video frame and a reference video frame of the target video frame from a target face video, where the reference video frame is adjacent to the target video frame; a determination unit configured to determine face key point information corresponding to the reference video frame and to generate, based on the determined face key point information and a preset image, a heatmap corresponding to the reference video frame, where the preset image has the same shape and size as the reference video frame, the image region of the heatmap includes a set of values, and each value in the set characterizes the probability that a face key point is located at that value's position; and a generation unit configured to input the target video frame, the reference video frame, and the generated heatmap into a pre-trained first recognition model to obtain face key point information corresponding to the target video frame.
In some embodiments, the determination unit is further configured to: input the reference video frame into a pre-trained second recognition model to obtain the face key point information corresponding to the reference video frame.
In some embodiments, the second recognition model is trained as follows: obtaining a training sample set, where each training sample includes a sample face image and sample face key point information annotated in advance for the sample face image; and, using a machine learning method, training with the sample face image of each training sample as input and the sample face key point information corresponding to the input sample face image as desired output, to obtain the second recognition model.
In some embodiments, the first recognition model is obtained through the following training steps: obtaining multiple sample video frame groups, where each sample video frame group includes two adjacent video frames extracted from a sample face video; for each sample video frame group among the multiple sample video frame groups, performing the following steps: determining a sample target video frame and a sample reference video frame from the sample video frame group; determining face key point information corresponding to the sample reference video frame in the group, and determining face key point information corresponding to the sample target video frame in the group as sample face key point information; generating a sample heatmap based on the face key point information corresponding to the sample reference video frame and the preset image; and composing a training sample from the sample video frame group, the generated sample heatmap, and the sample face key point information of the sample target video frame; then, using a machine learning method, training with the sample video frame group and sample heatmap included in each composed training sample as input and the sample face key point information of the corresponding sample target video frame as desired output, to obtain the first recognition model.
In some embodiments, determining the face key point information corresponding to the sample target video frame in the sample video frame group as sample face key point information comprises: determining initial face key point information corresponding to the sample target video frame in the group; and, based on weights assigned in advance to the face key point information of the sample reference video frame and to the initial face key point information of the sample target video frame respectively, performing weighted summation on the determined face key point information of the sample reference video frame and the determined initial face key point information of the sample target video frame, and taking the result as the sample face key point information of the sample target video frame in the group.
In some embodiments, the determination unit includes: a first generation module configured to generate, on the preset image and using a Gaussian function, the set of values corresponding to the face key point information of the reference video frame; and a second generation module configured to generate the heatmap corresponding to the reference video frame based on the preset image including the generated set of values.
In a third aspect, an embodiment of the disclosure provides an electronic device comprising: one or more processors; and a storage device on which one or more programs are stored, the one or more programs, when executed by the one or more processors, causing the one or more processors to implement the method of any of the embodiments of the method for generating information described above.
In a fourth aspect, an embodiment of the disclosure provides a computer-readable medium on which a computer program is stored, the program, when executed by a processor, implementing the method of any of the embodiments of the method for generating information described above.
The method and apparatus for generating information provided by embodiments of the disclosure extract a target video frame and a reference video frame of the target video frame from a target face video, where the reference video frame is adjacent to the target video frame; determine face key point information corresponding to the reference video frame and, based on the determined face key point information and a preset image, generate a heatmap corresponding to the reference video frame, where the preset image has the same shape and size as the reference video frame and the image region of the heatmap includes a set of values, each value characterizing the probability that a face key point is located at that value's position; and input the target video frame, the reference video frame, and the generated heatmap into a pre-trained first recognition model to obtain face key point information corresponding to the target video frame. The face key point information of the reference video frame thus serves as reference data for generating the face key point information of the target video frame, which helps reduce jitter of face key points across consecutive video frames and improves the stability of face key point localization.
Brief description of the drawings
Other features, objects, and advantages of the disclosure will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram in which an embodiment of the disclosure may be applied;
Fig. 2 is a flowchart of one embodiment of the method for generating information according to the disclosure;
Fig. 3 is a schematic diagram of a heatmap of an embodiment of the disclosure;
Fig. 4 is a schematic diagram of an application scenario of the method for generating information according to an embodiment of the disclosure;
Fig. 5 is a flowchart of another embodiment of the method for generating information according to the disclosure;
Fig. 6 is a structural schematic diagram of one embodiment of the apparatus for generating information according to the disclosure;
Fig. 7 is a structural schematic diagram of a computer system adapted to implement an electronic device of an embodiment of the disclosure.
Specific embodiment
The disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related invention, and are not limitations on that invention. It should also be noted that, for convenience of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments in the disclosure and the features in the embodiments may be combined with one another. The disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for generating information or of the apparatus for generating information of the disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as video processing software, image processing software, web browser applications, search applications, instant messaging tools, and social platform software.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server providing various services, such as a background video processing server that processes the target face video shot by the terminal devices 101, 102, 103. The background video processing server may analyze and otherwise process the received data, such as the target face video, and obtain a processing result (for example, the face key point information corresponding to a target video frame).
It should be noted that the method for generating information provided by embodiments of the disclosure may be executed by the server 105 or by the terminal devices 101, 102, 103; correspondingly, the apparatus for generating information may be provided in the server 105 or in the terminal devices 101, 102, 103.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative; any number of terminal devices, networks, and servers may be provided according to implementation needs. In the case where the data used in generating the face key point information corresponding to the target video frame in the target face video does not need to be obtained remotely, the above system architecture may include no network and only a terminal device or a server.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating information according to the disclosure is shown. The method for generating information comprises the following steps:
Step 201: extracting a target video frame and a reference video frame of the target video frame from a target face video.
In the present embodiment, the executing subject of the method for generating information (for example, the server shown in Fig. 1) may obtain the target face video through a wired or wireless connection, and extract the target video frame and the reference video frame of the target video frame from the target face video. Here, the target face video is the face video on which face key point detection is to be performed. A face video may be a video obtained by shooting a face, and the video frames included in a face video include face images. In practice, face key points may be the key points on a face, specifically points that affect the face contour or the shapes of the facial features; as an example, a face key point may be a point corresponding to the nose or a point corresponding to an eye. The target face video may be stored in the executing subject, or may be sent to the executing subject by an electronic device in communication connection with the executing subject (for example, a terminal device shown in Fig. 1).
In the present embodiment, the target video frame is the video frame whose corresponding face key point information is to be determined. Face key point information characterizes the positions of face key points in a video frame and may include, but is not limited to, at least one of the following: text, numbers, symbols, images. The reference video frame of the target video frame is a video frame used for determining the face key point information corresponding to the target video frame. Here, the reference video frame is adjacent to the target video frame. Specifically, in the video frame sequence corresponding to the target face video, the reference video frame may be the video frame adjacent to and before the target video frame, or the video frame adjacent to and after the target video frame.
In the present embodiment, the executing subject may extract the target video frame and the reference video frame from the target face video in various ways. For example, the target video frame may first be extracted at random from the target face video, and then a video frame adjacent to it extracted as the reference video frame; alternatively, the video frame with the highest clarity among the video frames included in the target face video may first be extracted as the target video frame, and then a video frame adjacent to it extracted as the reference video frame. It should be noted that, after the target video frame is extracted, whether the video frame before it or the video frame after it is extracted as the reference video frame may be predetermined by a technician, or may be random.
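As a concrete illustration of the extraction options above, the following Python sketch picks a target frame at random and takes an adjacent frame as its reference frame. The function name and the selection policy are our own illustrative assumptions, not prescribed by the disclosure:

```python
import random

def extract_frame_pair(frames, prefer_previous=True, seed=None):
    """Pick a target frame from a face-video frame sequence and an
    adjacent frame as its reference frame (illustrative sketch)."""
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    rng = random.Random(seed)
    # One of the options in the text: choose the target frame at random.
    idx = rng.randrange(len(frames))
    # Use the previous adjacent frame when possible, otherwise the next one;
    # the disclosure leaves this choice to the technician or to chance.
    if prefer_previous and idx > 0:
        ref_idx = idx - 1
    elif idx + 1 < len(frames):
        ref_idx = idx + 1
    else:
        ref_idx = idx - 1
    return frames[idx], frames[ref_idx]
```

The returned pair always satisfies the adjacency constraint the method requires.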
Step 202: determining face key point information corresponding to the reference video frame, and generating, based on the determined face key point information and a preset image, a heatmap corresponding to the reference video frame.
In the present embodiment, based on the reference video frame obtained in step 201, the executing subject may determine the face key point information corresponding to the reference video frame by various methods. For example, the executing subject may output the reference video frame for display, and obtain the face key point information that a user marks for the reference video frame.
In the present embodiment, based on the determined face key point information and a preset image, the executing subject may generate a heatmap corresponding to the reference video frame. Here, the preset image may be an image set in advance for generating the heatmap, and may have the same shape and size as the reference video frame. In addition, the preset image may include only a background image, without any foreground image. The executing subject may then add values onto the preset image to generate the heatmap. The image region of the heatmap includes a set of values; each value in the set characterizes the probability that a face key point is located at that value's position. It can be understood that, since the heatmap has the same shape and size as the reference video frame, the positions of the values in the heatmap can correspond to positions in the reference video frame, and the heatmap can thus indicate the positions of face key points in the reference video frame.
It should be noted that the heatmap may include at least two sets of values, where each set of values may correspond to face key point information of one face key point of the reference video frame.
Specifically, the value at the position in the heatmap corresponding to the position characterized by the face key point information in the reference video frame may be 1. The value corresponding to each other position in the heatmap may decrease gradually with that position's distance from the position with value 1; that is, the farther a position is from the value 1, the smaller its corresponding value.
It should be noted that the position of a value in the heatmap may be determined by the smallest rectangle surrounding the value. Specifically, the center of the smallest rectangle may be determined as the value's position, or an endpoint of the smallest rectangle may be determined as the value's position.
In the present embodiment, the executing subject may use the face key point information of the reference video frame to generate, in various ways, the set of values corresponding to that information on the preset image, and thereby obtain the heatmap corresponding to the reference video frame.
In some optional implementations of the present embodiment, the executing subject may generate the heatmap corresponding to the reference video frame through the following steps: first, the executing subject may use a Gaussian function to generate, on the preset image, the set of values corresponding to the face key point information of the reference video frame; then, based on the preset image including the generated set of values, the executing subject may generate the heatmap corresponding to the reference video frame. Specifically, the preset image including the set of values may be determined directly as the heatmap, or image processing (for example, adding a background color) may be performed on it and the processed image determined as the heatmap.
In this implementation, the executing subject may use position as the independent variable of the Gaussian function and the value corresponding to the position as the dependent variable, and thus determine the set of values from positions. It can be understood that, here, the position corresponding to the value 1 (i.e., the face key point) is the independent variable corresponding to the mathematical expectation of the Gaussian function.
As an example, Fig. 3 shows a schematic diagram of a heatmap of an embodiment of the disclosure. The figure includes one set of values 301; the face key point information corresponding to the set of values 301 is the face key point information corresponding to the face key point 302. As shown in Fig. 3, the value at the position of the face key point 302 is 1. As the distance from the face key point 302 increases, the values decrease gradually from 0.8 to 0.4, and then from 0.4 to 0.1. It should be noted that, here, the values at positions not marked in the heatmap may be 0. The position of a value may be determined by the smallest rectangle surrounding the value (for example, reference numeral 303).
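The Gaussian rendering described above can be sketched as follows. The disclosure only fixes a peak of 1 at the key point, with values decaying with distance; the `sigma` parameter and the per-pixel maximum taken across key points are illustrative assumptions:

```python
import numpy as np

def make_heatmap(height, width, keypoints, sigma=2.0):
    """Render one Gaussian 'set of values' per face key point onto a
    zero-initialized preset image of the same shape as the frame."""
    ys, xs = np.mgrid[0:height, 0:width]
    heatmap = np.zeros((height, width), dtype=np.float32)
    for (kx, ky) in keypoints:
        # Peak value 1 at the key point; values decay with distance, the
        # key point being the mean (expectation) of the Gaussian function.
        g = np.exp(-((xs - kx) ** 2 + (ys - ky) ** 2) / (2.0 * sigma ** 2))
        heatmap = np.maximum(heatmap, g)  # per-pixel max across key points
    return heatmap

hm = make_heatmap(64, 64, [(20, 30), (40, 15)])
```

With `sigma=2.0`, the value one pixel from a key point is about 0.88 and falls toward 0 farther out, consistent with the 1 → 0.8 → 0.4 → 0.1 decay in the Fig. 3 example.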
Step 203: inputting the target video frame, the reference video frame, and the generated heatmap into a pre-trained first recognition model to obtain face key point information corresponding to the target video frame.
In the present embodiment, based on the target video frame and reference video frame obtained in step 201 and the heatmap obtained in step 202, the executing subject may input the target video frame, the reference video frame, and the generated heatmap into a pre-trained first recognition model to obtain the face key point information corresponding to the target video frame.
In the present embodiment, the first recognition model may be used to characterize the correspondence between, on the one hand, the target video frame, the reference video frame, and the heatmap corresponding to the reference video frame and, on the other hand, the face key point information corresponding to the target video frame. Specifically, as an example, the first recognition model may be a mapping table, pre-established by a technician based on statistics over a large number of target video frames, reference video frames, heatmaps corresponding to the reference video frames, and the face key point information corresponding to the target video frames, which stores multiple target video frames, reference video frames, heatmaps corresponding to reference video frames, and the face key point information corresponding to the corresponding target video frames; or it may be a model obtained by training an initial model (for example, a neural network) on preset training samples using a machine learning method.
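The disclosure does not specify how the three inputs are presented to a neural-network first model. One common convention, assumed here purely for illustration, is to concatenate the target frame, the reference frame, and one heatmap channel per key point along the channel axis of a convolutional network's input:

```python
import numpy as np

def assemble_model_input(target_frame, reference_frame, heatmaps):
    """Stack target frame, reference frame, and key-point heatmaps into one
    channels-first tensor; channel concatenation is an assumed design, not
    mandated by the disclosure."""
    assert target_frame.shape == reference_frame.shape
    h, w, _ = target_frame.shape
    assert heatmaps.shape[1:] == (h, w)  # one heatmap channel per key point
    tgt = np.transpose(target_frame, (2, 0, 1)).astype(np.float32) / 255.0
    ref = np.transpose(reference_frame, (2, 0, 1)).astype(np.float32) / 255.0
    return np.concatenate([tgt, ref, heatmaps.astype(np.float32)], axis=0)

x = assemble_model_input(
    np.zeros((64, 64, 3), dtype=np.uint8),   # target video frame
    np.zeros((64, 64, 3), dtype=np.uint8),   # reference video frame
    np.zeros((68, 64, 64), dtype=np.float32) # e.g. 68 key-point heatmaps
)
```

The 68-point count is a common face-landmark convention used only as an example; the resulting tensor has 3 + 3 + 68 channels.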
In some optional implementations of the present embodiment, the first identification model can be obtained by following steps training
:
Step 2031, multiple Sample video frame groups are obtained.
Wherein, Sample video frame group includes extract from sample face video, adjacent two video frames.Sample people
Face video is to carry out shooting face video obtained to face.Specifically, various methods can be used from sample face video
Disease extracts Sample video frame group.Such as can be extracted using the method extracted at random, alternatively, sample face video institute can be extracted
Two video frames of predeterminated position are arranged in corresponding sequence of frames of video as Sample video frame group.
Step 2032, for the Sample video frame group in multiple Sample video frame groups, following steps are executed: being regarded from the sample
Sample object video frame and sample REF video frame are determined in frequency frame group;Determine the sample REF video in the Sample video frame group
Face key point information corresponding to frame, and determine face corresponding to the sample object video frame in the Sample video frame group
Key point information is as sample face key point information;Based on face key point information corresponding to sample REF video frame and in advance
If image, sample hotspot graph is generated;Utilize the Sample video frame group, sample hotspot graph generated and sample object video frame
Sample face key point information forms training sample.
Herein, sample REF video frame is for determining face key point information corresponding to sample object video frame
Video frame, specifically, can determine sample object video frame and sample benchmark from the Sample video frame group using various methods
Video frame.For example, a Sample video frame can be selected as sample object video frame from the Sample video frame group at random, then
Non-selected Sample video frame is sample REF video frame in the Sample video frame group;Alternatively, the Sample video can be determined
Sample video frame putting in order in the sequence of frames of video corresponding to sample face video in frame group can will arrange in turn
The posterior Sample video frame of sequence is determined as sample object video frame, and the preceding Sample video frame that will sort is determined as sample benchmark view
Frequency frame.
It in this implementation, can be using crucial with the face described in step 202, for determining REF video frame
The similar method of the method for point information determines the key of face corresponding to the sample REF video frame in the Sample video frame group
Point information, details are not described herein again.
Here, various methods may be used to determine the sample face key point information corresponding to the sample target video frame in the sample video frame group. For example, a method similar to the one used above for determining the face key point information corresponding to the sample reference video frame in the group may be applied to determine the sample face key point information corresponding to the sample target video frame in the group.
In some optional implementations of the present embodiment, the face key point information corresponding to the sample target video frame in the sample video frame group may be determined as the sample face key point information through the following steps. First, the initial face key point information corresponding to the sample target video frame in the group is determined. Then, based on weights assigned in advance to the face key point information of the sample reference video frame and to the initial face key point information of the sample target video frame respectively, a weighted summation is performed on the determined face key point information of the sample reference video frame and the determined initial face key point information of the sample target video frame, and the result is taken as the sample face key point information of the sample target video frame in the sample video frame group.
Here, the initial face key point information characterizes the position of an initial face key point in the sample target video frame, and may include but is not limited to at least one of: numbers, text, symbols, images. The initial face key point information of the sample target video frame serves as the basis of the sample face key point information of the sample target video frame and is used to determine it. Specifically, the initial face key point information of the sample target video frame in the sample video frame group may be determined using a method similar to the one described in step 202 for determining the face key point information of the reference video frame.
In this implementation, the weights assigned in advance to the face key point information of the sample reference video frame and to the initial face key point information of the sample target video frame characterize the degree to which each influences the sample face key point information of the sample target video frame. Specifically, the larger the assigned weight, the higher the characterized influence on the sample face key point information of the sample target video frame.
As an example, suppose the face key point information of the sample reference video frame is the coordinates (14, 5) of a face key point in the sample reference video frame, and the initial face key point information of the sample target video frame is the coordinates (13, 6) of the initial face key point in the sample target video frame. The weights assigned in advance to the face key point information of the sample reference video frame and to the initial face key point information of the sample target video frame are 0.4 and 0.6 respectively. The sample face key point information of the sample target video frame is then (13.4, 5.6), where 13.4 = 14 × 0.4 + 13 × 0.6 and 5.6 = 5 × 0.4 + 6 × 0.6.
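The weighted summation above is a plain per-coordinate convex combination; a minimal sketch, with the function name and default weights taken only from the worked example:

```python
def weighted_keypoint(ref_point, init_point, ref_weight=0.4, init_weight=0.6):
    """Weighted sum of the reference frame's key point coordinates and the
    target frame's initial key point coordinates (weights sum to 1)."""
    return tuple(ref_weight * r + init_weight * i
                 for r, i in zip(ref_point, init_point))

# Reproducing the example above: reference point (14, 5),
# initial point (13, 6), weights 0.4 and 0.6.
x = weighted_keypoint((14, 5), (13, 6))
print(tuple(round(v, 6) for v in x))  # -> (13.4, 5.6)
```

In practice the same combination would be applied to every key point of the face, not just one coordinate pair.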
It can be appreciated that this implementation combines the face key point information corresponding to the sample reference video frame when generating the sample face key point information corresponding to the sample target video frame. The generated sample face key point information thus incorporates features of the face key points in the sample reference video frame, which helps to enhance the continuity of face key points between adjacent video frames and to improve the accuracy of the sample face key point information corresponding to the sample target video frame.
Furthermore, the sample heat map corresponding to the sample reference video frame may be generated, based on the face key point information corresponding to the sample reference video frame and the preset image, using the method described in step 202 for generating the heat map corresponding to the reference video frame; details are not repeated here.
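Since the heat map values characterize the probability of a key point at each position and (as a later implementation notes) may be generated with a Gaussian function, a minimal rendering sketch looks like this. The image size, sigma, and function name are illustrative assumptions, not values from the patent:

```python
import numpy as np

def gaussian_heatmap(height, width, keypoint, sigma=2.0):
    """Render one face key point as a 2-D Gaussian on a blank image with
    the same height/width as the reference video frame.  Each value in
    the resulting array characterizes the probability that the key point
    lies at that position, peaking at 1.0 on the key point itself."""
    x0, y0 = keypoint
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2.0 * sigma ** 2))

hm = gaussian_heatmap(32, 32, keypoint=(14, 5))
print(hm.shape, float(hm.max()))  # (32, 32) 1.0, maximum at pixel (14, 5)
```

For multiple key points, one such channel per key point (or an elementwise maximum over them) is a common layout.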
Step 2033: using a machine learning method, take the sample video frame group and sample heat map included in each training sample of the formed training samples as input, take the sample face key point information of the sample target video frame corresponding to the input sample video frame group and sample heat map as the desired output, and train to obtain the first identification model.
Here, a machine learning method may be used as follows: the sample video frame group and sample heat map included in a training sample of the formed training samples serve as the input of an initial model, the sample face key point information of the sample target video frame corresponding to the input sample video frame group and sample heat map serves as the desired output of the initial model, the initial model is trained, and the final trained result is the first identification model.
Here, various existing convolutional neural network structures may be used as the initial model for training. A convolutional neural network is a feedforward neural network whose artificial neurons respond to surrounding units within part of the coverage area; it performs outstandingly on image processing, so a convolutional neural network may be used to process the sample video frame groups and sample heat maps in the formed training samples. It should be noted that other models with image processing capability may also be used as the initial model, not limited to convolutional neural networks; the specific model structure may be set according to actual needs and is not limited here.
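Whatever network is chosen, the frame pair and heat map must be packed into one tensor the model can consume; channel-wise concatenation is a common choice, though the patent does not fix the layout. A minimal sketch under that assumption (shapes and names are hypothetical):

```python
import numpy as np

def stack_model_input(target_frame, ref_frame, heatmaps):
    """Concatenate the target frame, reference frame and per-key-point
    heat maps along the channel axis, so a single convolutional initial
    model can process all three inputs together."""
    # target_frame, ref_frame: (H, W, 3) RGB images; heatmaps: (H, W, K)
    return np.concatenate([target_frame, ref_frame, heatmaps], axis=-1)

x = stack_model_input(np.zeros((64, 64, 3)), np.zeros((64, 64, 3)),
                      np.zeros((64, 64, 68)))
print(x.shape)  # (64, 64, 74): 3 + 3 + 68 channels
```

The resulting array would be fed to the initial model, whose output head regresses the sample face key point coordinates.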
It should be noted that, in practice, the executing body of the steps for generating the model may be the same as or different from the executing body of the method for generating information. If they are the same, the executing body of the model-generating steps may store the trained model locally after training. If they are different, the executing body of the model-generating steps may send the trained model to the executing body of the method for generating information after training.
With continued reference to Fig. 4, Fig. 4 is a schematic diagram of an application scenario of the method for generating information according to the present embodiment. In the application scenario of Fig. 4, the server 401 may first obtain a locally pre-stored target face video 402, and extract a target video frame 403 and the reference video frame 404 of the target video frame 403 from the target face video 402, where the reference video frame 404 is adjacent to the target video frame 403; for example, the reference video frame 404 may be the video frame adjacent to and located before the target video frame 403.
Then, the server 401 may determine the face key point information 405 corresponding to the reference video frame 404 and, based on the determined face key point information 405 and a preset image 406, generate a heat map 407 corresponding to the reference video frame 404, where the preset image 406 has the same shape and size as the reference video frame 404, and the image region of the heat map 407 includes a set of values, each value characterizing the probability that a face key point is at that value's position.
Finally, the server 401 may input the target video frame 403, the reference video frame 404 and the generated heat map 407 into a pre-trained first identification model 408 to obtain the face key point information 409 corresponding to the target video frame 403.
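The server-side scenario above can be sketched as a loop over the frames of a face video, with the preceding frame always acting as the reference. The two models and the heat-map renderer are passed in as hypothetical callables; this shows only the data flow, not any real model:

```python
def detect_keypoints(video_frames, second_model, first_model, make_heatmap):
    """For each target frame, use the preceding frame as reference:
    obtain the reference frame's key points (initially via the second
    model), render the heat map from them, then run the first model on
    the target frame, reference frame and heat map together."""
    results = []
    prev_frame = video_frames[0]
    prev_points = second_model(prev_frame)  # key points of the first reference frame
    for frame in video_frames[1:]:
        heatmap = make_heatmap(prev_points)
        points = first_model(frame, prev_frame, heatmap)
        results.append(points)
        prev_frame, prev_points = frame, points  # this frame becomes the next reference
    return results
```

Carrying each frame's output forward as the next reference is one plausible reading of "adjacent" here; the patent also permits recomputing reference key points per pair.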
The method provided by the above embodiment of the disclosure can use the face key point information of the reference video frame of the target video frame as reference data for generating the face key point information of the target video frame, which helps to reduce the jitter of face key points between consecutive video frames and improves the stability of face key point localization.
With further reference to Fig. 5, a process 500 of another embodiment of the method for generating information is shown. The process 500 of the method for generating information includes the following steps:
Step 501: extract a target video frame and the reference video frame of the target video frame from a target face video.
In the present embodiment, the executing body of the method for generating information (for example, the server shown in Fig. 1) may obtain the target face video through a wired or wireless connection, and extract the target video frame and its reference video frame from the target face video. Here, the target face video is a face video on which face key point detection is to be performed. A face video may be a video obtained by shooting a face, and the video frames included in the face video include face images. In practice, face key points may be key points of a face, specifically points that affect the facial contour or the shape of the facial features.
In the present embodiment, target video frame is the video frame of the face key point information to be determined corresponding to it.Face
Key point information can include but is not limited at least one of following: text for characterizing the position of face key point in the video frame
Word, number, symbol, image.The REF video frame of target video frame is for determining the key of face corresponding to target video frame
The video frame of point information.Herein, REF video frame is adjacent with target video frame.
Step 502: input the reference video frame into a pre-trained second identification model to obtain the face key point information corresponding to the reference video frame, and generate, based on the determined face key point information and a preset image, the heat map corresponding to the reference video frame.
In the present embodiment, based on the reference video frame obtained in step 501, the above executing body may input the reference video frame into a pre-trained second identification model to obtain the face key point information corresponding to the reference video frame. Here, the second identification model characterizes the correspondence between a face image and the face key point information corresponding to the face image. As an example, the second identification model may be a correspondence table storing multiple face images and their corresponding face key point information, pre-established by technicians based on statistics of a large number of face images and the face key point information corresponding to them; or it may be a model obtained by training an initial model (for example, a neural network) with a machine learning method based on preset training samples.
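The simplest form of "second identification model" named above, the pre-established correspondence table, can be sketched in a few lines. The builder and its exact-match lookup are illustrative; a trained neural model would generalize to unseen faces instead:

```python
def build_lookup_model(face_images, keypoint_infos):
    """Build a correspondence table from face images (here, hashable ids)
    to their face key point information, pre-established from annotated
    examples, and return it wrapped as a callable 'model'."""
    table = {img: pts for img, pts in zip(face_images, keypoint_infos)}
    def second_model(image):
        return table[image]  # exact-match lookup only
    return second_model

model = build_lookup_model(["img_a", "img_b"], [[(14, 5)], [(13, 6)]])
print(model("img_a"))  # [(14, 5)]
```

This makes concrete why the table variant only covers images seen at construction time, which is the practical motivation for the trained variant.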
In some optional implementations of the present embodiment, the second identification model may be trained as follows. First, a training sample set is obtained, where a training sample includes a sample face image and sample face key point information annotated in advance for the sample face image. Then, using a machine learning method, the sample face image of a training sample in the training sample set is taken as input, the sample face key point information corresponding to the input sample face image is taken as the desired output, and training yields the second identification model.
Specifically, the sample face image of a training sample in the training sample set may be taken as the input of a predetermined initial model (for example, a convolutional neural network), the sample face key point information corresponding to the input sample face image may be taken as the desired output of the initial model, and the initial model may be trained; the final trained result is the second identification model.
It can be appreciated that the reference video frame is a video frame extracted from the target face video, and the target face video is essentially a sequence of face images arranged in chronological order. Therefore, the reference video frame is essentially a face image, and the above executing body can accordingly determine the face key point information corresponding to the reference video frame based on the second identification model.
In the present embodiment, the method of generating the heat map corresponding to the reference video frame based on the determined face key point information and the preset image may be the same as the method in the embodiment corresponding to Fig. 2; details are not repeated here.
Step 503: input the target video frame, the reference video frame and the generated heat map into a pre-trained first identification model to obtain the face key point information corresponding to the target video frame.
In the present embodiment, based on the target video frame and reference video frame obtained in step 501 and the heat map obtained in step 502, the above executing body may input the target video frame, the reference video frame and the generated heat map into the pre-trained first identification model to obtain the face key point information corresponding to the target video frame.
In the present embodiment, the first identification model may be used to characterize the correspondence between, on one hand, the target video frame, the reference video frame and the heat map corresponding to the reference video frame and, on the other hand, the face key point information corresponding to the target video frame.
The above steps 501 and 503 are consistent with steps 201 and 203 of the previous embodiment respectively; the descriptions of steps 201 and 203 above also apply to steps 501 and 503 and are not repeated here.
As can be seen from Fig. 5, compared with the embodiment corresponding to Fig. 2, the process 500 of the method for generating information in the present embodiment highlights the step of determining the face key point information corresponding to the reference video frame using the second identification model. The scheme described in the present embodiment can thus use the second identification model to generate more accurate face key point information corresponding to the reference video frame, which in turn can be used to generate more accurate face key point information corresponding to the target video frame, improving the accuracy of information generation.
With further reference to Fig. 6, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for generating information. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may specifically be applied to various electronic devices.
As shown in Fig. 6, the apparatus 600 for generating information of the present embodiment includes: an extraction unit 601, a determination unit 602 and a generation unit 603. The extraction unit 601 is configured to extract a target video frame and the reference video frame of the target video frame from a target face video, where the reference video frame is adjacent to the target video frame. The determination unit 602 is configured to determine the face key point information corresponding to the reference video frame and, based on the determined face key point information and a preset image, generate the heat map corresponding to the reference video frame, where the preset image has the same shape and size as the reference video frame, the image region of the heat map includes a set of values, and each value in the set characterizes the probability that a face key point is at that value's position. The generation unit 603 is configured to input the target video frame, the reference video frame and the generated heat map into a pre-trained first identification model to obtain the face key point information corresponding to the target video frame.
In the present embodiment, the extraction unit 601 of the apparatus 600 for generating information may obtain the target face video through a wired or wireless connection, and extract the target video frame and its reference video frame from the target face video. Here, the target face video is a face video on which face key point detection is to be performed. A face video may be a video obtained by shooting a face, and the video frames included in the face video include face images. In practice, face key points may be key points of a face, specifically points that affect the facial contour or the shape of the facial features.
In the present embodiment, the target video frame is the video frame whose corresponding face key point information is to be determined. Face key point information characterizes the positions of face key points in the video frame and may include but is not limited to at least one of: text, numbers, symbols, images. The reference video frame of the target video frame is the video frame used to determine the face key point information corresponding to the target video frame. Here, the reference video frame is adjacent to the target video frame.
In the present embodiment, based on the reference video frame obtained by the extraction unit 601, the determination unit 602 determines the face key point information corresponding to the reference video frame and, based on the determined face key point information and a preset image, generates the heat map corresponding to the reference video frame. Here, the preset image may be a pre-set image used to generate the heat map, and may have the same shape and size as the reference video frame. The image region of the heat map includes a set of values, and each value in the set characterizes the probability that a face key point is at that value's position.
In the present embodiment, based on the target video frame and reference video frame obtained by the extraction unit 601 and the heat map obtained by the determination unit 602, the generation unit 603 may input the target video frame, the reference video frame and the generated heat map into the pre-trained first identification model to obtain the face key point information corresponding to the target video frame. Here, the first identification model may be used to characterize the correspondence between, on one hand, the target video frame, the reference video frame and the heat map corresponding to the reference video frame and, on the other hand, the face key point information corresponding to the target video frame.
In some optional implementations of the present embodiment, the determination unit 602 may be further configured to input the reference video frame into a pre-trained second identification model to obtain the face key point information corresponding to the reference video frame.
In some optional implementations of the present embodiment, the second identification model may be trained as follows: obtain a training sample set, where a training sample includes a sample face image and sample face key point information annotated in advance for the sample face image; using a machine learning method, take the sample face image of a training sample in the training sample set as input, take the sample face key point information corresponding to the input sample face image as the desired output, and train to obtain the second identification model.
In some optional implementations of the present embodiment, the first identification model may be trained through the following steps: obtain multiple sample video frame groups, where a sample video frame group includes two adjacent video frames extracted from a sample face video; for each sample video frame group in the multiple sample video frame groups, execute the following steps: determine a sample target video frame and a sample reference video frame from the sample video frame group; determine the face key point information corresponding to the sample reference video frame in the group, and determine the face key point information corresponding to the sample target video frame in the group as sample face key point information; generate a sample heat map based on the face key point information corresponding to the sample reference video frame and the preset image; form a training sample from the sample video frame group, the generated sample heat map and the sample face key point information of the sample target video frame; then, using a machine learning method, take the sample video frame group and sample heat map included in each training sample of the formed training samples as input, take the sample face key point information of the sample target video frame corresponding to the input sample video frame group and sample heat map as the desired output, and train to obtain the first identification model.
In some optional implementations of the present embodiment, determining the face key point information corresponding to the sample target video frame in the sample video frame group as sample face key point information includes: determining the initial face key point information corresponding to the sample target video frame in the group; and, based on weights assigned in advance to the face key point information of the sample reference video frame and to the initial face key point information of the sample target video frame respectively, performing a weighted summation on the determined face key point information of the sample reference video frame and the determined initial face key point information of the sample target video frame, and taking the result as the sample face key point information of the sample target video frame in the sample video frame group.
In some optional implementations of the present embodiment, the determination unit 602 may include: a first generation module (not shown in the figure), configured to generate on the preset image, using a Gaussian function, the set of values corresponding to the face key point information of the reference video frame; and a second generation module (not shown in the figure), configured to generate the heat map corresponding to the reference video frame based on the preset image including the generated set of values.
It can be understood that the units recorded in the apparatus 600 correspond to the respective steps of the method described with reference to Fig. 2. Accordingly, the operations, features and beneficial effects described above for the method also apply to the apparatus 600 and the units included therein; details are not repeated here.
The apparatus 600 provided by the above embodiment of the disclosure can use the face key point information of the reference video frame of the target video frame as reference data for generating the face key point information of the target video frame, which helps to reduce the jitter of face key points between consecutive video frames and improves the stability of face key point localization.
Referring now to Fig. 7, a structural schematic diagram of an electronic device 700 (for example, the server or terminal device shown in Fig. 1) suitable for implementing embodiments of the disclosure is shown. Terminal devices in embodiments of the disclosure may include but are not limited to mobile terminals such as mobile phones, laptops, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (for example, vehicle navigation terminals), and fixed terminals such as digital TVs and desktop computers. The terminal device or server shown in Fig. 7 is only an example and should not impose any limitation on the functions and scope of use of embodiments of the disclosure.
As shown in Fig. 7, the electronic device 700 may include a processing apparatus (for example, a central processing unit, a graphics processor, etc.) 701, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage apparatus 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device 700. The processing apparatus 701, the ROM 702 and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
In general, the following apparatuses may be connected to the I/O interface 705: an input apparatus 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output apparatus 707 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; a storage apparatus 708 including, for example, a magnetic tape, hard disk, etc.; and a communication apparatus 709. The communication apparatus 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. Although Fig. 7 shows an electronic device 700 with various apparatuses, it should be understood that it is not required to implement or include all of the apparatuses shown; more or fewer apparatuses may alternatively be implemented or included. Each box shown in Fig. 7 may represent one apparatus or, as needed, multiple apparatuses.
In particular, according to embodiments of the disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication apparatus 709, installed from the storage apparatus 708, or installed from the ROM 702. When the computer program is executed by the processing apparatus 701, the above-described functions defined in the methods of the embodiments of the disclosure are executed.
It should be noted that the computer-readable medium described in embodiments of the disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of the computer-readable storage medium may include but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In embodiments of the disclosure, the computer-readable storage medium may be any tangible medium containing or storing a program that can be used by, or in connection with, an instruction execution system, apparatus or device. In embodiments of the disclosure, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable signal medium can send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted with any suitable medium, including but not limited to: an electric wire, an optical cable, RF (radio frequency), etc., or any suitable combination of the above.
The above computer-readable medium may be included in the above electronic device, or it may exist alone without being assembled into the electronic device. The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: extract a target video frame and the reference video frame of the target video frame from a target face video, where the reference video frame is adjacent to the target video frame; determine the face key point information corresponding to the reference video frame and, based on the determined face key point information and a preset image, generate the heat map corresponding to the reference video frame, where the preset image has the same shape and size as the reference video frame, the image region of the heat map includes a set of values, and each value in the set characterizes the probability that a face key point is at that value's position; and input the target video frame, the reference video frame and the generated heat map into a pre-trained first identification model to obtain the face key point information corresponding to the target video frame.
The computer program code for executing the operations of embodiments of the disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In situations involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the disclosure. In this regard, each box in a flowchart or block diagram may represent a module, program segment or part of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two successive boxes may in fact be executed substantially in parallel, or sometimes in the opposite order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that executes the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the described embodiments of the disclosure may be implemented by means of software or by means of hardware. The described units may also be provided in a processor; for example, a processor may be described as including an extraction unit, a determination unit and a generation unit. The names of these units do not in some cases constitute limitations on the units themselves; for example, the extraction unit may also be described as "a unit for extracting video frames".
The above description is merely a preferred embodiment of the present disclosure and an explanation of the technical principles applied. Those skilled in the art should appreciate that the scope of the invention involved in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the embodiments of the present disclosure.
Claims (14)
1. A method for generating information, comprising:
extracting a target video frame and a reference video frame of the target video frame from a target face video, wherein the reference video frame is adjacent to the target video frame;
determining face key point information corresponding to the reference video frame, and generating a heatmap corresponding to the reference video frame based on the determined face key point information and a preset image, wherein the preset image is identical in shape and size to the reference video frame, an image region of the heatmap includes a set of values, and each value in the set characterizes the probability that a face key point is located at the position of that value; and
inputting the target video frame, the reference video frame, and the generated heatmap into a pre-trained first recognition model to obtain face key point information corresponding to the target video frame.
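The inference step of claim 1 can be outlined as follows. This is an illustrative sketch only, not part of the claims: the channel-wise stacking of the inputs and the `toy_model` stand-in are assumptions; the disclosure does not specify how the first recognition model consumes its three inputs.

```python
import numpy as np

def predict_target_keypoints(target_frame, ref_frame, ref_heatmap, first_model):
    """Stack the target frame, reference frame, and reference heatmap
    along the channel axis and run the first recognition model.

    target_frame, ref_frame: H x W x 3 arrays; ref_heatmap: H x W x 1.
    first_model: any callable mapping the stacked input to an (N, 2)
    array of face key point coordinates (a stand-in for the trained model).
    """
    stacked = np.concatenate([target_frame, ref_frame, ref_heatmap], axis=-1)
    return first_model(stacked)

# Toy stand-in "model": returns the location of the heatmap maximum.
def toy_model(stacked):
    heat = stacked[..., -1]                      # last channel is the heatmap
    y, x = np.unravel_index(np.argmax(heat), heat.shape)
    return np.array([[x, y]], dtype=float)

h = np.zeros((4, 4, 1)); h[1, 2, 0] = 1.0        # heatmap peak at (x=2, y=1)
frame = np.zeros((4, 4, 3))
pts = predict_target_keypoints(frame, frame, h, toy_model)
print(pts)  # [[2. 1.]]
```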
2. The method according to claim 1, wherein the determining face key point information corresponding to the reference video frame comprises:
inputting the reference video frame into a pre-trained second recognition model to obtain the face key point information corresponding to the reference video frame.
3. The method according to claim 2, wherein the second recognition model is trained as follows:
obtaining a training sample set, wherein a training sample includes a sample face image and sample face key point information pre-annotated for the sample face image; and
using a machine learning method, training the second recognition model by taking the sample face images of the training samples in the training sample set as input and the sample face key point information corresponding to the input sample face images as desired output.
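As a toy illustration of the supervised scheme in claim 3 (input: sample face image; desired output: pre-annotated key points), the sketch below fits a linear model by gradient descent on a mean-squared-error loss. The linear model, image size, learning rate, and random data are all illustrative assumptions; the disclosure does not specify the network used as the second recognition model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training sample set: 8x8 "face images", each annotated
# with one key point (x, y), standing in for claim 3's samples.
images = rng.random((100, 8 * 8))          # flattened sample face images
keypoints = rng.random((100, 2)) * 8       # pre-annotated key points

W = np.zeros((8 * 8, 2))                   # linear "second recognition model"
b = np.zeros(2)
lr = 0.01
for _ in range(500):                       # gradient descent on MSE loss
    pred = images @ W + b                  # model output for all samples
    err = pred - keypoints                 # deviation from desired output
    W -= lr * images.T @ err / len(images)
    b -= lr * err.mean(axis=0)

mse = float(((images @ W + b - keypoints) ** 2).mean())
print("final MSE:", round(mse, 3))
```

A real implementation would replace the linear map with a convolutional network, but the input/desired-output pairing mirrors the claim.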
4. The method according to claim 1, wherein the first recognition model is trained by the following steps:
obtaining a plurality of sample video frame groups, wherein a sample video frame group includes two adjacent video frames extracted from a sample face video;
for a sample video frame group in the plurality of sample video frame groups, performing the following steps: determining a sample target video frame and a sample reference video frame from the sample video frame group; determining face key point information corresponding to the sample reference video frame in the sample video frame group, and determining face key point information corresponding to the sample target video frame in the sample video frame group as sample face key point information; generating a sample heatmap based on the face key point information corresponding to the sample reference video frame and a preset image; and composing a training sample from the sample video frame group, the generated sample heatmap, and the sample face key point information of the sample target video frame; and
using a machine learning method, training the first recognition model by taking the sample video frame groups and sample heatmaps included in the composed training samples as input and the sample face key point information of the sample target video frames corresponding to the input sample video frame groups and sample heatmaps as desired output.
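The per-group sample-construction loop of claim 4 can be sketched as follows. The data structures and the `detect_keypoints`/`make_heatmap` callables are illustrative assumptions standing in for the second recognition model and the heatmap generator:

```python
def build_training_samples(frame_groups, detect_keypoints, make_heatmap):
    """Compose training samples as in claim 4. Each group is a pair of
    adjacent frames; detect_keypoints and make_heatmap are stand-ins
    for the second recognition model and the heatmap generator.
    """
    samples = []
    for ref_frame, target_frame in frame_groups:     # adjacent pair per group
        ref_kps = detect_keypoints(ref_frame)        # reference key points
        target_kps = detect_keypoints(target_frame)  # sample face key points
        heatmap = make_heatmap(ref_kps)              # sample heatmap
        # (input, desired output) pair for training the first model
        samples.append(((target_frame, ref_frame, heatmap), target_kps))
    return samples

# Toy stand-ins to exercise the loop.
groups = [("f0", "f1"), ("f1", "f2")]
samples = build_training_samples(
    groups,
    detect_keypoints=lambda f: [(1, 1)],
    make_heatmap=lambda kps: "heatmap",
)
print(len(samples))  # 2
```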
5. The method according to claim 4, wherein the determining face key point information corresponding to the sample target video frame in the sample video frame group as sample face key point information comprises:
determining initial face key point information corresponding to the sample target video frame in the sample video frame group; and
based on weights pre-assigned respectively to the face key point information of the sample reference video frame and the initial face key point information of the sample target video frame, performing weighted summation on the determined face key point information of the sample reference video frame and the determined initial face key point information of the sample target video frame, and taking the result as the sample face key point information of the sample target video frame in the sample video frame group.
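The weighted summation of claim 5 amounts to temporal smoothing of the key point coordinates between adjacent frames. A minimal sketch follows; the 0.4/0.6 weights are illustrative assumptions, not values from the disclosure:

```python
import numpy as np

def smooth_keypoints(ref_kps, initial_target_kps, ref_weight=0.4):
    """Weighted sum of the reference frame's key points and the target
    frame's initial key points, as in claim 5. ref_weight and its
    complement play the role of the pre-assigned weights.
    """
    ref_kps = np.asarray(ref_kps, dtype=float)
    initial_target_kps = np.asarray(initial_target_kps, dtype=float)
    return ref_weight * ref_kps + (1.0 - ref_weight) * initial_target_kps

ref = [[10.0, 20.0]]               # key point in the reference frame
tgt = [[12.0, 22.0]]               # initial key point in the target frame
print(smooth_keypoints(ref, tgt))  # [[11.2 21.2]]
```

Pulling each estimate part of the way toward the adjacent frame is what damps frame-to-frame jitter, the stated goal of the disclosure.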
6. The method according to any one of claims 1-5, wherein the generating a heatmap corresponding to the reference video frame based on the determined face key point information and the preset image comprises:
generating, on the preset image using a Gaussian function, a set of values corresponding to the face key point information of the reference video frame; and
generating the heatmap corresponding to the reference video frame based on the preset image including the generated set of values.
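Claim 6's Gaussian-generated value set corresponds to the common practice of rendering each key point as a 2-D Gaussian on a blank image the size of the frame, so that each pixel value approximates the probability of a key point at that position. A sketch under that assumption (the standard deviation `sigma` is an illustrative choice):

```python
import numpy as np

def gaussian_heatmap(height, width, keypoints, sigma=2.0):
    """Render each (x, y) key point as a 2-D Gaussian on a preset image
    of the same shape and size as the reference frame (claims 1 and 6)."""
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width))
    for x, y in keypoints:
        g = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        heat = np.maximum(heat, g)   # keep the strongest response per pixel
    return heat

hm = gaussian_heatmap(16, 16, [(4, 5), (10, 12)])
print(hm.shape, float(hm[5, 4]))  # (16, 16) 1.0
```

Note the max rather than a sum when key points overlap, so values stay in [0, 1] and remain interpretable as probabilities.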
7. An apparatus for generating information, comprising:
an extraction unit configured to extract a target video frame and a reference video frame of the target video frame from a target face video, wherein the reference video frame is adjacent to the target video frame;
a determination unit configured to determine face key point information corresponding to the reference video frame, and to generate a heatmap corresponding to the reference video frame based on the determined face key point information and a preset image, wherein the preset image is identical in shape and size to the reference video frame, an image region of the heatmap includes a set of values, and each value in the set characterizes the probability that a face key point is located at the position of that value; and
a generation unit configured to input the target video frame, the reference video frame, and the generated heatmap into a pre-trained first recognition model to obtain face key point information corresponding to the target video frame.
8. The apparatus according to claim 7, wherein the determination unit is further configured to:
input the reference video frame into a pre-trained second recognition model to obtain the face key point information corresponding to the reference video frame.
9. The apparatus according to claim 8, wherein the second recognition model is trained as follows:
obtaining a training sample set, wherein a training sample includes a sample face image and sample face key point information pre-annotated for the sample face image; and
using a machine learning method, training the second recognition model by taking the sample face images of the training samples in the training sample set as input and the sample face key point information corresponding to the input sample face images as desired output.
10. The apparatus according to claim 7, wherein the first recognition model is trained by the following steps:
obtaining a plurality of sample video frame groups, wherein a sample video frame group includes two adjacent video frames extracted from a sample face video;
for a sample video frame group in the plurality of sample video frame groups, performing the following steps: determining a sample target video frame and a sample reference video frame from the sample video frame group; determining face key point information corresponding to the sample reference video frame in the sample video frame group, and determining face key point information corresponding to the sample target video frame in the sample video frame group as sample face key point information; generating a sample heatmap based on the face key point information corresponding to the sample reference video frame and a preset image; and composing a training sample from the sample video frame group, the generated sample heatmap, and the sample face key point information of the sample target video frame; and
using a machine learning method, training the first recognition model by taking the sample video frame groups and sample heatmaps included in the composed training samples as input and the sample face key point information of the sample target video frames corresponding to the input sample video frame groups and sample heatmaps as desired output.
11. The apparatus according to claim 10, wherein the determining face key point information corresponding to the sample target video frame in the sample video frame group as sample face key point information comprises:
determining initial face key point information corresponding to the sample target video frame in the sample video frame group; and
based on weights pre-assigned respectively to the face key point information of the sample reference video frame and the initial face key point information of the sample target video frame, performing weighted summation on the determined face key point information of the sample reference video frame and the determined initial face key point information of the sample target video frame, and taking the result as the sample face key point information of the sample target video frame in the sample video frame group.
12. The apparatus according to any one of claims 7-11, wherein the determination unit comprises:
a first generation module configured to generate, on the preset image using a Gaussian function, a set of values corresponding to the face key point information of the reference video frame; and
a second generation module configured to generate the heatmap corresponding to the reference video frame based on the preset image including the generated set of values.
13. An electronic device, comprising:
one or more processors; and
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-6.
14. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910099415.1A CN109829432B (en) | 2019-01-31 | 2019-01-31 | Method and apparatus for generating information |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910099415.1A CN109829432B (en) | 2019-01-31 | 2019-01-31 | Method and apparatus for generating information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109829432A true CN109829432A (en) | 2019-05-31 |
CN109829432B CN109829432B (en) | 2020-11-20 |
Family
ID=66863303
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910099415.1A Active CN109829432B (en) | 2019-01-31 | 2019-01-31 | Method and apparatus for generating information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109829432B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532891A (en) * | 2019-08-05 | 2019-12-03 | 北京地平线机器人技术研发有限公司 | Target object state identification method, device, medium and equipment |
CN111027495A (en) * | 2019-12-12 | 2020-04-17 | 京东数字科技控股有限公司 | Method and device for detecting key points of human body |
CN111028144A (en) * | 2019-12-09 | 2020-04-17 | 腾讯音乐娱乐科技(深圳)有限公司 | Video face changing method and device and storage medium |
CN111797665A (en) * | 2019-08-21 | 2020-10-20 | 北京沃东天骏信息技术有限公司 | Method and apparatus for converting video |
CN112101109A (en) * | 2020-08-11 | 2020-12-18 | 深圳数联天下智能科技有限公司 | Face key point detection model training method and device, electronic equipment and medium |
CN112381926A (en) * | 2020-11-13 | 2021-02-19 | 北京有竹居网络技术有限公司 | Method and apparatus for generating video |
CN113128436A (en) * | 2021-04-27 | 2021-07-16 | 北京百度网讯科技有限公司 | Method and device for detecting key points |
CN113177603A (en) * | 2021-05-12 | 2021-07-27 | 中移智行网络科技有限公司 | Training method of classification model, video classification method and related equipment |
CN117437505A (en) * | 2023-12-18 | 2024-01-23 | 杭州任性智能科技有限公司 | Training data set generation method and system based on video |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102177726A (en) * | 2008-08-21 | 2011-09-07 | 杜比实验室特许公司 | Feature optimization and reliability estimation for audio and video signature generation and detection |
US20130195341A1 (en) * | 2012-01-31 | 2013-08-01 | Ge Medical Systems Global Technology Company | Method for sorting ct image slices and method for constructing 3d ct image |
CN107590482A (en) * | 2017-09-29 | 2018-01-16 | 百度在线网络技术(北京)有限公司 | information generating method and device |
US20180101733A1 (en) * | 2012-06-20 | 2018-04-12 | Kuntal Sengupta | Active presence detection with depth sensing |
CN108205655A (en) * | 2017-11-07 | 2018-06-26 | 北京市商汤科技开发有限公司 | A kind of key point Forecasting Methodology, device, electronic equipment and storage medium |
CN108304765A (en) * | 2017-12-11 | 2018-07-20 | 中国科学院自动化研究所 | Multitask detection device for face key point location and semantic segmentation |
CN108960064A (en) * | 2018-06-01 | 2018-12-07 | 重庆锐纳达自动化技术有限公司 | A kind of Face datection and recognition methods based on convolutional neural networks |
US20180353072A1 (en) * | 2017-06-08 | 2018-12-13 | Fdna Inc. | Systems, methods, and computer-readable media for gene and genetic variant prioritization |
CN109034085A (en) * | 2018-08-03 | 2018-12-18 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
CN109101919A (en) * | 2018-08-03 | 2018-12-28 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
- 2019-01-31 CN CN201910099415.1A patent/CN109829432B/en active Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102177726A (en) * | 2008-08-21 | 2011-09-07 | 杜比实验室特许公司 | Feature optimization and reliability estimation for audio and video signature generation and detection |
US20130195341A1 (en) * | 2012-01-31 | 2013-08-01 | Ge Medical Systems Global Technology Company | Method for sorting ct image slices and method for constructing 3d ct image |
US20180101733A1 (en) * | 2012-06-20 | 2018-04-12 | Kuntal Sengupta | Active presence detection with depth sensing |
US20180353072A1 (en) * | 2017-06-08 | 2018-12-13 | Fdna Inc. | Systems, methods, and computer-readable media for gene and genetic variant prioritization |
CN107590482A (en) * | 2017-09-29 | 2018-01-16 | 百度在线网络技术(北京)有限公司 | information generating method and device |
CN108205655A (en) * | 2017-11-07 | 2018-06-26 | 北京市商汤科技开发有限公司 | A kind of key point Forecasting Methodology, device, electronic equipment and storage medium |
CN108304765A (en) * | 2017-12-11 | 2018-07-20 | 中国科学院自动化研究所 | Multitask detection device for face key point location and semantic segmentation |
CN108960064A (en) * | 2018-06-01 | 2018-12-07 | 重庆锐纳达自动化技术有限公司 | A kind of Face datection and recognition methods based on convolutional neural networks |
CN109034085A (en) * | 2018-08-03 | 2018-12-18 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
CN109101919A (en) * | 2018-08-03 | 2018-12-28 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating information |
Non-Patent Citations (2)
Title |
---|
ZHIWU HUANG: "A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database", IEEE Transactions on Image Processing *
LIU Chunping (刘春平): "Real-Time Dynamic Expression Transfer Based on Face Key Points" (基于人脸关键点的表情实时动态迁移), Modern Computer (《现代计算机》) *
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532891A (en) * | 2019-08-05 | 2019-12-03 | 北京地平线机器人技术研发有限公司 | Target object state identification method, device, medium and equipment |
CN110532891B (en) * | 2019-08-05 | 2022-04-05 | 北京地平线机器人技术研发有限公司 | Target object state identification method, device, medium and equipment |
CN111797665A (en) * | 2019-08-21 | 2020-10-20 | 北京沃东天骏信息技术有限公司 | Method and apparatus for converting video |
CN111797665B (en) * | 2019-08-21 | 2023-12-08 | 北京沃东天骏信息技术有限公司 | Method and apparatus for converting video |
CN111028144A (en) * | 2019-12-09 | 2020-04-17 | 腾讯音乐娱乐科技(深圳)有限公司 | Video face changing method and device and storage medium |
CN111028144B (en) * | 2019-12-09 | 2023-06-20 | 腾讯音乐娱乐科技(深圳)有限公司 | Video face changing method and device and storage medium |
CN111027495A (en) * | 2019-12-12 | 2020-04-17 | 京东数字科技控股有限公司 | Method and device for detecting key points of human body |
CN112101109A (en) * | 2020-08-11 | 2020-12-18 | 深圳数联天下智能科技有限公司 | Face key point detection model training method and device, electronic equipment and medium |
CN112101109B (en) * | 2020-08-11 | 2024-04-30 | 深圳数联天下智能科技有限公司 | Training method and device for face key point detection model, electronic equipment and medium |
CN112381926A (en) * | 2020-11-13 | 2021-02-19 | 北京有竹居网络技术有限公司 | Method and apparatus for generating video |
CN113128436A (en) * | 2021-04-27 | 2021-07-16 | 北京百度网讯科技有限公司 | Method and device for detecting key points |
CN113128436B (en) * | 2021-04-27 | 2022-04-01 | 北京百度网讯科技有限公司 | Method and device for detecting key points |
CN113177603A (en) * | 2021-05-12 | 2021-07-27 | 中移智行网络科技有限公司 | Training method of classification model, video classification method and related equipment |
CN117437505A (en) * | 2023-12-18 | 2024-01-23 | 杭州任性智能科技有限公司 | Training data set generation method and system based on video |
Also Published As
Publication number | Publication date |
---|---|
CN109829432B (en) | 2020-11-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109829432A (en) | Method and apparatus for generating information | |
CN109858445A (en) | Method and apparatus for generating model | |
CN108805091B (en) | Method and apparatus for generating a model | |
CN111476871B (en) | Method and device for generating video | |
CN109993150B (en) | Method and device for identifying age | |
CN109800732A (en) | The method and apparatus for generating model for generating caricature head portrait | |
CN108830235A (en) | Method and apparatus for generating information | |
CN110288049A (en) | Method and apparatus for generating image recognition model | |
CN109086719A (en) | Method and apparatus for output data | |
CN109981787B (en) | Method and device for displaying information | |
CN110009059B (en) | Method and apparatus for generating a model | |
CN109815365A (en) | Method and apparatus for handling video | |
CN109977839A (en) | Information processing method and device | |
CN109948700A (en) | Method and apparatus for generating characteristic pattern | |
CN110084317A (en) | The method and apparatus of image for identification | |
CN110059624A (en) | Method and apparatus for detecting living body | |
CN109754464A (en) | Method and apparatus for generating information | |
CN110110666A (en) | Object detection method and device | |
CN110427915A (en) | Method and apparatus for output information | |
CN110111241A (en) | Method and apparatus for generating dynamic image | |
CN110097004B (en) | Facial expression recognition method and device | |
CN110008926B (en) | Method and device for identifying age | |
CN109117758A (en) | Method and apparatus for generating information | |
CN109829431A (en) | Method and apparatus for generating information | |
CN110046571B (en) | Method and device for identifying age |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |