CN108985178A - Method and apparatus for generating information
- Publication number
- CN108985178A (Chinese patent application CN201810643456.8A)
- Authority
- CN
- China
- Prior art keywords
- video
- result
- network
- face video
- recognition result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Abstract
Embodiments of the present application disclose a method and apparatus for generating information. One specific embodiment of the method includes: obtaining a target face video; and inputting the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, where the video identification model characterizes the correspondence between face videos and recognition results, and the recognition result indicates whether the face video is a video synthesized from face images. This embodiment realizes the identification of synthesized videos, enriches the ways in which information can be generated, and at the same time can improve the security of information exchange.
Description
Technical field
Embodiments of the present application relate to the field of computer technology, and more particularly to a method and apparatus for generating information.
Background
Currently, sharing information by shooting videos has become an important mode of information sharing in everyday life. In practice, when a user cannot obtain a face video of a target person by filming the person's face directly, the user may use image synthesis technology to synthesize a face video of the target person from face images of that person.

It is understood that videos synthesized from images often cause harm such as infringement and unfairness. A platform for information sharing can therefore identify such videos and then intercept them.
Summary of the invention
The embodiments of the present application propose a method and apparatus for generating information.
In a first aspect, an embodiment of the present application provides a method for generating information, the method comprising: obtaining a target face video; and inputting the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, where the video identification model characterizes the correspondence between face videos and recognition results, and the recognition result indicates whether the face video is a video synthesized from face images.
In some embodiments, the video identification model includes an image extraction network, a feature extraction network, and a result generation network; and inputting the target face video into the pre-trained video identification model to obtain the recognition result corresponding to the target face video comprises: inputting the target face video into the image extraction network to obtain at least two target face images; inputting each of the obtained target face images into the feature extraction network to obtain at least two groups of image features; and inputting the obtained groups of image features into the result generation network to obtain the recognition result corresponding to the target face video.
In some embodiments, the result generation network includes a first result generation network and a second result generation network; and inputting the obtained groups of image features into the result generation network to obtain the recognition result corresponding to the target face video comprises: inputting each group of image features into the first result generation network to obtain at least two initial recognition results; and inputting the obtained initial recognition results into the second result generation network to obtain the recognition result corresponding to the target face video.
In some embodiments, the video identification model is trained as follows: a training sample set is obtained, where a training sample includes a sample face video and a sample recognition result labelled in advance for that video, the sample recognition result indicating whether the sample face video is a video synthesized from sample face images; then, taking the sample face videos of the training samples as input and the sample recognition results corresponding to the input sample face videos as the desired output, the video identification model is obtained by training with a machine learning method.
In some embodiments, after the recognition result is obtained, the method further includes: in response to determining that the obtained recognition result indicates that the target face video is a video synthesized from face images, generating warning information.
In a second aspect, an embodiment of the present application provides an apparatus for generating information, the apparatus comprising: an obtaining unit configured to obtain a target face video; and an input unit configured to input the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, where the video identification model characterizes the correspondence between face videos and recognition results, and the recognition result indicates whether the face video is a video synthesized from face images.
In some embodiments, the video identification model includes an image extraction network, a feature extraction network, and a result generation network; and the input unit includes: a first input module configured to input the target face video into the image extraction network to obtain at least two target face images; a second input module configured to input each of the obtained target face images into the feature extraction network to obtain at least two groups of image features; and a third input module configured to input the obtained groups of image features into the result generation network to obtain the recognition result corresponding to the target face video.
In some embodiments, the result generation network includes a first result generation network and a second result generation network; and the third input module is further configured to: input each group of image features into the first result generation network to obtain at least two initial recognition results; and input the obtained initial recognition results into the second result generation network to obtain the recognition result corresponding to the target face video.
In some embodiments, the video identification model is trained as follows: a training sample set is obtained, where a training sample includes a sample face video and a sample recognition result labelled in advance for that video, the sample recognition result indicating whether the sample face video is a video synthesized from sample face images; then, taking the sample face videos of the training samples as input and the sample recognition results corresponding to the input sample face videos as the desired output, the video identification model is obtained by training with a machine learning method.
In some embodiments, the apparatus further includes: a generation unit configured to generate warning information in response to determining that the obtained recognition result indicates that the target face video is a video synthesized from face images.
In a third aspect, an embodiment of the present application provides an electronic device, comprising: one or more processors; and a storage apparatus storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the method for generating information described above.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium storing a computer program which, when executed by a processor, implements the method of any embodiment of the method for generating information described above.
The method and apparatus for generating information provided by the embodiments of the present application obtain a target face video and input it into a pre-trained video identification model to obtain the corresponding recognition result, where the video identification model characterizes the correspondence between face videos and recognition results and the recognition result indicates whether the face video is a video synthesized from face images. Synthesized videos are thus identified by means of the pre-trained video identification model, which enriches the ways in which information can be generated and, at the same time, can improve the security of information exchange.
Description of the drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which an embodiment of the present application may be applied;

Fig. 2 is a flowchart of an embodiment of the method for generating information according to the present application;

Fig. 3 is a schematic diagram of an application scenario of the method for generating information according to the present application;

Fig. 4 is a flowchart of another embodiment of the method for generating information according to the present application;

Fig. 5 is a structural schematic diagram of an embodiment of the apparatus for generating information according to the present application;

Fig. 6 is a structural schematic diagram of a computer system adapted to implement an electronic device of the embodiments of the present application.
Detailed description of the embodiments
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, only the parts relevant to the invention are shown in the drawings.

It should be noted that, in the absence of conflict, the embodiments of the present application and the features of those embodiments may be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 in which the method for generating information or the apparatus for generating information of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 in order to receive or send messages. Various client applications may be installed on the terminal devices 101, 102, 103, such as model training applications, video identification applications, web browser applications, and social platform software.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple software programs or software modules (for example, to provide distributed services) or as a single software program or software module. No specific limitation is made here.
When the terminals 101, 102, 103 are hardware, a video capture device may also be mounted on them. The video capture device may be any device capable of capturing video, such as a camera or a sensor. The user may use the video capture device on the terminals 101, 102, 103 to capture video.
The server 105 may be a server providing various services, for example a background server that processes face videos displayed on the terminal devices 101, 102, 103. The background server may analyze and otherwise process data such as a received target face video, and may feed the processing result (for example, the recognition result) back to the terminal devices.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple software programs or software modules (for example, to provide distributed services) or as a single software program or software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative; any number of terminal devices, networks, and servers may be provided as required. In particular, when neither the target face video nor the data used in generating the recognition result needs to be obtained remotely, the system architecture may include no network and only a terminal device or a server.
With continued reference to Fig. 2, a flow 200 of an embodiment of the method for generating information according to the present application is shown. The method for generating information comprises the following steps:
Step 201: obtain a target face video.
In the present embodiment, the executing body of the method for generating information (for example, the server shown in Fig. 1) may obtain the target face video through a wired or wireless connection. The target face video may be a face video that is to be identified, and a face video may be a video that includes face images.

It should be noted that the executing body may obtain a target face video sent by an electronic device in communication with it (for example, a terminal device shown in Fig. 1), or it may obtain a target face video stored locally in advance.
Step 202: input the target face video into a pre-trained video identification model to obtain the recognition result corresponding to the target face video.
In the present embodiment, based on the target face video obtained in step 201, the executing body may input the target face video into a pre-trained video identification model to obtain the recognition result corresponding to the target face video. The video identification model characterizes the correspondence between face videos and recognition results. The recognition result may indicate whether the face video is a video synthesized from face images, and may include, but is not limited to, at least one of the following: text, numbers, symbols, images, video.
In some optional implementations of the present embodiment, the video identification model may be obtained through the following training steps. First, a training sample set is obtained, where a training sample may include a sample face video and a sample recognition result labelled in advance for that video; the sample recognition result may indicate whether the sample face video is a video synthesized from sample face images. Then, taking the sample face videos of the training samples as input and the sample recognition results corresponding to the input sample face videos as the desired output, the video identification model is obtained by training with a machine learning method.
Specifically, taking the selection of one training sample as an example, the following steps may be performed: input the sample face video of the selected training sample into an initial model (for example, a convolutional neural network (CNN) or a residual network (ResNet)) to obtain a recognition result; taking the sample recognition result corresponding to the input sample face video as the desired output of the initial model, adjust the parameters of the initial model based on the obtained recognition result and the sample recognition result; determine whether any unselected training samples remain in the training sample set; and, in response to no unselected training sample remaining, determine the adjusted initial model to be the video identification model. It should be noted that the manner of selecting training samples is not limited in the present application; for example, samples may be selected at random, or training samples whose sample videos have better clarity may be selected first.
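The per-sample training loop above (draw an unselected sample, compare the model output with the label, adjust the parameters, stop when no sample remains) can be sketched with a toy single-parameter model standing in for the CNN/ResNet initial model; the threshold update rule is an illustrative assumption, not the patent's method.

```python
import random

def train_video_identification_model(training_samples, init_param=0.5, lr=0.1, seed=0):
    """Training-step sketch: repeatedly pick an unselected training sample
    (here at random, one of the selection manners mentioned above), compare
    the model's output with the labelled sample recognition result, and
    adjust the model's parameters on a wrong prediction."""
    rng = random.Random(seed)
    remaining = list(training_samples)
    threshold = init_param                # the toy model's only parameter
    while remaining:                      # stop when no unselected sample remains
        score, is_synth = remaining.pop(rng.randrange(len(remaining)))
        predicted = score > threshold     # toy "is synthesized" decision
        if predicted != is_synth:         # adjust parameters on disagreement
            threshold += lr if predicted else -lr
    return threshold                      # adjusted model = video identification model

# Each sample: (score for the sample face video, labelled "is synthesized").
samples = [(0.9, True), (0.2, False), (0.8, True), (0.1, False)]
print(train_video_identification_model(samples))  # all correct -> stays 0.5
```

A real implementation would backpropagate a loss through the network's weights; the structure of the loop (sample selection, desired-output comparison, parameter adjustment, termination check) is the part being illustrated.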
Optionally, the video identification model may also be obtained through the following training steps.

First, a training sample set is obtained and divided into a preset number of training sample groups. Here, the division may be performed in various ways: for example, the training sample set may be divided into the preset number of groups of equal size, or it may be divided such that the number of training samples in each of the groups is greater than or equal to a preset threshold. It should be noted that the preset number may be set in advance by a technician.
Then, a training sample group may be selected from the preset number of training sample groups as a candidate training sample group, and, based on the candidate training sample group and an initial model, the following model generation step is executed: for the candidate training sample group, take the sample face videos of its training samples as input and the sample recognition results corresponding to the input sample face videos as the desired output, train the initial model using a machine learning method, and obtain an initial video identification model; determine whether any unselected training sample group remains among the preset number of training sample groups; and, in response to determining that no unselected training sample group remains, generate the video identification model based on the obtained initial video identification models.

Here, one of the obtained initial video identification models may be chosen as the video identification model, or the obtained initial video identification models may be processed (fused) to obtain the video identification model. It should be noted that the manner of selecting the training sample group is not limited in the present application; for example, a group may be selected at random, or the group containing more training samples may be selected first.

Furthermore, in response to determining that unselected training sample groups remain, a training sample group may be selected from the unselected groups as a new candidate training sample group, the most recently obtained initial video identification model may be taken as the new initial model, and the model generation step above may be executed again.
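The group-based variant above can be sketched as follows: split the training set into groups, then run the model generation step group by group, feeding the last initial video identification model back in as the new initial model. The near-equal split and the chaining via a `train_step` callable are illustrative choices.

```python
def split_into_groups(samples, n_groups):
    """Divide the training sample set into n_groups groups of near-equal
    size (one of the division manners described above)."""
    k, m = divmod(len(samples), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        size = k + (1 if i < m else 0)
        groups.append(samples[start:start + size])
        start += size
    return groups

def train_in_groups(samples, n_groups, train_step, initial_model):
    """Model generation step: train on one candidate group at a time; the
    initial video identification model obtained from each group becomes the
    new initial model, until no unselected group remains."""
    model = initial_model
    for group in split_into_groups(samples, n_groups):
        model = train_step(model, group)
    return model  # the last initial video identification model

print([len(g) for g in split_into_groups(list(range(7)), 3)])  # -> [3, 2, 2]
```

With a toy `train_step` that just accumulates sample counts, `train_in_groups(list(range(7)), 3, lambda m, g: m + len(g), 0)` returns 7, confirming every group is visited exactly once.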
It should be noted that the executing body of the training steps for generating the video identification model may be the same as or different from the executing body of the method for generating information. If they are the same, the executing body of the training steps may store the trained video identification model locally after training. If they are different, the executing body of the training steps may send the trained video identification model to the executing body of the method for generating information after training.
In some optional implementations of the present embodiment, after obtaining the recognition result, the executing body may also generate warning information in response to determining that the obtained recognition result indicates that the target face video is a video synthesized from face images. The warning information may include, but is not limited to, at least one of the following: text, numbers, symbols, pictures, video. Specifically, the warning information may be preset information for alarms, or it may be information for alarms generated based on the attributes of the target face video. The attributes of the target face video may include, but are not limited to, at least one of the following: video name, video duration, video source, and video release time. As an example, the warning information may be: "Please note the video '***' by 'User A'".
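Generating alarm information from the video's attributes might look like the sketch below. The attribute field names (`video_name`, `source`) are hypothetical; the application only lists name, duration, source, and release time as possible attributes.

```python
def build_warning(video_attributes):
    """Compose warning information from the target face video's attributes
    once the recognition result indicates a synthesized video."""
    name = video_attributes.get("video_name", "unknown")
    source = video_attributes.get("source", "unknown")
    return (f"Please note: the video '{name}' from '{source}' "
            f"appears to be synthesized from face images.")

msg = build_warning({"video_name": "***", "source": "User A"})
print(msg)
```

A preset fixed alarm string (the other option mentioned above) would simply skip the attribute lookup.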
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for generating information according to the present embodiment. In the application scenario of Fig. 3, the terminal 301 first sends a target face video (the face video of User A) 302 to the server 303. The server 303 obtains the target face video 302 and then inputs it into a pre-trained video identification model 304 to obtain the recognition result ("yes") 305 corresponding to the target face video 302. The video identification model may be used to characterize the correspondence between face videos and recognition results. The recognition result "yes" may indicate that the face video is a video synthesized from face images; likewise, "no" may be used as the recognition result indicating that the face video is not a video synthesized from face images.
The method provided by the above embodiment of the present application obtains a target face video and inputs it into a pre-trained video identification model to obtain the corresponding recognition result, where the video identification model characterizes the correspondence between face videos and recognition results and the recognition result indicates whether the face video is a video synthesized from face images. Synthesized videos are thus identified using the pre-trained video identification model, which enriches the ways in which information can be generated and, at the same time, can improve the security of information exchange.
With further reference to Fig. 4, a flow 400 of another embodiment of the method for generating information is illustrated. The flow 400 of the method for generating information comprises the following steps:
Step 401: obtain a target face video.
In the present embodiment, the executing body of the method for generating information (for example, the server shown in Fig. 1) may obtain the target face video through a wired or wireless connection.

It should be noted that step 401 may be implemented in a manner similar to step 201 in the previous embodiment. Accordingly, the description of step 201 above also applies to step 401 of the present embodiment and is not repeated here.
Step 402: input the target face video into the image extraction network of a pre-trained video identification model to obtain at least two target face images.

In the present embodiment, the video identification model may include an image extraction network; based on the target face video obtained in step 401, the executing body may input the target face video into the image extraction network of the video identification model to obtain at least two target face images.

It is understood that a target face video is essentially a sequence of target face images arranged in chronological order. The image extraction network may extract at least two target face images from the target face image sequence in a predetermined manner and output them. The predetermined manner may be any manner set in advance by a technician; as an example, it may be to extract one image at an interval of a preset number (for example, 3) of images, or to extract one image at an interval of a preset duration (for example, 1 s).
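The two example extraction manners (one image per fixed number of frames, or one image per fixed duration) can be sketched directly on the frame sequence. The stride of 4 is one reading of "an interval of 3 images" (keep one, skip three); the exact spacing is an assumption.

```python
def sample_every_k_frames(frames, k=4):
    """One target face image every k frames (k=4 corresponds to keeping
    one image and skipping the next 3)."""
    return frames[::k]

def sample_every_duration(timestamped_frames, interval_s=1.0):
    """One target face image per interval_s seconds of video;
    timestamped_frames is a list of (timestamp, frame) pairs."""
    picked, next_t = [], 0.0
    for t, frame in timestamped_frames:
        if t >= next_t:
            picked.append(frame)
            next_t = t + interval_s
    return picked

print(sample_every_k_frames(list(range(12))))                 # -> [0, 4, 8]
print(sample_every_duration([(i * 0.4, i) for i in range(6)]))  # -> [0, 3]
```

Either way, the sampler returns a small ordered subset of the frame sequence, which is all the downstream feature extraction network needs.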
Step 403: input each of the obtained target face images into the feature extraction network of the video identification model to obtain at least two groups of image features.

In the present embodiment, the video identification model further includes a feature extraction network; the executing body may input each of the at least two target face images obtained in step 402 into the feature extraction network of the video identification model to obtain at least two groups of image features. The feature extraction network may be connected to the image extraction network described above, and is used to extract and output the image features of the target face images output by the image extraction network. It should be noted that the at least two groups of image features may be at least two feature maps characterizing image features.

Here, the feature extraction network may include structures for extracting image features (for example, convolutional layers), and of course may also include other structures (for example, pooling layers); no restriction is made here.
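The convolutional and pooling structures mentioned above can be illustrated with a minimal pure-Python version (the cross-correlation form commonly called convolution in deep learning); the 2x2 kernel is a toy assumption, not a trained filter.

```python
def conv2d_valid(image, kernel):
    """Minimal 'valid' 2-D convolution over a list-of-lists image - the
    kind of structure a feature extraction network is built from."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

def max_pool2(fmap):
    """2x2 max pooling, the optional pooling structure mentioned above."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

img = [[1, 2, 3, 0], [0, 1, 2, 3], [3, 0, 1, 2], [2, 3, 0, 1]]
edge = [[1, -1], [-1, 1]]   # toy kernel (an assumption, not from the patent)
features = max_pool2(conv2d_valid(img, edge))
print(features)  # -> [[0]]
```

A real feature extraction network stacks many such layers with learned kernels; the shape transformations (image in, smaller feature map out) are the same.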
Step 404: input the obtained groups of image features into the result generation network of the video identification model to obtain the recognition result corresponding to the target face video.

In the present embodiment, the video identification model may further include a result generation network; the executing body may input the obtained at least two groups of image features into the result generation network of the video identification model to obtain the recognition result corresponding to the target face video. The recognition result may indicate whether the face video is a video synthesized from face images, and may include, but is not limited to, at least one of the following: text, numbers, symbols, images, video. The result generation network may be connected to the feature extraction network, and may generate the recognition result corresponding to the target face video based on the at least two groups of image features output by the feature extraction network.

Here, the result generation network may include structures for generating results (for example, classifiers or fully connected layers), and of course may also include other structures (for example, a feature fusion network); no restriction is made here.
In one specific implementation of the present embodiment, the result generation network may include a first result generation network and a second result generation network, and the executing body may obtain the recognition result corresponding to the target face video as follows.

First, the executing body may input each of the at least two groups of image features into the first result generation network to obtain at least two initial recognition results. The first result generation network may be connected to the feature extraction network, and may be used to generate and output one initial recognition result based on one group of image features. The initial recognition results may be used to generate the recognition result corresponding to the target face video, and each may indicate whether the face video is a video synthesized from face images.
Second, the executing body may input the obtained initial recognition results into the second result generation network to obtain the recognition result corresponding to the target face video. The second result generation network may be connected to the first result generation network, and may generate the recognition result corresponding to the target face video based on the initial recognition results output by the first result generation network. Here, for the second result generation network, a technician may preset the correspondence between the input initial recognition results and the output recognition result. For example, when every input initial recognition result indicates that the face video is a video synthesized from face images, the output recognition result indicates that the face video is a video synthesized from face images; alternatively, when at least one of the input initial recognition results indicates that the face video is a video synthesized from face images, the output recognition result indicates that the face video is a video synthesized from face images.
As an example, suppose step 403 above yields two groups of image features. For the first group, the initial recognition result generated by the first result generation network is "is a synthesized video"; for the second group, the initial recognition result generated by the first result generation network is "is not a synthesized video". Then, according to the correspondence preset by the technician, the second result generation network can generate, based on these two initial recognition results, the recognition result "is a synthesized video", which indicates that the face video is a video synthesized based on face images.
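The preset correspondence in this example can be sketched as a plain function. This is an illustrative stand-in for the second result generation network described above, not the patent's actual implementation; the function name and the result strings are assumptions:

```python
def combine_initial_results(initial_results):
    """Preset correspondence of the second result generation network,
    written as a fixed rule: the video is flagged as synthesized if at
    least one per-image initial recognition result says so."""
    if any(r == "is a synthesized video" for r in initial_results):
        return "is a synthesized video"
    return "is not a synthesized video"
```

With the two initial results from the example above, `combine_initial_results(["is a synthesized video", "is not a synthesized video"])` yields "is a synthesized video", matching the preset correspondence.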
In another specific implementation of this embodiment, the result generation network may include a feature fusion network and a result generation sub-network; and the executing subject can obtain the recognition result corresponding to the target face video as follows:
Firstly, the executing subject can input the at least two groups of image features into the feature fusion network to obtain a feature image used to characterize the video features of the target face video.
Here, the feature fusion network can be connected to the feature extraction network. The feature fusion network can be used to fuse the input at least two groups of image features (for example, by image fusion), and to generate and output a feature image characterizing the video features of the target face video.
Secondly, the executing subject can input the obtained feature image into the result generation sub-network to obtain the recognition result corresponding to the target face video.
Here, the result generation sub-network can be connected to the feature fusion network. The result generation sub-network can generate the recognition result corresponding to the target face video based on the feature image, output by the feature fusion network, that characterizes the video features of the target face video.
The two specific implementations above give two ways of generating the recognition result: one generates a recognition result for each of the at least two groups of image features and then merges the per-group recognition results into a single overall recognition result; the other first fuses the at least two groups of image features into one group of video features and then obtains the recognition result from the fused video features.
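The contrast between the two strategies can be sketched minimally. The per-group scorer and the element-wise-mean fusion below are toy assumptions standing in for the learned first result generation network, feature fusion network, and result generation sub-network:

```python
def score(features):
    # Toy per-group scorer, standing in for the learned
    # result-generation networks described above.
    return sum(features) / len(features)

def late_fusion(feature_groups, threshold=0.5):
    # Strategy 1: one initial result per feature group, then merge
    # (here with an "at least one" rule).
    initial = [score(g) > threshold for g in feature_groups]
    return any(initial)

def early_fusion(feature_groups, threshold=0.5):
    # Strategy 2: fuse the feature groups first (element-wise mean,
    # standing in for the feature fusion network), then decide once.
    fused = [sum(col) / len(col) for col in zip(*feature_groups)]
    return score(fused) > threshold
```

On the same input the two strategies can disagree, which is exactly why they are distinct implementations: with groups `[[0.9, 0.8], [0.1, 0.2]]`, late fusion flags the video (the first group scores high) while early fusion does not (the fused features average out).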
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the process 400 of the method for generating information in this embodiment highlights the steps of extracting the target face images corresponding to the target face video and generating, based on the extracted target face images, the recognition result corresponding to the target face video. The scheme described in this embodiment can thus take the target face images corresponding to the target face video as the object of analysis, simplifying the video identification process and improving the efficiency of information generation.
With further reference to Fig. 5, as an implementation of the methods shown in the figures above, this application provides an embodiment of an apparatus for generating information. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied in various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating information of this embodiment includes an acquiring unit 501 and an input unit 502. The acquiring unit 501 is configured to obtain a target face video; the input unit 502 is configured to input the target face video into a pre-trained video identification model to obtain the recognition result corresponding to the target face video, where the video identification model can be used to characterize the correspondence between face videos and recognition results, and a recognition result can be used to indicate whether a face video is a video synthesized based on face images.
In this embodiment, the acquiring unit 501 of the apparatus 500 for generating information can obtain the target face video through a wired or wireless connection. Here, the target face video can be a face video to be identified. A face video can be a video that includes face images.
It should be noted that the acquiring unit 501 can obtain a target face video sent by a communicatively connected electronic device (such as the terminal device shown in Fig. 1), or obtain a target face video pre-stored locally.
In this embodiment, based on the target face video obtained by the acquiring unit 501, the input unit 502 can input the target face video into the pre-trained video identification model to obtain the recognition result corresponding to the target face video. Here, the video identification model is used to characterize the correspondence between face videos and recognition results. A recognition result can be used to indicate whether a face video is a video synthesized based on face images. A recognition result can include, but is not limited to, at least one of the following: text, numbers, symbols, images, video.
In some optional implementations of this embodiment, the video identification model may include an image extraction network, a feature extraction network, and a result generation network; and the input unit 502 may include: a first input module (not shown) configured to input the target face video into the image extraction network to obtain at least two target face images; a second input module (not shown) configured to input each of the obtained at least two target face images into the feature extraction network to obtain at least two groups of image features; and a third input module (not shown) configured to input the obtained at least two groups of image features into the result generation network to obtain the recognition result corresponding to the target face video.
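The three-stage pipeline carried out by these modules can be sketched end to end, with each network passed in as a callable. The function signature and the callables are illustrative assumptions, not the patent's actual interfaces:

```python
def identify_face_video(video, image_extractor, feature_extractor, result_generator):
    """End-to-end sketch of the video identification model: extract at
    least two target face images, extract one group of image features
    per image, then generate the recognition result from all groups."""
    face_images = image_extractor(video)                       # >= 2 target face images
    feature_groups = [feature_extractor(img) for img in face_images]
    return result_generator(feature_groups)
```

A usage example with stub callables: `identify_face_video(raw_video, extract_faces, extract_features, generate_result)`, where each stub returns the intermediate data described in the text.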
In some optional implementations of this embodiment, the result generation network may include a first result generation network and a second result generation network; and the third input module (not shown) may be further configured to: input each of the at least two groups of image features into the first result generation network to obtain at least two initial recognition results; and input the obtained at least two initial recognition results into the second result generation network to obtain the recognition result corresponding to the target face video.
In some optional implementations of this embodiment, the video identification model can be trained as follows: obtain a training sample set, where a training sample includes a sample face video and a sample recognition result annotated in advance for the sample face video, and the sample recognition result can be used to indicate whether the sample face video is a video synthesized based on sample face images; then, using the sample face videos of the training samples in the training sample set as input and the sample recognition results corresponding to the input sample face videos as desired output, train the video identification model by a machine learning method.
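A minimal supervised training loop in the spirit of this procedure can be sketched with logistic regression on toy feature vectors. This is a stand-in under stated assumptions: the patent trains a deep network on the sample face videos themselves, and the function name, sample format, and model choice here are all illustrative:

```python
import math

def train_video_identification_model(samples, epochs=50, lr=0.1):
    """Each sample is (feature_vector, label): label 1 means the sample
    face video is annotated as synthesized, 0 otherwise. Returns a
    predict function (the trained 'model')."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in samples:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))      # predicted probability
            g = p - y                           # gradient of log loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g

    def predict(x):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1 if z > 0 else 0

    return predict
```

The desired-output supervision described above corresponds to the `y` term in the gradient: the model's parameters are pushed so that its output matches the annotated sample recognition result.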
In some optional implementations of this embodiment, the apparatus 500 may further include a generation unit configured to generate warning information in response to determining that the obtained recognition result indicates that the target face video is a video synthesized based on face images.
In the apparatus 500 provided by the above embodiment of this application, the acquiring unit 501 obtains a target face video, and the input unit 502 then inputs the target face video into a pre-trained video identification model to obtain the recognition result corresponding to the target face video, where the video identification model is used to characterize the correspondence between face videos and recognition results, and the recognition result is used to indicate whether the face video is a video synthesized based on face images. Identification of synthesized videos is thereby realized by means of the pre-trained video identification model, improving the diversity of information generation and, at the same time, the security of information exchange.
Referring now to Fig. 6, it shows a structural schematic diagram of a computer system 600 of an electronic device suitable for implementing the embodiments of this application. The electronic device shown in Fig. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of this application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it is installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the method of this application are executed. It should be noted that the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this application, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in connection with, an instruction execution system, apparatus, or device. In this application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a medium can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to: wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each box in a flowchart or block diagram may represent a module, program segment, or part of code, which contains one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in an order different from that indicated in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that executes the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of this application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquiring unit and an input unit. Under certain conditions, the names of these units do not constitute a limitation on the units themselves; for example, the acquiring unit may also be described as "a unit that obtains a target face video".
As another aspect, this application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, they cause the electronic device to: obtain a target face video; and input the target face video into a pre-trained video identification model to obtain the recognition result corresponding to the target face video, where the video identification model is used to characterize the correspondence between face videos and recognition results, and the recognition result is used to indicate whether the face video is a video synthesized based on face images.
The above description is only a preferred embodiment of this application and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of invention involved in this application is not limited to technical solutions formed by the specific combination of the above technical features; without departing from the inventive concept, it should also cover other technical solutions formed by any combination of the above technical features or their equivalent features, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in this application.
Claims (12)
1. A method for generating information, comprising:
obtaining a target face video;
inputting the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, wherein the video identification model is used to characterize a correspondence between face videos and recognition results, and a recognition result is used to indicate whether a face video is a video synthesized based on face images.
2. The method according to claim 1, wherein the video identification model includes an image extraction network, a feature extraction network, and a result generation network; and
the inputting the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video comprises:
inputting the target face video into the image extraction network to obtain at least two target face images;
inputting each of the obtained at least two target face images into the feature extraction network to obtain at least two groups of image features;
inputting the obtained at least two groups of image features into the result generation network to obtain the recognition result corresponding to the target face video.
3. The method according to claim 2, wherein the result generation network includes a first result generation network and a second result generation network; and
the inputting the obtained at least two groups of image features into the result generation network to obtain the recognition result corresponding to the target face video comprises:
inputting each of the at least two groups of image features into the first result generation network to obtain at least two initial recognition results;
inputting the obtained at least two initial recognition results into the second result generation network to obtain the recognition result corresponding to the target face video.
4. The method according to claim 1, wherein the video identification model is trained as follows:
obtaining a training sample set, wherein a training sample includes a sample face video and a sample recognition result annotated in advance for the sample face video, the sample recognition result being used to indicate whether the sample face video is a video synthesized based on sample face images;
using the sample face videos of the training samples in the training sample set as input and the sample recognition results corresponding to the input sample face videos as desired output, training to obtain the video identification model by a machine learning method.
5. The method according to one of claims 1-4, wherein after the obtaining of the recognition result, the method further comprises:
in response to determining that the obtained recognition result indicates that the target face video is a video synthesized based on face images, generating warning information.
6. An apparatus for generating information, comprising:
an acquiring unit configured to obtain a target face video;
an input unit configured to input the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, wherein the video identification model is used to characterize a correspondence between face videos and recognition results, and a recognition result is used to indicate whether a face video is a video synthesized based on face images.
7. The apparatus according to claim 6, wherein the video identification model includes an image extraction network, a feature extraction network, and a result generation network; and
the input unit comprises:
a first input module configured to input the target face video into the image extraction network to obtain at least two target face images;
a second input module configured to input each of the obtained at least two target face images into the feature extraction network to obtain at least two groups of image features;
a third input module configured to input the obtained at least two groups of image features into the result generation network to obtain the recognition result corresponding to the target face video.
8. The apparatus according to claim 7, wherein the result generation network includes a first result generation network and a second result generation network; and
the third input module is further configured to:
input each of the at least two groups of image features into the first result generation network to obtain at least two initial recognition results;
input the obtained at least two initial recognition results into the second result generation network to obtain the recognition result corresponding to the target face video.
9. The apparatus according to claim 6, wherein the video identification model is trained as follows:
obtaining a training sample set, wherein a training sample includes a sample face video and a sample recognition result annotated in advance for the sample face video, the sample recognition result being used to indicate whether the sample face video is a video synthesized based on sample face images;
using the sample face videos of the training samples in the training sample set as input and the sample recognition results corresponding to the input sample face videos as desired output, training to obtain the video identification model by a machine learning method.
10. The apparatus according to one of claims 6-9, wherein the apparatus further comprises:
a generation unit configured to generate warning information in response to determining that the obtained recognition result indicates that the target face video is a video synthesized based on face images.
11. An electronic device, comprising:
one or more processors;
a storage device on which one or more programs are stored,
wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1-5.
12. A computer-readable medium on which a computer program is stored, wherein when the program is executed by a processor, the method according to any one of claims 1-5 is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810643456.8A CN108985178A (en) | 2018-06-21 | 2018-06-21 | Method and apparatus for generating information |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108985178A true CN108985178A (en) | 2018-12-11 |
Family
ID=64541665
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810643456.8A Pending CN108985178A (en) | 2018-06-21 | 2018-06-21 | Method and apparatus for generating information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108985178A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109670444A (en) * | 2018-12-18 | 2019-04-23 | 北京字节跳动网络技术有限公司 | Generation, attitude detecting method, device, equipment and the medium of attitude detection model |
CN110059624A (en) * | 2019-04-18 | 2019-07-26 | 北京字节跳动网络技术有限公司 | Method and apparatus for detecting living body |
CN111611973A (en) * | 2020-06-01 | 2020-09-01 | 广州市百果园信息技术有限公司 | Method, device and storage medium for identifying target user |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160163084A1 (en) * | 2012-03-06 | 2016-06-09 | Adobe Systems Incorporated | Systems and methods for creating and distributing modifiable animated video messages |
CN107862299A (en) * | 2017-11-28 | 2018-03-30 | 电子科技大学 | A kind of living body faces detection method based on near-infrared Yu visible ray binocular camera |
CN107944416A (en) * | 2017-12-06 | 2018-04-20 | 成都睿码科技有限责任公司 | A kind of method that true man's verification is carried out by video |
CN108171207A (en) * | 2018-01-17 | 2018-06-15 | 百度在线网络技术(北京)有限公司 | Face identification method and device based on video sequence |
- 2018-06-21: CN application CN201810643456.8A filed (publication CN108985178A/en); status: Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160163084A1 (en) * | 2012-03-06 | 2016-06-09 | Adobe Systems Incorporated | Systems and methods for creating and distributing modifiable animated video messages |
CN107862299A (en) * | 2017-11-28 | 2018-03-30 | 电子科技大学 | A kind of living body faces detection method based on near-infrared Yu visible ray binocular camera |
CN107944416A (en) * | 2017-12-06 | 2018-04-20 | 成都睿码科技有限责任公司 | A kind of method that true man's verification is carried out by video |
CN108171207A (en) * | 2018-01-17 | 2018-06-15 | 百度在线网络技术(北京)有限公司 | Face identification method and device based on video sequence |
Non-Patent Citations (1)
Title |
---|
论智: "FaceForensics: a large-scale video dataset for face forgery detection", 《HTTPS://XW.QQ.COM/CMSID/20180414G1IP9U00》 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109670444A (en) * | 2018-12-18 | 2019-04-23 | 北京字节跳动网络技术有限公司 | Generation, attitude detecting method, device, equipment and the medium of attitude detection model |
CN110059624A (en) * | 2019-04-18 | 2019-07-26 | 北京字节跳动网络技术有限公司 | Method and apparatus for detecting living body |
CN111611973A (en) * | 2020-06-01 | 2020-09-01 | 广州市百果园信息技术有限公司 | Method, device and storage medium for identifying target user |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108830235A (en) | Method and apparatus for generating information | |
CN108898185A (en) | Method and apparatus for generating image recognition model | |
CN108898186A (en) | Method and apparatus for extracting image | |
CN108509915A (en) | The generation method and device of human face recognition model | |
CN109446990A (en) | Method and apparatus for generating information | |
CN110288049A (en) | Method and apparatus for generating image recognition model | |
CN108154196A (en) | For exporting the method and apparatus of image | |
CN108805091A (en) | Method and apparatus for generating model | |
CN108985257A (en) | Method and apparatus for generating information | |
CN108595628A (en) | Method and apparatus for pushed information | |
CN109101919A (en) | Method and apparatus for generating information | |
CN108989882A (en) | Method and apparatus for exporting the snatch of music in video | |
CN108345387A (en) | Method and apparatus for output information | |
CN108960316A (en) | Method and apparatus for generating model | |
CN109034069A (en) | Method and apparatus for generating information | |
CN108960110A (en) | Method and apparatus for generating information | |
CN108494778A (en) | Identity identifying method and device | |
CN109308490A (en) | Method and apparatus for generating information | |
CN109410253B (en) | For generating method, apparatus, electronic equipment and the computer-readable medium of information | |
CN109344752A (en) | Method and apparatus for handling mouth image | |
CN109871791A (en) | Image processing method and device | |
CN108491823A (en) | Method and apparatus for generating eye recognition model | |
CN108363999A (en) | Operation based on recognition of face executes method and apparatus | |
CN109145783A (en) | Method and apparatus for generating information | |
CN109241934A (en) | Method and apparatus for generating information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||