CN108985178A - Method and apparatus for generating information - Google Patents

Method and apparatus for generating information

Info

Publication number
CN108985178A
CN108985178A
Authority
CN
China
Prior art keywords
video
result
network
face video
recognition result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810643456.8A
Other languages
Chinese (zh)
Inventor
李伟健
王长虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN201810643456.8A
Publication of CN108985178A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification

Abstract

Embodiments of the present application disclose a method and apparatus for generating information. One specific embodiment of the method includes: acquiring a target face video; and inputting the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, where the video identification model is used to characterize the correspondence between face videos and recognition results, and the recognition result is used to indicate whether the face video is a video synthesized from face images. This embodiment enables identification of synthesized videos, improves the diversity of information generation, and can also improve the security of information exchange.

Description

Method and apparatus for generating information
Technical field
Embodiments of the present application relate to the field of computer technology, and in particular to a method and apparatus for generating information.
Background technique
Currently, sharing information by shooting and posting videos has become an important mode of information sharing in daily life. In practice, when a user cannot obtain a face video of a target person by filming that person directly, the user may instead synthesize such a video from face images of the target person using image synthesis technology.
It can be understood that videos synthesized from images often cause adverse effects such as infringement of rights and damage to fairness. Therefore, platforms that host shared information may identify such videos and then intercept them.
Summary of the invention
Embodiments of the present application propose a method and apparatus for generating information.
In a first aspect, an embodiment of the present application provides a method for generating information, the method comprising: acquiring a target face video; and inputting the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, where the video identification model is used to characterize the correspondence between face videos and recognition results, and the recognition result is used to indicate whether the face video is a video synthesized from face images.
In some embodiments, the video identification model includes an image extraction network, a feature extraction network, and a result generation network; and inputting the target face video into the pre-trained video identification model to obtain the recognition result corresponding to the target face video comprises: inputting the target face video into the image extraction network to obtain at least two target face images; inputting each of the obtained target face images into the feature extraction network to obtain at least two groups of image features; and inputting the obtained groups of image features into the result generation network to obtain the recognition result corresponding to the target face video.
In some embodiments, the result generation network includes a first result generation network and a second result generation network; and inputting the obtained groups of image features into the result generation network to obtain the recognition result corresponding to the target face video comprises: inputting each group of image features into the first result generation network to obtain at least two initial recognition results; and inputting the obtained initial recognition results into the second result generation network to obtain the recognition result corresponding to the target face video.
In some embodiments, the video identification model is trained as follows: acquiring a training sample set, where each training sample includes a sample face video and a sample recognition result pre-labeled for that video, the sample recognition result being used to indicate whether the sample face video is a video synthesized from sample face images; and training with a machine learning method, using the sample face videos of the training samples as input and the sample recognition results corresponding to the input sample face videos as desired output, to obtain the video identification model.
In some embodiments, after the recognition result is obtained, the method further includes: in response to determining that the obtained recognition result indicates that the target face video is a video synthesized from face images, generating warning information.
In a second aspect, an embodiment of the present application provides an apparatus for generating information, the apparatus comprising: an acquisition unit configured to acquire a target face video; and an input unit configured to input the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, where the video identification model is used to characterize the correspondence between face videos and recognition results, and the recognition result is used to indicate whether the face video is a video synthesized from face images.
In some embodiments, the video identification model includes an image extraction network, a feature extraction network, and a result generation network; and the input unit includes: a first input module configured to input the target face video into the image extraction network to obtain at least two target face images; a second input module configured to input each of the obtained target face images into the feature extraction network to obtain at least two groups of image features; and a third input module configured to input the obtained groups of image features into the result generation network to obtain the recognition result corresponding to the target face video.
In some embodiments, the result generation network includes a first result generation network and a second result generation network; and the third input module is further configured to: input each group of image features into the first result generation network to obtain at least two initial recognition results; and input the obtained initial recognition results into the second result generation network to obtain the recognition result corresponding to the target face video.
In some embodiments, the video identification model is trained as follows: acquiring a training sample set, where each training sample includes a sample face video and a sample recognition result pre-labeled for that video, the sample recognition result being used to indicate whether the sample face video is a video synthesized from sample face images; and training with a machine learning method, using the sample face videos of the training samples as input and the sample recognition results corresponding to the input sample face videos as desired output, to obtain the video identification model.
In some embodiments, the apparatus further includes: a generation unit configured to generate warning information in response to determining that the obtained recognition result indicates that the target face video is a video synthesized from face images.
In a third aspect, an embodiment of the present application provides an electronic device, comprising: one or more processors; and a storage apparatus on which one or more programs are stored, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the above method for generating information.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method of any embodiment of the above method for generating information.
The method and apparatus for generating information provided by the embodiments of the present application acquire a target face video and input it into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, where the video identification model is used to characterize the correspondence between face videos and recognition results, and the recognition result is used to indicate whether the face video is a video synthesized from face images. Identification of synthesized videos is thereby achieved with a pre-trained video identification model, improving the diversity of information generation while also improving the security of information exchange.
Detailed description of the invention
Other features, objects, and advantages of the present application will become more apparent by reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which an embodiment of the present application may be applied;
Fig. 2 is the flow chart according to one embodiment of the method for generating information of the application;
Fig. 3 is the schematic diagram according to an application scenarios of the method for generating information of the application;
Fig. 4 is the flow chart according to another embodiment of the method for generating information of the application;
Fig. 5 is the structural schematic diagram according to one embodiment of the device for generating information of the application;
Fig. 6 is adapted for the structural schematic diagram for the computer system for realizing the electronic equipment of the embodiment of the present application.
Specific embodiment
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It can be understood that the specific embodiments described here are used only to explain the related invention, not to limit it. It should also be noted that, for convenience of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for generating information or the apparatus for generating information of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as model training applications, video identification applications, web browser applications, and social platform software.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with a display screen, including but not limited to smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is made here.
When the terminals 101, 102, 103 are hardware, a video capture device may also be installed on them. The video capture device may be any device capable of capturing video, such as a camera or a sensor. The user may use the video capture device on the terminals 101, 102, 103 to capture video.
The server 105 may be a server providing various services, for example a background server that processes face videos displayed on the terminal devices 101, 102, 103. The background server may analyze and otherwise process received data such as a target face video, and may feed the processing result (such as a recognition result) back to the terminal devices.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks, and servers may be provided according to implementation needs. In particular, when the target face video or the data used in generating the recognition result does not need to be obtained remotely, the above system architecture may include no network and include only a terminal device or a server.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating information according to the present application is shown. The method for generating information includes the following steps:
Step 201: acquiring a target face video.
In this embodiment, the executing body of the method for generating information (such as the server shown in Fig. 1) may acquire the target face video through a wired or wireless connection. The target face video may be a face video to be identified. A face video may be a video that includes face images.
It should be noted that the executing body may acquire a target face video sent by an electronic device in communication connection with it (such as the terminal device shown in Fig. 1), or may acquire a target face video stored locally in advance.
Step 202: inputting the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video.
In this embodiment, based on the target face video obtained in step 201, the executing body may input the target face video into a pre-trained video identification model to obtain a recognition result corresponding to the target face video. The video identification model is used to characterize the correspondence between face videos and recognition results. A recognition result may indicate whether a face video is a video synthesized from face images, and may include but is not limited to at least one of the following: text, numbers, symbols, images, video.
In some optional implementations of this embodiment, the video identification model may be obtained through the following training steps. First, a training sample set is acquired, where a training sample may include a sample face video and a sample recognition result pre-labeled for the sample face video; the sample recognition result may indicate whether the sample face video is a video synthesized from sample face images. Then, using the sample face videos of the training samples as input and the sample recognition results corresponding to the input sample face videos as desired output, the video identification model is obtained by training with a machine learning method.
Specifically, taking the selection of one training sample as an example, the following steps may be executed: input the sample face video of the selected training sample into an initial model (such as a convolutional neural network (CNN) or a residual network (ResNet)) to obtain a recognition result; take the sample recognition result corresponding to the input sample face video as the desired output of the initial model, and adjust the parameters of the initial model based on the obtained recognition result and the sample recognition result; determine whether there are unselected training samples in the training sample set; and in response to there being no unselected training samples, determine the adjusted initial model as the video identification model. It should be noted that the way training samples are selected is not limited in the present application. For example, samples may be selected randomly, or training samples whose sample videos have better clarity may be selected preferentially.
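As a minimal sketch of this sample-by-sample training loop, the following toy one-parameter logistic unit stands in for the CNN/ResNet initial model; the scalar features, labels, learning rate, and epoch count are illustrative assumptions, not part of the patent.

```python
import math
import random

def train_video_identification_model(samples, lr=0.5, epochs=200):
    """Iterate over training samples, compare the model's output with the
    pre-labeled sample recognition result, and adjust the parameters.

    `samples` is a list of (feature, label) pairs, where `feature` is a
    number standing in for a sample face video and `label` is 1 if the
    video is labeled as synthesized from face images, else 0.
    """
    w, b = 0.0, 0.0  # parameters of the toy "initial model"
    for _ in range(epochs):
        order = list(range(len(samples)))
        random.shuffle(order)  # the selection order is not limited
        for i in order:
            x, y = samples[i]
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # model's output
            w -= lr * (p - y) * x  # move toward the desired output
            b -= lr * (p - y)
    return w, b

def predict(params, x):
    """Recognition result: 1 means "synthesized from face images"."""
    w, b = params
    return 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) >= 0.5 else 0
```

Once no unselected sample remains in any epoch, the adjusted parameters play the role of the trained video identification model.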
Optionally, the video identification model may also be obtained through the following training steps:
First, a training sample set is acquired and divided into a preset number of training sample groups.
Here, the training sample set may be divided into the preset number of training sample groups in various ways. For example, the training sample set may be divided into the preset number of equal groups, or it may be divided so that the number of training samples in each of the preset number of groups is greater than or equal to a preset threshold. The preset number may be set in advance by a technician.
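The equal-split division with a minimum-size threshold can be sketched as follows; the function name and the `min_size` parameter are illustrative assumptions.

```python
def split_into_groups(samples, group_count, min_size=1):
    """Divide a training sample set into `group_count` groups of (nearly)
    equal size, and check each group meets a minimum-size threshold.
    """
    if group_count <= 0:
        raise ValueError("group_count must be positive")
    base, extra = divmod(len(samples), group_count)
    groups, start = [], 0
    for i in range(group_count):
        size = base + (1 if i < extra else 0)  # spread the remainder
        groups.append(samples[start:start + size])
        start += size
    if any(len(g) < min_size for g in groups):
        raise ValueError("a group fell below the preset threshold")
    return groups
```

For example, ten samples split into three groups yields group sizes of 4, 3, and 3.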
Then, a training sample group may be selected from the preset number of training sample groups as a candidate training sample group, and based on the candidate training sample group and an initial model, the following model generation step is executed: for the candidate training sample group, take the sample face videos of the training samples as input and the sample recognition results corresponding to the input sample face videos as desired output, train the initial model with a machine learning method, and obtain an initial video identification model; determine whether there are unselected training sample groups among the preset number of training sample groups; and in response to determining that there are no unselected training sample groups, generate the video identification model based on the obtained initial video identification models.
Here, one initial video identification model may be selected from the obtained initial video identification models as the video identification model, or the obtained initial video identification models may be processed (fused) to obtain the video identification model.
It should be noted that the way training sample groups are selected is not limited in the present application. For example, groups may be selected randomly, or groups containing more training samples may be selected preferentially.
Furthermore, in response to determining that there are unselected training sample groups, a training sample group may be selected from the unselected training sample groups as a new candidate training sample group, the most recently obtained initial video identification model may be taken as the new initial model, and the above model generation step may be continued.
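The iterative model generation step (train on a candidate group, carry the result forward as the new initial model until no unselected group remains) can be sketched as follows, with `train_one_group` as a hypothetical stand-in for one machine-learning training pass.

```python
def train_over_groups(groups, initial_model, train_one_group):
    """Run the model generation step over every training sample group.

    `train_one_group(model, group)` stands in for training the current
    initial model on one candidate group and must return the updated
    model, which becomes the new initial model for the next group.
    """
    model = initial_model
    remaining = list(groups)
    while remaining:  # unselected training sample groups still exist
        candidate = remaining.pop(0)  # selection strategy is not limited
        model = train_one_group(model, candidate)
    return model
```

The final returned model corresponds to the last initial video identification model, from which the video identification model is generated.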
It should be noted that the executing body of the above training steps for generating the video identification model may be the same as or different from the executing body of the method for generating information. If they are the same, the executing body of the training steps may store the trained video identification model locally after training. If they are different, the executing body of the training steps may send the trained video identification model to the executing body of the method for generating information after training.
In some optional implementations of this embodiment, after the recognition result is obtained, the executing body may further generate warning information in response to determining that the obtained recognition result indicates that the target face video is a video synthesized from face images. The warning information may include but is not limited to at least one of the following: text, numbers, symbols, pictures, video. Specifically, the warning information may be preset information for alarm, or information for alarm generated based on attributes of the target face video. The attributes of the target face video may include but are not limited to at least one of the following: video name, video duration, video source, video release time. As an example, the warning information may be: "Please note the video '***' of user A".
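A minimal sketch of attribute-based warning generation, assuming a string recognition result and a dictionary of video attributes; the dictionary keys and the message template are illustrative assumptions, not from the patent.

```python
def generate_warning(recognition_result, attributes):
    """Generate warning information when the recognition result indicates
    that the target face video is a video synthesized from face images;
    otherwise return None.
    """
    if recognition_result != "synthesized":
        return None
    name = attributes.get("video_name", "unknown video")
    uploader = attributes.get("uploader", "unknown user")
    return f"Please note the video '{name}' of {uploader}"
```

A preset fixed alarm string could be returned instead of the templated one; the patent allows either.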
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for generating information according to this embodiment. In the application scenario of Fig. 3, the terminal 301 first sends a target face video (a face video of user A) 302 to the server 303. Then, the server 303 obtains the target face video 302, inputs the target face video 302 into a pre-trained video identification model 304, and obtains a recognition result ("Yes") 305 corresponding to the target face video 302. The video identification model may be used to characterize the correspondence between face videos and recognition results. The recognition result "Yes" may indicate that the face video is a video synthesized from face images; correspondingly, "No" may be used as a recognition result indicating that the face video is not a video synthesized from face images.
The method provided by the above embodiment of the present application acquires a target face video and inputs it into a pre-trained video identification model to obtain a recognition result corresponding to the target face video, where the video identification model is used to characterize the correspondence between face videos and recognition results, and the recognition result is used to indicate whether the face video is a video synthesized from face images. Identification of synthesized videos is thereby achieved with a pre-trained video identification model, improving the diversity of information generation while also improving the security of information exchange.
With further reference to Fig. 4, a flow 400 of another embodiment of the method for generating information is shown. The flow 400 of the method for generating information includes the following steps:
Step 401: acquiring a target face video.
In this embodiment, the executing body of the method for generating information (such as the server shown in Fig. 1) may acquire the target face video through a wired or wireless connection.
It should be noted that step 401 may be implemented in a manner similar to step 201 in the previous embodiment. Accordingly, the description above regarding step 201 also applies to step 401 of this embodiment, and details are not repeated here.
Step 402: inputting the target face video into the image extraction network of a pre-trained video identification model to obtain at least two target face images.
In this embodiment, the video identification model may include an image extraction network; based on the target face video obtained in step 401, the executing body may input the target face video into the image extraction network of the video identification model to obtain at least two target face images.
It can be understood that a target face video is essentially a sequence of target face images arranged in chronological order. The image extraction network may extract at least two target face images from the target face image sequence according to a predetermined manner and output them. The predetermined manner may be any way preset by a technician. As an example, extraction may occur once every preset number of images (such as every 3 images), or once every preset duration (such as every 1 s).
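The interval-based extraction can be sketched as follows, treating the video as a plain list of frames; the function name and parameters are illustrative assumptions standing in for the image extraction network.

```python
def extract_frames(frames, step=3, minimum=2):
    """Sample a face-image sequence at a fixed interval: keep one frame
    out of every `step` frames. Raises if fewer than `minimum` frames
    result, since the model expects at least two target face images.
    """
    picked = frames[::step]
    if len(picked) < minimum:
        raise ValueError("video too short to extract enough face images")
    return picked
```

Extraction every preset duration would work the same way, with `step` derived from the frame rate and the duration.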
Step 403: inputting each of the obtained target face images into the feature extraction network of the video identification model to obtain at least two groups of image features.
In this embodiment, the video identification model further includes a feature extraction network; the executing body may input each of the at least two target face images obtained in step 402 into the feature extraction network of the video identification model to obtain at least two groups of image features. The feature extraction network may be connected to the image extraction network and is used to extract and output the image features of the target face images output by the image extraction network. It should be noted that the at least two groups of image features may be at least two feature images used to characterize image features.
Here, the feature extraction network may include structures for extracting image features (such as convolutional layers), and may of course also include other structures (such as pooling layers); no restriction is made here.
Step 404: inputting the obtained groups of image features into the result generation network of the video identification model to obtain the recognition result corresponding to the target face video.
In this embodiment, the video identification model may also include a result generation network; the executing body may input the obtained at least two groups of image features into the result generation network of the video identification model to obtain the recognition result corresponding to the target face video. The recognition result may indicate whether the face video is a video synthesized from face images, and may include but is not limited to at least one of the following: text, numbers, symbols, images, video. The result generation network may be connected to the feature extraction network and generates the recognition result corresponding to the target face video based on the at least two groups of image features output by the feature extraction network.
Here, the result generation network may include structures for generating results (such as classifiers or fully connected layers), and may of course also include other structures (such as a feature fusion network); no restriction is made here.
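The three-stage data flow of steps 402 to 404 (image extraction network, per-image feature extraction network, result generation network) can be sketched as a chain of stand-in callables; none of these callables is the patent's actual trained network, and the toy lambdas in the usage example are purely illustrative.

```python
def video_identification_model(video, image_extractor, feature_extractor,
                               result_generator):
    """Chain the three sub-networks of the video identification model.

    Each argument is a stand-in callable for the corresponding trained
    network: `image_extractor` yields at least two target face images,
    `feature_extractor` maps one image to one group of image features,
    and `result_generator` maps all feature groups to the recognition
    result.
    """
    face_images = image_extractor(video)           # step 402
    features = [feature_extractor(img) for img in face_images]  # step 403
    return result_generator(features)              # step 404
```

With toy stand-ins, `video_identification_model(list(range(6)), lambda v: v[::2], lambda img: img * 2, lambda fs: "Yes" if sum(fs) > 10 else "No")` threads a 6-frame "video" through all three stages.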
In one concrete implementation of this embodiment, the result generation network may include a first result generation network and a second result generation network, and the executing body may obtain the recognition result corresponding to the target face video as follows:
First, the executing body may input each of the at least two groups of image features into the first result generation network to obtain at least two initial recognition results.
The first result generation network may be connected to the feature extraction network and may be used to generate and output one initial recognition result based on one group of image features. The initial recognition results may be used to generate the recognition result corresponding to the target face video, and each may indicate whether the face video is a video synthesized from face images.
Second, the executing body may input the obtained at least two initial recognition results into the second result generation network to obtain the recognition result corresponding to the target face video.
The second result generation network may be connected to the first result generation network and may generate the recognition result corresponding to the target face video based on the initial recognition results output by the first result generation network. Here, for the second result generation network, a technician may preset the correspondence between the input initial recognition results and the output recognition result. For example, when all input initial recognition results indicate that the face video is a video synthesized from face images, the output recognition result indicates that the face video is a video synthesized from face images; alternatively, when at least one of the input initial recognition results indicates that the face video is a video synthesized from face images, the output recognition result indicates that the face video is a video synthesized from face images.
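The two preset correspondences described above ("all frames say synthesized" versus "at least one frame says synthesized") can be sketched as follows, assuming boolean initial recognition results; the rule names are illustrative.

```python
def combine_initial_results(initial_results, rule="any"):
    """Second-stage result generation: map per-frame initial recognition
    results (True means "synthesized from face images") onto one final
    recognition result, under a preset correspondence rule.
    """
    if rule == "all":  # synthesized only if every frame says so
        return all(initial_results)
    if rule == "any":  # synthesized if at least one frame says so
        return any(initial_results)
    raise ValueError(f"unknown rule: {rule}")
```

The "any" rule is the stricter screen for a sharing platform, since a single suspicious frame suffices to flag the video.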
As an example, 403 obtained two groups of characteristics of image through the above steps, for first group of characteristics of image therein, It is " being synthetic video " that first result, which generates the initial recognition result that network generates, for second group of characteristics of image, the first result Generating the initial recognition result that network generates is " not being synthetic video ", then according to the pre-set corresponding relationship of technical staff, Second result, which generates network, can be based on above-mentioned two initial recognition result, and it is based on face figure that generation, which is used to indicate face video, As the recognition result " being synthetic video " of the video of synthesis.
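The preset correspondence of the second result generation network can be illustrated with a minimal sketch. This is a hedged illustration only: the function names `combine_any` and `combine_all` and the Boolean encoding of the initial recognition results are assumptions made for the example, not components disclosed by the text:

```python
from typing import List

def combine_any(initial_results: List[bool]) -> bool:
    """One preset rule from the text: the video is recognized as synthesized
    if at least one initial recognition result says 'synthesized'."""
    return any(initial_results)

def combine_all(initial_results: List[bool]) -> bool:
    """The other preset rule: recognize the video as synthesized only when
    every initial recognition result says 'synthesized'."""
    return all(initial_results)

# The worked example from the text: the first group of image features yields
# "synthesized video" (True), the second "not a synthesized video" (False).
initial = [True, False]
print(combine_any(initial))  # True: recognition result "synthesized video"
print(combine_all(initial))  # False under the stricter rule
```

Either rule is a fixed mapping rather than a learned one, matching the text's note that a technician may preset this correspondence.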
In another concrete implementation of this embodiment, the result generation network may include a feature fusion network and a result generation sub-network; and the above executing body may obtain the recognition result corresponding to the target face video as follows:

First, the executing body may input the at least two groups of image features into the feature fusion network simultaneously, obtaining a feature image characterizing the video features of the target face video.

Here, the feature fusion network may be connected to the feature extraction network. The feature fusion network may be used to fuse the at least two input groups of image features (for example, by image fusion), generating and outputting a feature image characterizing the video features of the target face video.

Second, the executing body may input the obtained feature image into the result generation sub-network, obtaining the recognition result corresponding to the target face video.

Here, the result generation sub-network may be connected to the feature fusion network. The result generation sub-network may generate the recognition result corresponding to the target face video based on the feature image, output by the feature fusion network, that characterizes the video features of the target face video.

The two concrete implementations above give two ways of generating the recognition result: one generates a recognition result for each of the at least two groups of image features and then merges the recognition results corresponding to the individual groups into one overall recognition result; the other first fuses the at least two groups of image features into one group of video features and then obtains the recognition result from the fused video features.
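Under simplifying assumptions, the second mode can be sketched end to end: the per-image feature groups are fused into one video-level feature (here by element-wise averaging, standing in for the feature fusion network) and passed to a toy linear threshold classifier standing in for the result generation sub-network. All names and numbers below are illustrative, not the patent's actual networks:

```python
def fuse_features(feature_groups):
    """Feature fusion network stand-in: element-wise mean over the
    per-image feature groups (a list of equal-length vectors)."""
    n = len(feature_groups)
    return [sum(values) / n for values in zip(*feature_groups)]

def result_subnetwork(video_feature, weights, bias):
    """Result generation sub-network stand-in: a linear score plus a
    threshold; True means 'video synthesized from face images'."""
    score = sum(f * w for f, w in zip(video_feature, weights)) + bias
    return score > 0.0

# Two per-image feature groups extracted from two target face images:
groups = [[0.2, 0.8], [0.6, 0.4]]
video_feature = fuse_features(groups)  # element-wise mean, roughly [0.4, 0.6]
print(result_subnetwork(video_feature, weights=[1.0, 1.0], bias=-0.9))  # True
```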
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the process 400 of the method for generating information in this embodiment highlights the steps of extracting the target face images corresponding to the target face video and generating, based on the extracted target face images, the recognition result corresponding to the target face video. The scheme described in this embodiment can thus take the target face images corresponding to the target face video as the object of analysis, which simplifies the video recognition process and improves the efficiency of information generation.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for generating information. The apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied in various electronic devices.

As shown in Fig. 5, the apparatus 500 for generating information of this embodiment includes: an acquiring unit 501 and an input unit 502. The acquiring unit 501 is configured to acquire a target face video; the input unit 502 is configured to input the target face video into a pre-trained video recognition model, obtaining a recognition result corresponding to the target face video, wherein the video recognition model may be used to characterize a correspondence between face videos and recognition results, and a recognition result may be used to indicate whether a face video is a video synthesized from face images.
In this embodiment, the acquiring unit 501 of the apparatus 500 for generating information may acquire the target face video through a wired or wireless connection. The target face video may be a face video to be recognized. A face video may be a video that includes face images.

It should be noted that the acquiring unit 501 may acquire a target face video sent by a communicatively connected electronic device (such as the terminal device shown in Fig. 1), or may acquire a target face video stored locally in advance.

In this embodiment, based on the target face video acquired by the acquiring unit 501, the input unit 502 may input the target face video into the pre-trained video recognition model, obtaining the recognition result corresponding to the target face video. Here, the video recognition model is used to characterize the correspondence between face videos and recognition results. A recognition result may be used to indicate whether a face video is a video synthesized from face images, and may include, but is not limited to, at least one of the following: text, numbers, symbols, images, and video.
In some optional implementations of this embodiment, the video recognition model may include an image extraction network, a feature extraction network, and a result generation network; and the input unit 502 may include: a first input module (not shown), configured to input the target face video into the image extraction network, obtaining at least two target face images; a second input module (not shown), configured to input the obtained at least two target face images into the feature extraction network respectively, obtaining at least two groups of image features; and a third input module (not shown), configured to input the obtained at least two groups of image features into the result generation network, obtaining the recognition result corresponding to the target face video.
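The three-network data flow these modules implement can be sketched as a chain of callables. The stand-in lambdas below are purely illustrative; the actual image extraction, feature extraction, and result generation networks are learned models that the text does not specify in code:

```python
from typing import Callable, List, Sequence

def recognize(
    video_frames: Sequence[str],
    image_extraction: Callable[[Sequence[str]], List[str]],
    feature_extraction: Callable[[str], List[float]],
    result_generation: Callable[[List[List[float]]], str],
) -> str:
    """Pipeline stand-in: video -> at least two target face images ->
    one group of image features per image -> recognition result."""
    face_images = image_extraction(video_frames)
    feature_groups = [feature_extraction(image) for image in face_images]
    return result_generation(feature_groups)

result = recognize(
    video_frames=["frame0", "frame1", "frame2"],
    image_extraction=lambda frames: list(frames[:2]),      # pick two face images
    feature_extraction=lambda image: [float(len(image))],  # toy 1-dim feature
    result_generation=lambda groups: "synthesized video"   # toy fixed verdict
    if len(groups) >= 2 else "undetermined",
)
print(result)  # synthesized video
```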
In some optional implementations of this embodiment, the result generation network may include a first result generation network and a second result generation network; and the third input module (not shown) may be further configured to: input the at least two groups of image features into the first result generation network respectively, obtaining at least two initial recognition results; and input the obtained at least two initial recognition results into the second result generation network, obtaining the recognition result corresponding to the target face video.
In some optional implementations of this embodiment, the video recognition model may be obtained by training as follows: acquire a training sample set, wherein a training sample includes a sample face video and a sample recognition result annotated in advance for the sample face video, the sample recognition result indicating whether the sample face video is a video synthesized from sample face images; then, taking the sample face videos of the training samples in the training sample set as input and the sample recognition results corresponding to the input sample face videos as desired output, train with a machine learning method to obtain the video recognition model.
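The training procedure (sample videos as input, annotated sample recognition results as desired output) can be sketched, under heavy simplification, as a supervised loop. The scalar feature, the logistic model, and the learning rate below are illustrative assumptions standing in for whatever "machine learning method" is actually used; they are not the patent's method:

```python
import math

def train_video_recognition_model(sample_features, sample_labels,
                                  lr=0.5, epochs=200):
    """Toy 'machine learning method': stochastic gradient descent for
    logistic regression on one scalar feature per sample face video.
    Label 1.0 means 'synthesized from sample face images'."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(sample_features, sample_labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted probability
            w -= lr * (p - y) * x                     # gradient of the log loss
            b -= lr * (p - y)
    return lambda x: 1.0 / (1.0 + math.exp(-(w * x + b))) > 0.5

# Scalar features standing in for sample face videos (e.g. some artifact
# statistic), with sample recognition results annotated in advance:
model = train_video_recognition_model([0.1, 0.2, 0.8, 0.9],
                                      [0.0, 0.0, 1.0, 1.0])
print(model(0.85), model(0.15))  # True False
```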
In some optional implementations of this embodiment, the apparatus 500 may further include a generation unit, configured to generate alarm information in response to determining that the obtained recognition result indicates that the target face video is a video synthesized from face images.

In the apparatus 500 provided by the above embodiment of the present application, the acquiring unit 501 acquires a target face video, and the input unit 502 then inputs the target face video into a pre-trained video recognition model, obtaining the recognition result corresponding to the target face video, wherein the video recognition model is used to characterize the correspondence between face videos and recognition results, and the recognition result is used to indicate whether the face video is a video synthesized from face images. Recognition of synthesized video is thereby achieved with a pre-trained video recognition model, which enriches the ways in which information can be generated and can also improve the security of information exchange.
Referring now to Fig. 6, it shows a structural schematic diagram of a computer system 600 of an electronic device suitable for implementing the embodiments of the present application. The electronic device shown in Fig. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present application.

As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the driver 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the method of the present application are executed. It should be noted that the computer-readable medium described herein may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.

More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to: wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that executes the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be arranged in a processor; for example, it may be described as: a processor includes an acquiring unit and an input unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the acquiring unit may also be described as "a unit for acquiring a target face video".

As another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs that, when executed by the electronic device, cause the electronic device to: acquire a target face video; and input the target face video into a pre-trained video recognition model, obtaining the recognition result corresponding to the target face video, wherein the video recognition model is used to characterize the correspondence between face videos and recognition results, and the recognition result is used to indicate whether the face video is a video synthesized from face images.
The above description is only a preferred embodiment of the present application and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by mutually replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.

Claims (12)

1. A method for generating information, comprising:
acquiring a target face video;
inputting the target face video into a pre-trained video recognition model, obtaining a recognition result corresponding to the target face video, wherein the video recognition model is used to characterize a correspondence between face videos and recognition results, and a recognition result is used to indicate whether a face video is a video synthesized from face images.
2. The method according to claim 1, wherein the video recognition model comprises an image extraction network, a feature extraction network, and a result generation network; and
the inputting the target face video into a pre-trained video recognition model, obtaining a recognition result corresponding to the target face video comprises:
inputting the target face video into the image extraction network, obtaining at least two target face images;
inputting the obtained at least two target face images into the feature extraction network respectively, obtaining at least two groups of image features;
inputting the obtained at least two groups of image features into the result generation network, obtaining the recognition result corresponding to the target face video.
3. The method according to claim 2, wherein the result generation network comprises a first result generation network and a second result generation network; and
the inputting the obtained at least two groups of image features into the result generation network, obtaining the recognition result corresponding to the target face video comprises:
inputting the at least two groups of image features into the first result generation network respectively, obtaining at least two initial recognition results;
inputting the obtained at least two initial recognition results into the second result generation network, obtaining the recognition result corresponding to the target face video.
4. The method according to claim 1, wherein the video recognition model is obtained by training as follows:
acquiring a training sample set, wherein a training sample comprises a sample face video and a sample recognition result annotated in advance for the sample face video, the sample recognition result being used to indicate whether the sample face video is a video synthesized from sample face images;
taking the sample face video of a training sample in the training sample set as input and the sample recognition result corresponding to the input sample face video as desired output, training with a machine learning method to obtain the video recognition model.
5. The method according to one of claims 1-4, wherein after the obtaining a recognition result, the method further comprises:
in response to determining that the obtained recognition result indicates that the target face video is a video synthesized from face images, generating alarm information.
6. An apparatus for generating information, comprising:
an acquiring unit, configured to acquire a target face video;
an input unit, configured to input the target face video into a pre-trained video recognition model, obtaining a recognition result corresponding to the target face video, wherein the video recognition model is used to characterize a correspondence between face videos and recognition results, and a recognition result is used to indicate whether a face video is a video synthesized from face images.
7. The apparatus according to claim 6, wherein the video recognition model comprises an image extraction network, a feature extraction network, and a result generation network; and
the input unit comprises:
a first input module, configured to input the target face video into the image extraction network, obtaining at least two target face images;
a second input module, configured to input the obtained at least two target face images into the feature extraction network respectively, obtaining at least two groups of image features;
a third input module, configured to input the obtained at least two groups of image features into the result generation network, obtaining the recognition result corresponding to the target face video.
8. The apparatus according to claim 7, wherein the result generation network comprises a first result generation network and a second result generation network; and
the third input module is further configured to:
input the at least two groups of image features into the first result generation network respectively, obtaining at least two initial recognition results;
input the obtained at least two initial recognition results into the second result generation network, obtaining the recognition result corresponding to the target face video.
9. The apparatus according to claim 6, wherein the video recognition model is obtained by training as follows:
acquiring a training sample set, wherein a training sample comprises a sample face video and a sample recognition result annotated in advance for the sample face video, the sample recognition result being used to indicate whether the sample face video is a video synthesized from sample face images;
taking the sample face video of a training sample in the training sample set as input and the sample recognition result corresponding to the input sample face video as desired output, training with a machine learning method to obtain the video recognition model.
10. The apparatus according to one of claims 6-9, wherein the apparatus further comprises:
a generation unit, configured to generate alarm information in response to determining that the obtained recognition result indicates that the target face video is a video synthesized from face images.
11. An electronic device, comprising:
one or more processors;
a storage device on which one or more programs are stored,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-5.
12. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-5.
CN201810643456.8A 2018-06-21 2018-06-21 Method and apparatus for generating information Pending CN108985178A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810643456.8A CN108985178A (en) 2018-06-21 2018-06-21 Method and apparatus for generating information


Publications (1)

Publication Number Publication Date
CN108985178A true CN108985178A (en) 2018-12-11

Family

ID=64541665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810643456.8A Pending CN108985178A (en) 2018-06-21 2018-06-21 Method and apparatus for generating information

Country Status (1)

Country Link
CN (1) CN108985178A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109670444A (en) * 2018-12-18 2019-04-23 北京字节跳动网络技术有限公司 Generation, attitude detecting method, device, equipment and the medium of attitude detection model
CN110059624A (en) * 2019-04-18 2019-07-26 北京字节跳动网络技术有限公司 Method and apparatus for detecting living body
CN111611973A (en) * 2020-06-01 2020-09-01 广州市百果园信息技术有限公司 Method, device and storage medium for identifying target user

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160163084A1 (en) * 2012-03-06 2016-06-09 Adobe Systems Incorporated Systems and methods for creating and distributing modifiable animated video messages
CN107862299A (en) * 2017-11-28 2018-03-30 电子科技大学 A kind of living body faces detection method based on near-infrared Yu visible ray binocular camera
CN107944416A (en) * 2017-12-06 2018-04-20 成都睿码科技有限责任公司 A kind of method that true man's verification is carried out by video
CN108171207A (en) * 2018-01-17 2018-06-15 百度在线网络技术(北京)有限公司 Face identification method and device based on video sequence


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
论智: "FaceForensics: A Large Video Dataset for Face Forgery Detection (FaceForensics:一个用于人脸伪造检测的大型视频数据集)", 《HTTPS://XW.QQ.COM/CMSID/20180414G1IP9U00》 *


Similar Documents

Publication Publication Date Title
CN108830235A (en) Method and apparatus for generating information
CN108898185A (en) Method and apparatus for generating image recognition model
CN108898186A (en) Method and apparatus for extracting image
CN108509915A (en) The generation method and device of human face recognition model
CN109446990A (en) Method and apparatus for generating information
CN110288049A (en) Method and apparatus for generating image recognition model
CN108154196A (en) For exporting the method and apparatus of image
CN108805091A (en) Method and apparatus for generating model
CN108985257A (en) Method and apparatus for generating information
CN108595628A (en) Method and apparatus for pushed information
CN109101919A (en) Method and apparatus for generating information
CN108989882A (en) Method and apparatus for exporting the snatch of music in video
CN108345387A (en) Method and apparatus for output information
CN108960316A (en) Method and apparatus for generating model
CN109034069A (en) Method and apparatus for generating information
CN108960110A (en) Method and apparatus for generating information
CN108494778A (en) Identity identifying method and device
CN109308490A (en) Method and apparatus for generating information
CN109410253B (en) For generating method, apparatus, electronic equipment and the computer-readable medium of information
CN109344752A (en) Method and apparatus for handling mouth image
CN109871791A (en) Image processing method and device
CN108491823A (en) Method and apparatus for generating eye recognition model
CN108363999A (en) Operation based on recognition of face executes method and apparatus
CN109145783A (en) Method and apparatus for generating information
CN109241934A (en) Method and apparatus for generating information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination