CN110390705A - Method and device for generating a virtual image - Google Patents


Info

Publication number
CN110390705A
CN110390705A (application CN201810339894.5A; granted as CN110390705B)
Authority
CN
China
Prior art keywords
image
preset
virtual image
target image
limb action
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810339894.5A
Other languages
Chinese (zh)
Other versions
CN110390705B (en)
Inventor
王丽婧
辛晓哲
范典
王君
李鲲鹏
彭飞
李健涛
刘建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN201810339894.5A priority Critical patent/CN110390705B/en
Publication of CN110390705A publication Critical patent/CN110390705A/en
Application granted granted Critical
Publication of CN110390705B publication Critical patent/CN110390705B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/08 — Learning methods
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 — Animation
    • G06T 13/20 — 3D [Three Dimensional] animation
    • G06T 13/40 — 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a method and device for generating a virtual image. To obtain a virtual image whose movements match those of a target object, a target image containing the target object is acquired using only an image capture device, action information characterizing the target object is detected from the target image, and the virtual image is generated according to that action information. Because the action information characterizes the movement of the target object, generating the virtual image from it ensures that the movements of the virtual image are consistent with those of the target object. The scheme for generating a virtual image therefore requires no complex hardware: with nothing more than an ordinary image capture device, a virtual image that moves in step with the target object can be generated, reducing the cost of generating virtual images.

Description

Method and device for generating a virtual image
Technical field
The present invention relates to the field of Internet technology, and in particular to a method and device for generating a virtual image.
Background technique
There are currently many practical application scenarios, such as motion-sensing games and animation production, in which the limb actions of a target object must be mapped onto a virtual image so that the movements of the virtual image are consistent with those of the target object. Before the limb actions of the target object can be mapped onto the virtual image, Kinect technology is generally used first to obtain information representing the limb actions of the target object, as follows: an infrared projector emits laser light, which passes through a grating in front of the projector lens and is projected uniformly into the measurement space; objects in the measurement space (including the target body) diffusely reflect the laser light, forming a random pattern of scattered speckles; an infrared camera then captures each scattered speckle in the measurement space, and the speckle data are processed to obtain the limb action information.
However, the inventors found during the implementation of the present invention that obtaining limb action information with Kinect technology requires many devices, such as an infrared projector, a grating, and an infrared camera; in addition, the measurement must be performed in a dedicated measurement space. Obtaining limb action information with Kinect technology is therefore costly, which means that mapping the limb actions of a target object onto a virtual image is costly.
Summary of the invention
The technical problem solved by the present invention is to provide a method and device for generating a virtual image, so as to reduce the cost of mapping the movements of a target object onto a virtual image.
To this end, the technical solution by which the present invention solves the technical problem is as follows:
A first aspect of the present invention provides a method for generating a virtual image, the method comprising:
obtaining a target image acquired by an image capture device;
detecting, from the target image, action information characterizing a target object;
generating a virtual image according to the action information, the movements of the virtual image being consistent with those of the target object.
Optionally, detecting, from the target image, action information characterizing the target object comprises:
detecting, from the target image, limb action information characterizing the target object.
Optionally, when the target image contains one target object, detecting, from the target image, limb action information characterizing the target object comprises:
processing the target image with a preset first model to identify the locations of preset joint points in the target image, and using those locations as the limb action information characterizing the target object; the preset first model is obtained with a preset convolutional neural network algorithm and characterizes the correspondence between an image and the locations of the preset joint points in the image.
Optionally, when the target image contains at least two target objects, detecting, from the target image, limb action information characterizing the target objects comprises:
processing the target image with a preset second model to identify the location of each preset joint point in the target image, the preset second model being obtained with a preset convolutional neural network algorithm and characterizing the correspondence between an image and the locations of the preset joint points in the image;
processing the target image with a preset third model to determine the affinity between the preset joint points in the target image, the preset third model being obtained with a part affinity field algorithm and characterizing the correspondence between an image and the affinities between the preset joint points in the image;
determining, according to the affinities between the preset joint points, the locations of the preset joint points belonging to each target object, as the limb action information characterizing the target objects.
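The grouping step above — assigning detected joint points to individual target objects according to their pairwise affinities — can be sketched as follows. The patent only states that affinities from the third model drive the assignment; the greedy matching strategy, the choice of two joint types, and all names below are illustrative assumptions, not the claimed implementation.

```python
# Minimal sketch: pair each detected neck with the hip it has the highest
# affinity to, yielding one skeleton per person. Affinity scores would come
# from the preset third model; here they are hand-written.

def group_joints(necks, hips, affinity):
    """Assign hips to necks by descending affinity.

    necks, hips: lists of (x, y) joint candidates, possibly from different people.
    affinity: dict mapping (neck_index, hip_index) -> score in [0, 1].
    Returns a list of (neck, hip) pairs, one skeleton per neck.
    """
    skeletons = []
    used_hips = set()
    for i, neck in enumerate(necks):
        # pick the unused hip with the strongest connection to this neck
        best = max(
            (j for j in range(len(hips)) if j not in used_hips),
            key=lambda j: affinity.get((i, j), 0.0),
        )
        used_hips.add(best)
        skeletons.append((neck, hips[best]))
    return skeletons


# Two people: neck 0 belongs with hip 1, neck 1 with hip 0.
necks = [(100, 50), (300, 60)]
hips = [(305, 200), (98, 210)]
affinity = {(0, 0): 0.1, (0, 1): 0.9, (1, 0): 0.8, (1, 1): 0.2}
print(group_joints(necks, hips, affinity))
```

A full implementation would repeat this matching for every connected pair of joint types in the skeleton and merge the resulting pairs into per-person joint sets.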
Optionally, generating a virtual image according to the action information, the movements of the virtual image being consistent with those of the target object, comprises:
obtaining a character corresponding to the target object;
determining, according to the limb action information of the target object, the limb model data of the character, and generating a virtual image consistent with the limb actions of the target object.
Optionally, determining the limb model data of the character according to the limb action information of the target object and generating the virtual image consistent with the limb actions of the target object comprises:
determining, according to the limb action information of the target object, the limb model data of the character corresponding to that limb action information from a preset virtual image library, the preset virtual image library containing correspondences between the limb action information of target objects and the limb model data of characters;
combining the determined limb model data of the character to generate the virtual image consistent with the limb actions of the target object.
Optionally, when the target image comprises several video images obtained from a target video, generating the virtual image according to the action information, with the movements of the virtual image consistent with those of the target object, comprises:
generating, according to the limb action information of the target object in the several video images, an animated video of the virtual image, in which the movements of the virtual image are consistent with those of the target object in the target video.
Optionally, detecting, from the target image, action information characterizing the target object comprises:
detecting, from the target image, facial expression information characterizing the target object.
A second aspect of the present invention provides a device for generating a virtual image, the device comprising:
an obtaining module, configured to obtain a target image acquired by an image capture device;
a detection module, configured to detect, from the target image, action information characterizing a target object;
a generation module, configured to generate a virtual image according to the action information, the movements of the virtual image being consistent with those of the target object.
Optionally, the detection module comprises:
a first detection unit, configured to detect, from the target image, limb action information characterizing the target object.
Optionally, when the target image contains one target object, the detection unit comprises:
a first identification subunit, configured to process the target image with a preset first model to identify the locations of preset joint points in the target image, and to use those locations as the limb action information characterizing the target object; the preset first model is obtained with a preset convolutional neural network algorithm and characterizes the correspondence between an image and the locations of the preset joint points in the image.
Optionally, when the target image contains at least two target objects, the detection unit comprises:
a second identification subunit, configured to process the target image with a preset second model to identify the location of each preset joint point in the target image, the preset second model being obtained with a preset convolutional neural network algorithm and characterizing the correspondence between an image and the locations of the preset joint points in the image;
a first determination unit, configured to process the target image with a preset third model to determine the affinity between the preset joint points in the target image, the preset third model being obtained with a part affinity field algorithm and characterizing the correspondence between an image and the affinities between the preset joint points in the image;
a second determination unit, configured to determine, according to the affinities between the preset joint points, the locations of the preset joint points belonging to each target object, as the limb action information characterizing the target objects.
Optionally, the generation module comprises:
an acquisition unit, configured to obtain the character corresponding to the target object;
a generation unit, configured to determine the limb model data of the character according to the limb action information of the target object, and to generate a virtual image consistent with the limb actions of the target object.
Optionally, the generation module further comprises:
a third determination unit, configured to determine, according to the limb action information of the target object, the limb model data of the character corresponding to that limb action information from a preset virtual image library, the preset virtual image library containing correspondences between the limb action information of target objects and the limb model data of characters;
a combination unit, configured to combine the determined limb model data of the character to generate the virtual image consistent with the limb actions of the target object.
Optionally, when the target image comprises several video images obtained from a target video, the generation module is specifically configured to:
generate, according to the limb action information of the target object in the several video images, an animated video of the virtual image, in which the movements of the virtual image are consistent with those of the target object in the target video.
Optionally, the detection module comprises:
a second detection unit, configured to detect, from the target image, facial expression information characterizing the target object.
A third aspect of the present invention provides an apparatus for generating a virtual image, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and include instructions for performing the method provided by the first aspect of the present invention.
A fourth aspect of the present invention provides a non-transitory computer-readable storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method provided by the first aspect of the present invention.
According to the above technical solution, the method has the following advantages:
To obtain a virtual image whose movements match those of the target object, a target image containing the target object is acquired using only an image capture device, action information characterizing the movement of the target object is detected from the target image, and a virtual image is generated according to that action information. Because the action information characterizes the movement of the target object, generating the virtual image from it ensures that the movements of the virtual image are consistent with those of the target object. The above scheme for generating a virtual image therefore needs no complex hardware: using only an ordinary image capture device, a virtual image that moves in step with the target object can be generated, reducing the cost of generating virtual images.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the method for generating a virtual image provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the limbs of a target object;
Fig. 3 is a schematic diagram of generating a virtual image for one target object according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of generating a virtual image for multiple target objects according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of the device for generating a virtual image provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the apparatus for generating a virtual image in an embodiment of the present invention.
Detailed description of the embodiments
To provide a low-cost implementation for generating a virtual image, embodiments of the present invention provide a method and device for generating a virtual image, described below with reference to the accompanying drawings.
In application scenarios such as motion-sensing games and animation production, a virtual image must be generated; that is, the movements of a target object are mapped onto a virtual image so that the movements of the virtual image match those of the target object. Before the movements of the target object can be mapped onto the virtual image, the action information of the target object must first be obtained; normally, Kinect technology involving multiple special devices is used to obtain information representing the movement of the target object.
However, obtaining the action information of a target object with Kinect technology requires many special devices, such as an infrared projector, a grating, and an infrared camera, and the measurement must also be performed in a dedicated measurement space. Obtaining action information with Kinect technology therefore involves numerous special devices and complicated operation, and because the special devices are expensive, the cost of generating a virtual image is high.
To solve the above problems, embodiments of the present invention provide a technical solution for generating a virtual image. A target image containing the target object is acquired using only an image capture device (such as an ordinary video camera, an ordinary still camera, or the camera on a mobile terminal); detecting the target image yields action information that characterizes the movement of the target object. Generating a virtual image from this action information ensures that the movements of the virtual image are consistent with those of the target object. Without any complex hardware, using only an ordinary image capture device, a virtual image that moves in step with the target object can be generated, reducing the cost of generating virtual images.
Illustrative methods
Fig. 1 is a flowchart of the method for generating a virtual image provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises:
Step 101: obtain a target image acquired by an image capture device.
An image capture device is any device that can collect images, for example a video camera, a still camera, or the camera carried on a terminal device such as a mobile phone or computer.
The target image described in the present invention refers to an image obtained by the image capture device alone, without the assistance of other special devices (such as an infrared projector or a grating): for example, an image from a video captured by a video camera, a photo taken by a camera, or a video image or photo captured by the camera on a terminal device. A person, animal, movable doll, or the like contained in the target image is the target object.
It should be understood that, in one case, the apparatus that generates the virtual image and the image capture device may be two independent devices that transfer images over an established communication connection; the connection may be wired (for example, over a signal transmission line) or wireless (for example, established via Wi-Fi or Bluetooth). In another case, the apparatus that generates the virtual image and the image capture device may be two different sub-devices integrated in the same device, transferring images over the device's built-in data transmission channel. A mobile phone, for example, contains both a camera with an image collection function and a processor with a virtual image generation function.
In a specific implementation, there are several possible ways of obtaining the target image in step 101.
In one implementation, the target image is an image acquired by the image capture device in real time.
In a specific implementation, the target image may be a single image. In one case, the image capture device sends a single captured image to the apparatus that generates the virtual image, and this image serves as the target image; in another case, the image capture device captures a video segment, and one video frame is extracted from the video and sent to the apparatus as the target image.
The target image may also be multiple images. In one case, the image capture device captures multiple images and sends them to the apparatus as the target image; in another case, the image capture device captures a video segment, and multiple video frames are extracted from the video and sent to the apparatus as the target image.
In another implementation, the target image is an image obtained from local storage.
Local storage refers to the memory in the apparatus that generates the virtual image. An image in local storage may have been sent there directly by the image capture device after acquisition, or it may have been sent to a network server after acquisition and later downloaded from the server into local storage as needed.
In a specific implementation, the target image may be a single image, in which case the apparatus selects one image as the target image from those stored in its local storage; alternatively, the target image may be multiple images, in which case the apparatus selects several stored images as the target image. When the target image is multiple images, the images may be consecutive in acquisition time or in storage location, or completely unrelated.
For example, suppose a gallery on the server of a live-streaming platform contains 100 images: image 1, image 2, ..., image 100. If the target image is a single image, image 52 may be selected from the 100 images in local storage as the target image. If the target image is 10 images, images 10, 11, ..., 20 may be selected in turn as the target image. If "select all" is chosen, so that the target image is all 100 images, every image in the gallery is selected in turn as the target image.
The following steps are described for the case where the target image is a single image. If the target image is multiple images, the operations performed to generate a virtual image for each of the images are identical to those performed when the target image is a single image; the following generation process is simply repeated, and the repetition is not described again in this embodiment.
Step 102: detect, from the target image, action information characterizing the target object.
Step 103: generate a virtual image according to the action information, the movements of the virtual image being consistent with those of the target object.
In some application scenarios, the movement of the target object may be a limb action, in which case detecting, from the target image, the action information characterizing the target object comprises:
detecting, from the target image, limb action information characterizing the target object.
The target image obtained in step 101 contains the target object. As shown in Fig. 2, the limbs of a target object include the hands, forearms, upper arms, torso, thighs, lower legs, and feet. A limb action of the target object is a movement formed through the coordinated activity of these limbs, such as waving, kicking, walking, running, or standing.
Limb action information refers to the data used to describe a limb action of the target object. For example, the limb action information may be the locations, in the target image, of preset joint points on the limbs of the target object; alternatively, it may be the relative positional relationships between the preset limbs of the target object, and so on.
As an example, when the limb action information is the locations of the preset joint points of the target object in the target image, suppose there are 12 preset joint points: left wrist, left elbow, left shoulder, right shoulder, right elbow, right wrist, left hip, left knee, left ankle, right hip, right knee, and right ankle. The limb action information of the target object is then the locations of these 12 preset joint points in the target image, that is, the 12 coordinate points in the target image corresponding to the 12 joint points.
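The 12-joint representation above can be written down as a simple data structure. The joint list matches the one in the text; the example coordinates are purely illustrative, since the patent only specifies that the limb action information consists of the image coordinates of the preset joint points.

```python
# The 12 preset joint points named in the embodiment, in a fixed order.
JOINTS = [
    "left_wrist", "left_elbow", "left_shoulder",
    "right_shoulder", "right_elbow", "right_wrist",
    "left_hip", "left_knee", "left_ankle",
    "right_hip", "right_knee", "right_ankle",
]

def make_pose(coords):
    """Map the 12 joint names to (x, y) image coordinates."""
    if len(coords) != len(JOINTS):
        raise ValueError("expected one coordinate per preset joint")
    return dict(zip(JOINTS, coords))

# Illustrative pose: 12 joints spread along a horizontal line.
pose = make_pose([(x * 10, 100) for x in range(12)])
print(pose["left_wrist"])   # (0, 100)
```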
As another example, when the limb action information is the relative positional relationships between the preset limbs of the target object, suppose there are 8 preset limbs: left forearm, left upper arm, right forearm, right upper arm, left thigh, left lower leg, right thigh, and right lower leg. The limb action information of the target object is then the relative positions between these 8 preset limbs, that is, the relative angle information between the 8 limbs.
In a specific implementation, the apparatus that generates the virtual image detects the limb action information of the target object in the obtained target image. If the limb action information is the locations, in the target image, of the preset joint points on the limbs of the target object, a two-dimensional coordinate system is established on the target image with some point of the image as the origin, and the limb action information of the target object in the target image consists of the two-dimensional coordinate points (X, Y) of all preset joint points in this coordinate system.
The limb action information embodies the limb action of the target object: each two-dimensional coordinate point corresponds to one preset joint point and characterizes the limb action, at that joint point, of the target object in the target image. For example, if in the obtained limb action information the 6 coordinate points corresponding to 6 preset joint points — left wrist, left elbow, left shoulder, right shoulder, right elbow, and right wrist — all have equal ordinate values, the limb action characterized by this limb action information is: both arms raised level.
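The "arms raised level" inference in this example can be sketched directly. A small tolerance is added because detected coordinates are rarely exactly equal; the tolerance value and the joint-name strings are assumptions, and the coordinate convention follows the two-dimensional image coordinate system described above.

```python
# The six arm joints whose equal ordinates indicate "both arms raised level".
ARM_JOINTS = ["left_wrist", "left_elbow", "left_shoulder",
              "right_shoulder", "right_elbow", "right_wrist"]

def arms_raised_flat(pose, tolerance=5):
    """True if all six arm joints lie on (nearly) the same horizontal line."""
    ys = [pose[j][1] for j in ARM_JOINTS]
    return max(ys) - min(ys) <= tolerance

# One pose with all arm joints at y = 120, one with the wrists dropped.
flat_pose = {j: (i * 40, 120) for i, j in enumerate(ARM_JOINTS)}
hanging_pose = dict(flat_pose, left_wrist=(0, 300), right_wrist=(200, 300))
print(arms_raised_flat(flat_pose), arms_raised_flat(hanging_pose))
```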
It should be understood that when what is detected from the target image is limb action information characterizing the target object, a virtual image can be generated according to that limb action information, with the limb actions of the virtual image consistent with those of the target object.
"Virtual" here means a reproduction and reconstruction of the real world. A virtual image does not exist in real life but can be represented through visual image symbols familiar to people. In most cases, a virtual image is a game character, such as Garen or Super Mario, or a two-dimensional cartoon character, such as a little bear, Doraemon, or Sailor Moon.
The virtual image in this embodiment refers to a virtual image generated according to the detected limb action information, whose limb actions are consistent with those of the target object in the target image.
The character corresponding to the virtual image may be chosen at random by the apparatus from numerous candidate characters; for example, Hello Kitty may be selected at random from a large set of cartoon characters as the character of the virtual image to be generated. Alternatively, the character may be selected according to the body characteristics of the target object characterized by the limb action information; for example, since models generally have slender legs, Sailor Moon may be selected on that basis as the character of the virtual image to be generated. The character may also be selected by the user according to personal preference or a specific need; for example, a user who loves Doraemon may set Doraemon as the character of the virtual image to be generated.
Like the limb action of the target object, the limb action of the virtual image may be represented by the locations of preset joint points on the character, the preset joint points of the virtual image being the same as those of the target object. Alternatively, the limb action of the virtual image may be represented by the relative positional relationships between the preset limbs on the character, the preset limbs of the virtual image being the same as those of the target object.
That the limb actions of the virtual image are consistent with those of the target object means that the position and angle of each limb of the virtual image, relative to the virtual image, match the position and angle of the corresponding limb of the target object relative to the target object. In other words, the direction information of each limb of the virtual image with respect to some standard reference direction (for example, the horizontal direction) is identical to the direction information of the corresponding limb of the target object with respect to that standard reference direction.
For example, let the standard reference direction be the horizontal direction. Suppose the limb action information of the target object detected in the target image includes the direction information of the right upper arm and right forearm as 90 degrees below the horizontal reference direction, indicating that the limb action of the target object's right arm is: hanging down. Then the limb action information corresponding to the right upper arm and right forearm of the virtual image is likewise 90 degrees below the horizontal reference direction, indicating that the limb action of the virtual image's right arm is likewise: hanging down. And so on: making the direction information of each limb of the virtual image identical to that of the corresponding limb of the target object makes the limb actions of the generated virtual image consistent with those of the target object.
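The direction-matching rule above can be sketched as follows: each limb's angle is measured against the horizontal reference direction, and the avatar's limb copies it. The use of `atan2` and the sign convention (image y grows downward, so a limb hanging straight down measures +90 degrees below horizontal) are assumptions; the patent only requires that corresponding limbs share the same direction information.

```python
import math

def limb_angle(proximal, distal):
    """Angle in degrees of a limb relative to the horizontal reference direction.

    proximal, distal: (x, y) joint coordinates in image space (y grows downward),
    e.g. shoulder -> elbow for the upper arm.
    """
    dx = distal[0] - proximal[0]
    dy = distal[1] - proximal[1]
    return math.degrees(math.atan2(dy, dx))

def mirror_to_avatar(subject_limbs):
    """Give every avatar limb the same direction information as the subject's limb."""
    return {name: limb_angle(p, d) for name, (p, d) in subject_limbs.items()}

# Right upper arm hanging straight down: 90 degrees below the horizontal.
subject = {"right_upper_arm": ((200, 100), (200, 160))}
print(mirror_to_avatar(subject))  # {'right_upper_arm': 90.0}
```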
It can be understood that on platforms such as network live broadcast and psychological counseling, if a user does not wish to reveal his or her face to the other party, the method provided in this embodiment can be used to generate a virtual image that faces the other party in place of the user. The limb action of the generated virtual image is kept consistent with the user's limb action in real time, so the other party can judge the user's limb action through the virtual image, reason about that limb action and its changes, and thereby understand the user's behavior.
With the method provided in this embodiment, only an image capture device is needed to provide the target image; the limb action of the target object in the target image can then be mapped onto the virtual image, generating a virtual image whose limb action is consistent with the target object's. Compared with using special equipment to realize the mapping of the target object's limb action onto the virtual image, the method for generating a virtual image provided in this embodiment reduces the cost of generation and makes generating the virtual image simpler and more convenient.
In specific implementation, the device generating the virtual image may detect the limb action information of the target object from the target image through a preset model. Depending on the number of target objects in the target image, the preset model used by the device to detect the limb action information differs.
As an example, when the target image includes one target object, the device generating the virtual image uses a preset first model to detect the limb action information of the target object.
The preset first model is a model preset in the device generating the virtual image. It uses a convolutional neural network algorithm; its input is the target image, and its output is the location information of the preset joint points of the target object in the target image. The preset first model therefore characterizes the corresponding relationship between a target image and the location information of the preset joint points in that target image.
The preset first model is obtained by training a preset convolutional neural network on a large number of training samples. That is, given a convolutional neural network and a massive number of first training images, each first training image includes only one person, and the location information of that person's preset joint points is marked as the known location information. The specific training process is: take each first training image among all the first training images as the input of the preset convolutional neural network and the known location information of the preset joint points of the corresponding person as its output, and continuously adjust the preset convolutional neural network; the convolutional neural network obtained when training is complete is the preset first model.
When the device generating the virtual image includes the preset first model and the acquired target image includes one target object, detecting the limb action information characterizing the target object from the target image may include:
Step 102A: process the target image using the preset first model, identify the location information of the preset joint points in the target image, and take the location information of the preset joint points as the limb action information characterizing the target object.
In specific implementation, the device generating the virtual image inputs the target image into the trained preset first model, which processes the target image and outputs the location information of the preset joint points of the target object in the target image; this output location information is the detected limb action information of the target object.
It can be understood that when the preset first model is used to detect the limb action information of the target object, the preset joint points obtained for the target object are of the same types as the preset joint points in the preset first model. For example, if the result output by the preset first model includes the location information of 12 preset joint points — left wrist, left elbow, left shoulder, right shoulder, right elbow, right wrist, left hip, left knee, left ankle, right hip, right knee, right ankle — then the detected limb action information of the target object in the target image likewise includes only the location information of those 12 preset joint points.
As an example, assume the target image is image 1, which includes one target object A, and the preset first model is the trained model 1. Input image 1 into model 1, and the location information of A's 12 preset joint points in image 1 is output: (X1, Y1), (X2, Y2), ..., (X12, Y12). The location information of these 12 preset joint points is the limb action information of A detected from image 1; the limb action of A can be determined from these 12 coordinate values.
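The flat list of 12 coordinate pairs output by the first model only becomes usable once each pair is associated with its joint type. A sketch under the assumption of the 12-joint layout listed above (the joint ordering and the sample coordinates are illustrative):

```python
JOINT_NAMES = [
    "left_wrist", "left_elbow", "left_shoulder", "right_shoulder",
    "right_elbow", "right_wrist", "left_hip", "left_knee",
    "left_ankle", "right_hip", "right_knee", "right_ankle",
]

def name_joints(coords):
    """Map the model's flat output [(X1, Y1), ..., (X12, Y12)]
    to a {joint name: (x, y)} dictionary."""
    if len(coords) != len(JOINT_NAMES):
        raise ValueError("expected one coordinate pair per preset joint")
    return dict(zip(JOINT_NAMES, coords))

# Hypothetical output of model 1 for image 1.
detected = name_joints([(10 * k, 20 * k) for k in range(1, 13)])
print(detected["right_wrist"])  # (60, 120)
```

The resulting dictionary is one convenient form for the limb action information that the later steps consume.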
If the target image includes only one target object, the location information of the preset joint points of that target object in the target image can be determined through the preset first model as the detected limb action information of the target object. This limb action information characterizes the limb action of the target object and prepares for the subsequent generation of the virtual image.
As another example, when the target image includes at least two target objects, the device generating the virtual image uses a preset second model together with a preset third model to detect the limb action information of the target objects.
The preset second model is a model preset in the device generating the virtual image. It uses a convolutional neural network algorithm; its input is the target image, and its output is the location information of all the preset joint points of the multiple target objects in the target image. The preset second model therefore characterizes the corresponding relationship between a target image and the location information of the preset joint points of the target objects in that target image.
Unlike the training samples used to train the preset first model, the second training images used to train the preset second model each include multiple training objects, and the location information of the preset joint points marked in a second training image is that of all the training objects in it. The training process of the preset second model, however, is essentially the same as that of the preset first model described above and is not repeated here. A training object may be a person, an animal, a movable doll, or the like.
When the device generating the virtual image includes the preset second model and the acquired target image includes multiple target objects, step 102 may include:
Step 102B1: process the target image using the preset second model and identify the location information of each preset joint point in the target image. The preset second model is obtained using a preset convolutional neural network algorithm and characterizes the corresponding relationship between an image and the location information of the preset joint points in the image.
In specific implementation, the device generating the virtual image inputs the target image into the trained preset second model, which processes the target image and outputs the location information of the preset joint points of the target objects in the target image.
It can be understood that, assuming each target object has n preset joint points in the preset second model, if the target image includes m (m > 1) target objects, then the preset second model obtains the location information of n*m preset joint points of all the target objects in the target image, and the n preset joint points on each target object are joint points of the same types.
As an example, assume the target image is image 2, which includes target objects A and B, and the preset second model is the trained model 2. Input image 2 into model 2, and the location information of the 24 preset joint points of A and B in image 2 is output: (X1, Y1), (X2, Y2), ..., (X24, Y24).
Clearly, since the target image includes multiple target objects, the location information of the preset joint points obtained through the preset second model in step 102B1 is a set of coordinate values of joint points belonging to multiple target objects, and it cannot be determined from these coordinate values alone which coordinate value belongs to which target object's preset joint point; that is, the limb action information characterizing each of the multiple target objects cannot yet be determined.
At this point, the device generating the virtual image also needs to use a preset third model to classify the obtained location information of all the preset joint points and identify the location information of the preset joint points belonging to the same target object.
The preset third model is a model preset in the device generating the virtual image. It uses a part affinity field algorithm; its input is the target image, and its output is the affinities between the preset joint points in the target image. The preset third model therefore characterizes the corresponding relationship between a target image and the affinities between the preset joint points of the target objects in that target image.
The affinity between preset joint points characterizes how closely two joint points are associated; in general, its value lies between 0 and 1. The magnitude of an affinity indicates how likely it is that the two corresponding preset joint points belong to the same target object.
The preset third model is obtained by training a preset part affinity field algorithm on a large number of training samples. That is, given a part affinity field algorithm and a massive number of second training images, each second training image includes multiple persons, and the affinities between the preset joint points of the training objects are marked as the known affinities between the preset joint points. The specific training process is: take each second training image among all the second training images as the input of the preset part affinity field algorithm and the known affinities between the preset joint points of the corresponding training objects as its output, and continuously adjust the preset part affinity field algorithm; the part affinity field algorithm obtained when training is complete is the preset third model.
When the device generating the virtual image includes the preset third model and the acquired target image includes multiple target objects, after step 102B1 the method may include:
Step 102B2: process the target image using the preset third model and determine the affinities between the preset joint points in the target image. The preset third model is obtained using the part affinity field algorithm and characterizes the corresponding relationship between an image and the affinities between the preset joint points in the image.
In specific implementation, the device generating the virtual image inputs the target image into the trained preset third model, which processes the target image and outputs the affinities between the preset joint points of all the target objects in the target image, indicating how likely it is that each preset joint point belongs to the same target object as each other preset joint point.
It can be understood that when the preset third model is used to detect the affinities between the preset joint points of the target objects, for each preset joint point in the target image, the affinities between it and all the other preset joint points are obtained; that is, the number of affinities obtained for each preset joint point is one less than the total number of preset joint points.
As an example, assume the target image is image 2, which includes target objects A and B, and the preset third model is the trained model 3. Input image 2 into model 3, and the affinities a_ij between the 24 preset joint points of A and B in image 2 are output, where i = 1, 2, ..., 24, j = 1, 2, ..., 24, and a_ij denotes the affinity between preset joint point i and preset joint point j.
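With m = 2 target objects of n = 12 preset joints each, the third model's output can be held in a symmetric 24 x 24 table, and each joint then carries 24 - 1 = 23 affinity values. A toy sketch (the sparse scores are illustrative stand-ins for real part-affinity-field output, not values from the embodiment):

```python
def make_affinity_lookup(pair_scores, total_joints):
    """Build a symmetric affinity table a[i][j] from a sparse list of
    (i, j, score) entries; unlisted pairs default to 0.0."""
    a = [[0.0] * total_joints for _ in range(total_joints)]
    for i, j, score in pair_scores:
        a[i][j] = a[j][i] = score  # affinity is mutual
    return a

# 24 preset joints in total (objects A and B, 12 each).
a = make_affinity_lookup([(0, 1, 0.9), (0, 13, 0.1)], 24)
# Each joint has one affinity to each of the other 23 joints.
print(len(a[0]) - 1, a[1][0])  # 23 0.9
```

The symmetric table makes the later grouping step a simple lookup: a high a[i][j] argues for putting joints i and j on the same target object.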
It should be noted that step 102B2 may also be executed before step 102B1, or simultaneously with it, without affecting the realization of this embodiment.
Once all the preset joint points corresponding to all the target objects in the target image have been identified using the preset second model, and the affinities between all the preset joint points of the target objects in the target image have been determined using the preset third model, the location information of the joint points of each target object in the target image can be determined from these two sets of obtained data, as the detected limb action information of the target objects. The implementation therefore further includes:
Step 102B3: according to the affinities between the preset joint points, determine the location information of the preset joint points belonging to each target object as the limb action information characterizing that target object.
In specific implementation, using the results obtained by executing steps 102B1 and 102B2, determine which of all the preset joint points of the target objects in the target image belong to each target object; that is, determine the location information of the preset joint points belonging to each target object. The location information of each target object's preset joint points then characterizes that target object's limb action information.
As an example, all the preset joint points may be constructed into a connection graph, and the location information of the preset joint points belonging to the same target object may be determined by maximum-weight bipartite matching.
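The matching can be solved one limb type at a time: for each pair of connected joint types (for example, all detected right elbows versus all detected right wrists), pick the pairing with the highest total affinity. A brute-force sketch, adequate for the handful of candidates per joint type in one image (the affinity values below are illustrative):

```python
from itertools import permutations

def best_matching(affinity):
    """Maximum-weight bipartite matching by exhaustive search.
    affinity[i][j] is the score that elbow i and wrist j belong to
    the same target object. Returns the wrist index chosen per elbow."""
    n = len(affinity)
    best_score, best_assignment = float("-inf"), None
    for perm in permutations(range(n)):
        score = sum(affinity[i][perm[i]] for i in range(n))
        if score > best_score:
            best_score, best_assignment = score, perm
    return list(best_assignment)

# Two target objects: affinities between each right elbow and right wrist.
affinity = [
    [0.9, 0.2],  # elbow 0 pairs strongly with wrist 0
    [0.1, 0.8],  # elbow 1 pairs strongly with wrist 1
]
print(best_matching(affinity))  # [0, 1]
```

A production system would use a polynomial-time assignment solver (the Hungarian algorithm) instead of exhaustive search, but the objective being maximized is the same.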
It can be seen that when the target image includes multiple target objects, the limb action of each target object in the target image can be determined through the preset second model and the preset third model, by combining the output results of the two models.
Whether the target image includes one target object or multiple target objects, the preset models in the device generating the virtual image can be used to detect the limb action information characterizing the target objects in the target image, preparing for the subsequent generation of the virtual image.
After detecting the limb action information of the target objects, the device generating the virtual image can execute step 103: generate the virtual image according to the detected limb action information. In specific implementation, step 103 may specifically include:
obtaining the character corresponding to the target object;
determining the limb model data of the character according to the limb action information of the target object, and generating a virtual image consistent with the limb action of the target object.
The limb model of a character refers to the minimum limb structure model capable of representing the character's limb action; it consists of two preset joint points having a direct connection relationship and the line segment connecting them. For example, if the character corresponding to the target object is Little Bear, the limb models of that character are: left forearm, left upper arm, right forearm, right upper arm, left thigh, left lower leg, right thigh, right lower leg. Here, the right forearm of Little Bear consists of the preset joint point corresponding to Little Bear's right wrist, the preset joint point corresponding to its right elbow, and the line between these two preset joint points.
The device generating the virtual image can combine the limb models of the character in different positions to generate virtual images with different limb actions. For example, if the right forearm and right upper arm of Little Bear point to the right and close to the horizontal direction, it can be determined that in the limb action of the generated Little Bear, the right arm is raised to the horizontal.
In some possible implementations, determining the limb model data of the character according to the limb action information of the target object and generating a virtual image consistent with the limb action of the target object may specifically be:
according to the limb action information of the target object, determining from a preset virtual image library the limb model data of the character corresponding to that limb action information, where the preset virtual image library includes the corresponding relationship between the limb action information of target objects and the limb model data of characters;
combining the determined limb model data of the character to generate a virtual image consistent with the limb action of the target object.
It can be understood that a virtual image library may be pre-established to store the corresponding relationships between the limb action information of persons and the limb model data of each character. That is, the preset virtual image library includes multiple groups of the following corresponding relationship: limb action information, a character, and the limb model data of that character corresponding to the limb action information. For any one character, there is therefore limb model data at various angles, corresponding to a variety of different limb action information. It can also be understood that within the same corresponding relationship, the limb types of the target object and of the character are the same: for example, both are arms, or both are legs.
For example, the preset virtual image library includes correspondences between the data of a person's arm under different actions and the arm model data of Sailor Moon at various angles, as well as correspondences between the data of a person's leg under different actions and the leg model data of Sailor Moon at various angles. Likewise, the preset virtual image library also includes correspondences between the data of a person's arm under different actions and the arm model data of Winnie the Pooh at various angles, and between the data of a person's leg under different actions and the leg model data of Winnie the Pooh at various angles.
In specific implementation, according to the limb action information of the target object: first, look up the first data set relevant to the selected character in the pre-established virtual image library; second, for each piece of limb action information among the target object's limb action information, determine from the first data set the second data set relevant to the limb corresponding to that limb action information; third, from the second data set, according to the corresponding relationship between the limb action information of the target object and the limb model data of the character, find the limb model data of the character corresponding to the limb action information of the target object (for example, with a consistent angle). The second and third steps are performed for each limb of the virtual image to determine the limb model data corresponding to each limb; finally, all the determined limb model data are combined to generate a virtual image consistent with the limb action of the target object.
As an example, assume the selected character is Sailor Moon and the obtained limb action information of the target object is: standing at attention. First, obtain the first data relevant to Sailor Moon from the virtual image library. For the left arm, find the second data relevant to the left arm within the obtained first data, then determine within it the limb model data corresponding to the left arm's vertically downward posture. Similarly, for the right arm, find the second data relevant to the right arm within the obtained first data, then determine within it the limb model data corresponding to the right arm's vertically downward posture. And so on, determine the limb model data of all of Sailor Moon's limbs consistent with the limb action of the target object. Finally, combine all the determined limb model data to generate a Sailor Moon standing at attention.
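The three-step lookup described above — character first, then limb, then the entry matching the limb action — can be sketched as a nested dictionary. All keys and model-data strings below are hypothetical placeholders, not the embodiment's actual library contents:

```python
# Hypothetical virtual image library: character -> limb -> action -> model data.
VIRTUAL_IMAGE_LIBRARY = {
    "sailor_moon": {
        "left_arm": {"vertical_down": "sm_left_arm_down.model"},
        "right_arm": {"vertical_down": "sm_right_arm_down.model"},
    },
}

def lookup_limb_models(character, limb_actions):
    """limb_actions: {limb name: action label} detected for the target
    object. Returns the limb model data needed to assemble the avatar."""
    first_data = VIRTUAL_IMAGE_LIBRARY[character]   # step 1: character data set
    models = {}
    for limb, action in limb_actions.items():
        second_data = first_data[limb]              # step 2: limb data set
        models[limb] = second_data[action]          # step 3: matching pose entry
    return models

pose = {"left_arm": "vertical_down", "right_arm": "vertical_down"}
print(lookup_limb_models("sailor_moon", pose)["left_arm"])  # sm_left_arm_down.model
```

Combining the returned limb model data per limb then yields the assembled virtual image, as in the Sailor Moon example above.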
In other possible implementations, in one case, when the target image includes one target object, step 103 specifically includes: first, obtaining the character corresponding to the target object; second, determining the limb model data of the character according to the detected limb action information of the target object, that is, the location information of the target object's preset joint points; third, determining the virtual image generated according to the limb model data as the generated virtual image consistent with the target object's limb action.
For example, as shown in Fig. 3, assume the character set for target object A is Little Bear. First, obtain the limb models of Little Bear: left forearm, left upper arm, right forearm, right upper arm, left thigh, left lower leg, right thigh, right lower leg. According to the detected limb action information of target object A in the target image — A1(X1, Y1), A2(X2, Y2), ..., A12(X12, Y12) — determine that A's right arm is raised to the upper right at 15 degrees from the horizontal direction, that A's left arm hangs down to the lower left at 45 degrees from the horizontal direction, and so on. From this, the data of Little Bear's corresponding left and right arms can be determined — that is, Little Bear's right arm is raised to the upper right at 15 degrees from the horizontal, and its left arm hangs down to the lower left at 45 degrees from the horizontal — along with Little Bear's other limb model data. As shown in the image on the right of Fig. 3, a Little Bear consistent with A's limb action is generated according to all the determined limb model data of the character.
In another case, when the target image includes multiple target objects, the specific implementation process of step 103 is: first, obtain the limb models of the characters corresponding to each of the n target objects; second, according to the limb action information of the first target object among the detected limb action information, determine the limb model data of the character corresponding to the first target object; third, determine the virtual image generated according to that limb model data as the first virtual image, consistent with the limb action of the first target object. Similarly, perform the operations of the second and third steps for each target object until the virtual images corresponding to all n target objects have been determined.
For example, as shown in Fig. 4, target image 2 includes target objects A and B. Assume the preset character corresponding to A is Little Bear and the character corresponding to B is Sailor Moon. First, obtain the limb models of Little Bear and Sailor Moon, which include: left forearm, left upper arm, right forearm, right upper arm, left thigh, left lower leg, right thigh, right lower leg. According to A's detected limb action information B1(X1, Y1), B2(X2, Y2), ..., B12(X12, Y12), using the implementation shown in Fig. 3, determine the location information of Little Bear's limb models and generate a Little Bear consistent with A's limb action. Then, according to B's detected limb action information C1(Z1, W1), C2(Z2, W2), ..., C12(Z12, W12), determine Sailor Moon's limb model data and generate a Sailor Moon consistent with B's limb action. Finally, combine the generated Sailor Moon and Little Bear to determine the virtual image generated according to target image 2.
In addition, if the proportional relationships between the limbs of the set character are consistent with the proportional relationships between the corresponding limbs of the target object, generating the virtual image according to the detected limb action information in step 103 may specifically include: forming a first triangle from three of the target object's preset joint points that are not on the same line, and forming a second triangle from the corresponding three preset joint points of the character corresponding to the virtual image. Since the first triangle and the second triangle are similar and the similarity ratio is known, the location information of the other two of the character's three preset joint points can be determined from the known location information of one of them. In the same way, the location information of all the preset joint points of the character corresponding to the virtual image can be determined, and a virtual image consistent with the limb action of the target object generated.
As an example, assume the location information of three of the target object's preset joint points is: A1(25, 185), B1(50, 180), C1(70, 190), where A1 -> B1 -> C1 connected in sequence form the right arm of the target object. The character set by the user according to preference is Little Bear, and the three joint points corresponding to Little Bear's right arm are A2, B2, C2, where the coordinates of A2 are (25, 185) and the known upper-limb ratio of the target object to Little Bear is 5:1. Since triangle A1B1C1 and triangle A2B2C2 are similar with similarity ratio 5, the coordinates of B2 can be calculated as (30, 184) and the coordinates of C2 as (34, 186). And so on, the location information of all the preset joint points of the virtual image can be determined, and the virtual image generated.
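The triangle-similarity computation in the example above can be reproduced directly: with similarity ratio 5 and the shared anchor joint A, each remaining joint of the character lies at one fifth of the target object's offset from A. A minimal sketch:

```python
def scale_joints(anchor, target_joints, ratio):
    """Map the target object's joints onto the character by similar
    triangles: keep the anchor joint fixed and scale every offset
    from it by 1/ratio."""
    ax, ay = anchor
    return [
        (ax + (x - ax) / ratio, ay + (y - ay) / ratio)
        for x, y in target_joints
    ]

# Right arm of the target object: A1 -> B1 -> C1, with ratio 5:1.
a1 = (25, 185)
b2, c2 = scale_joints(a1, [(50, 180), (70, 190)], 5)
print(b2, c2)  # (30.0, 184.0) (34.0, 186.0)
```

This matches the B2 and C2 coordinates derived in the example; applying the same scaling to every limb yields all of the virtual image's preset joint points.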
It can be understood that when the target image includes multiple target objects, the above implementation can still be used to generate the virtual image; that is, the above operations are performed for each character in the virtual image so that each character is consistent with the limb action of its corresponding target object. This is not repeated here.
It should be noted that the virtual image generated in this embodiment may be the virtual image itself, whose limb action is consistent with the target object's, or it may be an image that includes the virtual image. In the image including the virtual image, the background may be set by the user, may be the default background of the device generating the virtual image, or may carry over the background of the target image; this embodiment imposes no specific limitation.
The above description is based on the scene of acquiring a single target image. In practical applications, there are certainly cases in which the acquired target images are several video images obtained from a target video. When the target images are several video images obtained from a target video, the operation of step 102 needs to be performed in turn on all the video images in the target video; that is, the limb action information of the target objects is detected separately for each video image.
After all the video images in the target video have been detected, generating the virtual image according to the action information, with the action of the virtual image kept consistent with that of the target object, may specifically include:
generating an animation video of the virtual image according to the limb action information of the target objects in the several video images, where the limb action of the virtual image in the animation video is consistent with that of the target objects in the target video.
In specific implementation, the process of generating the animation video of the virtual image is: first, for each video image in the target video, determine the limb action information of the target object detected in that video image; second, according to the limb action information of the target object in the first determined video image, generate the first animation image, including the virtual image, corresponding to that video image, where the limb action of the virtual image in the first animation image is consistent with that of the target object in the first video image. Similarly, perform the operation of the second step for each determined video image, generating corresponding animation images for all the video images in the target video. Third, combine the animation images generated in the second step into an animation video according to the time order of their corresponding video images in the target video; this animation video is the animation video including the virtual image generated from the target video.
It can be understood that the limb action of the virtual image in each animation image of the generated animation video is consistent with the limb action of the target object in the video image corresponding to that animation image.
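The three-step video procedure above reduces to a per-frame pipeline that preserves the time order of the frames. A sketch in which trivial stand-in functions replace the detection models and the avatar renderer described earlier (all names are illustrative):

```python
def detect_limb_action(frame):
    """Stand-in for step 102: a real system would return joint
    locations from the preset models; here the frame itself stands in."""
    return {"pose": frame}

def render_avatar(action_info):
    """Stand-in for step 103: build the animation image whose limb
    action is consistent with the detected action information."""
    return ("avatar", action_info["pose"])

def video_to_animation(video_frames):
    # Processing frames in order preserves the target video's timeline.
    return [render_avatar(detect_limb_action(f)) for f in video_frames]

animation = video_to_animation(["frame0", "frame1", "frame2"])
print(animation[1])  # ('avatar', 'frame1')
```

Because each animation image is generated from exactly one video image and combined in the original order, the resulting sequence is the animation video described above.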
Using the method provided in this embodiment, the server can map the limb action of the target object in each video image of the target video onto the virtual image, obtain animation images including virtual images whose limb actions are consistent with the target objects', and thereby realize the mapping from target video to animation video. It can be seen that, compared with using special equipment to realize the mapping of the target object's limb action onto the virtual image and then generating an animation video, the method for generating an animation video provided in this embodiment reduces the cost of generation and makes generating the animation video simpler and more convenient.
In other application scenarios, the action of the target object may also be a facial expression, and the facial expression of the generated virtual image is likewise required to be consistent with that of the target object.
For example, in some live-broadcast scenes, an anchor uses the generated virtual image to face the audience in place of the anchor. It is not enough to keep only the limb action of the virtual image consistent with the anchor's; the facial expression of the virtual image must also be consistent with the anchor's facial expression. In this way, the audience can perceive the anchor's mood and expression in real time through the generated virtual image, improving the viewing experience of the audience.
In some implementations, detecting the action information characterizing the target object from the target image includes:
detecting, from the target image, the facial expression information characterizing the target object.
The facial expression information may take many forms. In some examples, the facial expression information may be status data of the target object's face, for example, eyes: closed, narrowed, wide open; mouth: pouting, pursed, open, corners raised. In other examples, the facial expression information may be several feature points of the face obtained using a face detection algorithm, together with the location information of those feature points. In still other examples, the facial expression information may be the location information of the facial feature points together with the corresponding status data, for example, eyes: [(I1, J1), (I2, J2), (I3, J3), (I4, J4), closed], where (I1, J1), (I2, J2), (I3, J3), (I4, J4) are the position coordinates of four feature points of the eyes and "closed" denotes the state of the eyes.
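The third representation — feature-point coordinates plus a state label per facial part — can be sketched as a small data structure. The coordinates and state labels below are purely illustrative, not values from the embodiment:

```python
from dataclasses import dataclass, field

@dataclass
class FacePartExpression:
    """Feature-point locations of one facial part plus its state label,
    e.g. eyes: [(I1, J1), ..., (I4, J4)] with state 'closed'."""
    points: list = field(default_factory=list)
    state: str = "unknown"

eyes = FacePartExpression(
    points=[(31, 40), (35, 39), (39, 40), (35, 41)],
    state="closed",
)
expression = {"eyes": eyes, "mouth": FacePartExpression(state="pursed")}
print(expression["eyes"].state)  # closed
```

Keeping both the geometry and the state label lets downstream code either reproduce the feature-point layout on the virtual image or simply switch the avatar's facial part to the labeled state.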
As an example, the device for generating the virtual image may detect and recognize the face of the object in the target image through a preset machine learning model, to obtain the facial expression information of the object.
The preset machine learning model is a model capable of determining, from an image, the facial expression information of the person in the image. The input of the preset machine learning model is an image, and its output is the facial expression information of the person in the image. The facial expression information can be used to characterize the facial expression of the person, such as smiling, laughing, or sadness.
The preset machine learning model is set in the device for generating the virtual image in advance, and is a model obtained by training on a massive number of training samples. The training samples are a large number of third training images and, for each third training image, the facial expression information of the corresponding object. That is, an initial machine learning model is known, together with a massive number of third training images and the known facial expression information of the object in each third training image. The specific training process is as follows: each third training image among all the third training images is taken in turn as the input of the initial machine learning model, with the known facial expression information of the object corresponding to that third training image as the output; the preset initial machine learning model is continuously adjusted accordingly, and the machine learning model obtained when training is completed is the preset machine learning model.
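The patent specifies only the train-then-predict shape (training images with known expression information in, a fitted model out), not a concrete model. As an illustrative stand-in under that assumption, a tiny nearest-centroid classifier over landmark-style feature vectors shows the same shape:

```python
import math

def train_expression_model(samples):
    """Fit one centroid per expression label.

    samples: list of (feature_vector, label) pairs, where the feature
    vector stands in for flattened face landmarks. A deliberately tiny
    stand-in for the CNN training the text describes.
    """
    by_label = {}
    for vec, label in samples:
        by_label.setdefault(label, []).append(vec)
    # Centroid = per-dimension mean of that label's feature vectors.
    return {label: [sum(c) / len(vecs) for c in zip(*vecs)]
            for label, vecs in by_label.items()}

def predict_expression(model, vec):
    """Return the label whose centroid is nearest to vec."""
    return min(model, key=lambda lbl: math.dist(model[lbl], vec))

# Toy "third training images": 2-D features with known expression labels.
samples = [([0.0, 0.0], "neutral"), ([0.1, 0.0], "neutral"),
           ([1.0, 1.0], "smile"), ([0.9, 1.1], "smile")]
model = train_expression_model(samples)
print(predict_expression(model, [0.95, 1.0]))  # smile
```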
In a specific implementation, the target image is input to the trained preset machine learning model, which then outputs the facial expression information of the object in the target image.
For example, for target image 1, assume the preset machine learning model is model 4. Then target image 1 is input into model 4, which outputs the facial expression information of object A in target image 1: (I1, J1), (I2, J2), ..., (I50, J50). It can be understood that the coordinate points of the above facial expression information are distributed over the facial features of object A in target image 1. The current facial expression of object A can be known by analyzing the facial expression information obtained above.
After the facial expression information of the object in the target image is obtained, the virtual image is processed according to the facial expression information, so that the facial expression of the virtual image is consistent with that of the object.
A facial expression refers to changes in the state of facial features such as the eyes and mouth; it is a highly important means of non-verbal communication for conveying various emotional states.
That the facial expression of the virtual image is consistent with the facial expression of the object means that the mood shown by the face of the virtual image is the same as the facial expression of the object. For example, if the facial expression of the object in the target image is a smile and the facial expression of the generated virtual image is also a smile, then the facial expression of the virtual image is consistent with that of the object.
In a specific implementation, according to the detected facial expression information of the object, this embodiment may further process the generated virtual image so that the facial expression of the virtual image is consistent with the facial expression of the object.
For example, for the generated virtual image 1, the current facial expression information detected for the corresponding object A is (I1, J1), (I2, J2), ..., (I50, J50), and analysis of this detected facial expression information shows that the current facial expression of object A is sadness. Then, according to the detected facial expression information (I1, J1), (I2, J2), ..., (I50, J50), virtual image 1 is further processed to obtain virtual image 2, whose facial expression is sadness, consistent with the current facial expression of object A.
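One minimal way to "further process" the virtual image with the detected expression information, assuming both faces are described by simple axis-aligned bounding boxes (a real system would drive a rigged face model instead), is to remap the detected landmarks into the avatar's face region:

```python
def map_landmarks_to_avatar(landmarks, src_box, dst_box):
    """Map detected face landmarks into the avatar's face region.

    A simple scale-and-translate sketch: src_box and dst_box are
    (x, y, w, h) bounding boxes of the object's face and the avatar's
    face, respectively. All names here are hypothetical.
    """
    sx, sy, sw, sh = src_box
    dx, dy, dw, dh = dst_box
    return [(dx + (x - sx) * dw / sw, dy + (y - sy) * dh / sh)
            for x, y in landmarks]

# Object's mouth-corner landmarks in the camera frame ...
mouth = [(100, 200), (140, 200)]
# ... mapped into the avatar's (smaller) face box.
avatar_mouth = map_landmarks_to_avatar(mouth, src_box=(80, 160, 80, 80),
                                       dst_box=(0, 0, 40, 40))
print(avatar_mouth)  # [(10.0, 20.0), (30.0, 20.0)]
```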
It should be noted that step 104 may be executed at any moment between step 101 and step 105. Step 105 may be executed after step 103, or at the same time as step 103. Exactly when step 104 and step 105 are executed is not specifically limited in this embodiment.
It can be understood that step 104 and step 105 may be executed for one target image, for multiple target images, or for several video images in a target video; likewise, each target image may contain one object or multiple objects. How many target images there are, and how many objects each target image contains, does not affect the implementation of this embodiment.
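For the video case mentioned above, the per-frame processing can be sketched as a loop that turns each frame's detected action information into one avatar frame; `detect` and `render` below are hypothetical stand-ins for the detection and generation steps, not part of the patent:

```python
def generate_animated_video(frames, detect_action, render_avatar):
    """Generate avatar frames from video frames, one per input frame.

    detect_action and render_avatar stand in for the detection and
    generation steps described in the text.
    """
    return [render_avatar(detect_action(f)) for f in frames]

# Toy stand-ins: the "action information" is just a pose label,
# and the rendered avatar frame simply echoes it.
frames = ["frame_wave", "frame_jump"]
detect = lambda f: f.split("_")[1]          # -> "wave", "jump"
render = lambda action: f"avatar_{action}"  # -> "avatar_wave", ...
print(generate_animated_video(frames, detect, render))
# ['avatar_wave', 'avatar_jump']
```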
In addition, generating a virtual image whose facial expression is consistent with that of the object may also be implemented in other ways. As an example, the device for generating the virtual image may pre-establish a virtual image expression library for storing, in advance, the face data of each image role under different expression types.
It can be understood that, to keep the facial expression of the generated virtual image consistent with the facial expression of the object, the expression type (such as smiling or surprised) characterizing the facial expression of the object may first be detected from the target image; then, third data relevant to the selected image role is looked up in the pre-established virtual image expression library; next, from the third data found, the face data under the expression type corresponding to the detected expression type is determined; finally, the determined face data is used as the face data for generating the virtual image, thereby generating the virtual image consistent with the facial expression of the object.
As an example, assume the selected image role is Sailor Moon. First, the expression type of the object is detected, specifically: smiling. Then, data relevant to Sailor Moon is obtained from the virtual image expression library. Next, within the obtained data relevant to Sailor Moon, the face data corresponding to "smiling" is looked up. Finally, a "smiling" Sailor Moon is generated using the determined face data; that is, a virtual image consistent with the facial expression of the object is generated.
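The expression-library lookup described above can be sketched as a table keyed by image role and expression type; the keys and face data below are illustrative only:

```python
# Hypothetical virtual image expression library: face data stored per
# (image role, expression type), as the example describes.
expression_library = {
    ("Sailor Moon", "smile"):     {"mouth": "corners_up", "eyes": "open"},
    ("Sailor Moon", "surprised"): {"mouth": "open", "eyes": "wide"},
}

def lookup_face_data(role, expression_type):
    """Return the pre-stored face data for a role and detected expression."""
    return expression_library[(role, expression_type)]

face_data = lookup_face_data("Sailor Moon", "smile")
print(face_data["mouth"])  # corners_up
```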
With the method provided by this embodiment, after the limb actions of the object in the target image are mapped onto the virtual image, the generated virtual image is further processed using the facial expression of the object in the target image, obtaining the final virtual image. Compared with using special equipment to map the limb actions of the object onto the virtual image, the method of this embodiment also maps the facial expression of the object onto the generated virtual image, enhancing the user experience while reducing the cost of generating the virtual image.
Device embodiment
Referring to Fig. 5, a device for generating a virtual image according to an embodiment of the present invention is shown. In this embodiment, the device includes:
an obtaining module 501, configured to obtain a target image acquired by an image capture device;
a detection module 502, configured to detect, from the target image, action information characterizing an object;
a generation module 503, configured to generate a virtual image according to the action information, the action of the virtual image being consistent with the action of the object.
Optionally, the detection module includes:
a first detection unit, configured to detect, from the target image, limb action information characterizing the object.
Optionally, when the target image includes one object, the detection unit includes:
a first recognition subunit, configured to process the target image using a preset first model, to recognize the location information of preset joint points in the target image, and to take the location information of the preset joint points as the limb action information characterizing the object; the preset first model is obtained using a preset convolutional neural network algorithm and characterizes the correspondence between an image and the location information of the preset joint points in the image.
Optionally, when the target image includes at least two objects, the detection unit includes:
a second recognition subunit, configured to process the target image using a preset second model, to recognize the location information of each preset joint point in the target image; the preset second model is obtained using a preset convolutional neural network algorithm and characterizes the correspondence between an image and the location information of the preset joint points in the image;
a first determination unit, configured to process the target image using a preset third model, to determine the cohesion between the preset joint points in the target image; the preset third model is obtained using a part affinity fields algorithm and characterizes the correspondence between an image and the cohesion between the preset joint points in the image;
a second determination unit, configured to determine, according to the cohesion between the preset joint points, the location information of the preset joint points belonging to each object, as the limb action information characterizing each object.
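The two-model pipeline above (per-joint locations from the second model, pairwise "cohesion" from the third) leaves a grouping step: assigning joint candidates to individual people. A minimal greedy sketch of that step, with hypothetical names and toy affinity scores standing in for the part-affinity-field line integrals:

```python
def group_joints_by_affinity(shoulders, elbows, affinity):
    """Greedily pair shoulder and elbow candidates by affinity score.

    affinity[(i, j)] is the "cohesion" between shoulder candidate i and
    elbow candidate j (in the part affinity fields method this comes
    from a line integral along the limb); each candidate is used at
    most once, highest-scoring pairs first.
    """
    pairs = sorted(affinity, key=affinity.get, reverse=True)
    used_s, used_e, limbs = set(), set(), []
    for i, j in pairs:
        if i not in used_s and j not in used_e:
            limbs.append((shoulders[i], elbows[j]))
            used_s.add(i)
            used_e.add(j)
    return limbs

# Two people in frame: two shoulder and two elbow candidates each.
shoulders = [(10, 10), (100, 12)]
elbows = [(12, 40), (98, 42)]
affinity = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}
print(group_joints_by_affinity(shoulders, elbows, affinity))
# [((10, 10), (12, 40)), ((100, 12), (98, 42))]
```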
Optionally, the generation module includes:
an acquiring unit, configured to obtain the image role corresponding to the object;
a generation unit, configured to determine, according to the limb action information of the object, the limb model data of the image role, and to generate the virtual image consistent with the limb action of the object.
Optionally, the generation module further includes:
a third determination unit, configured to determine, according to the limb action information of the object, the limb model data of the image role corresponding to the limb action information of the object from a preset virtual image library, the preset virtual image library including the correspondence between limb action information of objects and limb model data of image roles;
a combination unit, configured to combine the determined limb model data of the image role, to generate the virtual image consistent with the limb action of the object.
Optionally, when the target image includes several video images obtained from a target video, the generation module is specifically configured to:
generate, according to the limb action information of the object in the several video images, an animated video of the virtual image, the actions of the virtual image in the animated video being consistent with those of the object in the target video.
Optionally, the detection module includes:
a second detection unit, configured to detect, from the target image, facial expression information characterizing the object.
This embodiment is the device embodiment corresponding to the above method embodiment for generating a virtual image. For the specific implementation and the technical effects achieved, reference may be made to the description of the above method embodiment for generating a virtual image, which is not repeated here.
Referring to Fig. 6, the device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls the overall operations of the device 600, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 602 may include one or more processors 620 to execute instructions, so as to perform all or part of the steps of the above methods. In addition, the processing component 602 may include one or more modules to facilitate interaction between the processing component 602 and the other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operation on the device 600. Examples of such data include instructions of any application or method operated on the device 600, contact data, phone book data, messages, pictures, videos, and so on. The memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disc.
The power component 606 provides power to the various components of the device 600. The power component 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 600.
The multimedia component 608 includes a screen providing an output interface between the device 600 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the device 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focus and optical zoom capabilities.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a microphone (MIC); when the device 600 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode, the microphone is configured to receive external audio signals. The received audio signals may be further stored in the memory 604 or sent via the communication component 616. In some embodiments, the audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 614 includes one or more sensors for providing state assessments of various aspects of the device 600. For example, the sensor component 614 may detect the open/closed state of the device 600 and the relative positioning of components, such as the display and keypad of the device 600; the sensor component 614 may also detect a change in position of the device 600 or of a component of the device 600, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and a change in temperature of the device 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the device 600 and other devices. The device 600 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 616 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for executing the above methods.
Specifically, an embodiment of the present invention provides a device for generating a virtual image. The device may specifically be the device 600, and includes a memory 604 and one or more programs, wherein the one or more programs are stored in the memory 604 and are configured to be executed by the one or more processors 620, the one or more programs including instructions for performing the following operations:
obtaining a target image acquired by an image capture device;
detecting, from the target image, action information characterizing an object;
generating a virtual image according to the action information, the action of the virtual image being consistent with the action of the object.
Optionally, detecting, from the target image, the action information characterizing the object includes:
detecting, from the target image, limb action information characterizing the object.
Optionally, when the target image includes one object, detecting, from the target image, the limb action information characterizing the object includes:
processing the target image using a preset first model, to recognize the location information of preset joint points in the target image, and taking the location information of the preset joint points as the limb action information characterizing the object; the preset first model is obtained using a preset convolutional neural network algorithm and characterizes the correspondence between an image and the location information of the preset joint points in the image.
Optionally, when the target image includes at least two objects, detecting, from the target image, the limb action information characterizing the objects includes:
processing the target image using a preset second model, to recognize the location information of each preset joint point in the target image; the preset second model is obtained using a preset convolutional neural network algorithm and characterizes the correspondence between an image and the location information of the preset joint points in the image;
processing the target image using a preset third model, to determine the cohesion between the preset joint points in the target image; the preset third model is obtained using a part affinity fields algorithm and characterizes the correspondence between an image and the cohesion between the preset joint points in the image;
determining, according to the cohesion between the preset joint points, the location information of the preset joint points belonging to each object, as the limb action information characterizing each object.
Optionally, generating the virtual image according to the action information, the action of the virtual image being consistent with that of the object, includes:
obtaining the image role corresponding to the object;
determining, according to the limb action information of the object, the limb model data of the image role, and generating the virtual image consistent with the limb action of the object.
Optionally, determining, according to the limb action information of the object, the limb model data of the image role, and generating the virtual image consistent with the limb action of the object includes:
determining, according to the limb action information of the object, the limb model data of the image role corresponding to the limb action information of the object from a preset virtual image library, the preset virtual image library including the correspondence between limb action information of objects and limb model data of image roles;
combining the determined limb model data of the image role, to generate the virtual image consistent with the limb action of the object.
Optionally, when the target image includes several video images obtained from a target video, generating the virtual image according to the action information, the action of the virtual image being consistent with that of the object, includes:
generating, according to the limb action information of the object in the several video images, an animated video of the virtual image, the actions of the virtual image in the animated video being consistent with those of the object in the target video.
Optionally, detecting, from the target image, the action information characterizing the object includes:
detecting, from the target image, facial expression information characterizing the object.
An embodiment of the present invention also provides a non-transitory computer-readable storage medium including instructions, for example the memory 604 including instructions, which can be executed by the processor 620 of the device 600 to complete the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer-readable storage medium, wherein, when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform a method for generating a virtual image, the method including:
obtaining a target image acquired by an image capture device;
detecting, from the target image, action information characterizing an object;
generating a virtual image according to the action information, the action of the virtual image being consistent with the action of the object.
Optionally, detecting, from the target image, the action information characterizing the object includes:
detecting, from the target image, limb action information characterizing the object.
Optionally, when the target image includes one object, detecting, from the target image, the limb action information characterizing the object includes:
processing the target image using a preset first model, to recognize the location information of preset joint points in the target image, and taking the location information of the preset joint points as the limb action information characterizing the object; the preset first model is obtained using a preset convolutional neural network algorithm and characterizes the correspondence between an image and the location information of the preset joint points in the image.
Optionally, when the target image includes at least two objects, detecting, from the target image, the limb action information characterizing the objects includes:
processing the target image using a preset second model, to recognize the location information of each preset joint point in the target image; the preset second model is obtained using a preset convolutional neural network algorithm and characterizes the correspondence between an image and the location information of the preset joint points in the image;
processing the target image using a preset third model, to determine the cohesion between the preset joint points in the target image; the preset third model is obtained using a part affinity fields algorithm and characterizes the correspondence between an image and the cohesion between the preset joint points in the image;
determining, according to the cohesion between the preset joint points, the location information of the preset joint points belonging to each object, as the limb action information characterizing each object.
Optionally, generating the virtual image according to the action information, the action of the virtual image being consistent with that of the object, includes:
obtaining the image role corresponding to the object;
determining, according to the limb action information of the object, the limb model data of the image role, and generating the virtual image consistent with the limb action of the object.
Optionally, determining, according to the limb action information of the object, the limb model data of the image role, and generating the virtual image consistent with the limb action of the object includes:
determining, according to the limb action information of the object, the limb model data of the image role corresponding to the limb action information of the object from a preset virtual image library, the preset virtual image library including the correspondence between limb action information of objects and limb model data of image roles;
combining the determined limb model data of the image role, to generate the virtual image consistent with the limb action of the object.
Optionally, when the target image includes several video images obtained from a target video, generating the virtual image according to the action information, the action of the virtual image being consistent with that of the object, includes:
generating, according to the limb action information of the object in the several video images, an animated video of the virtual image, the actions of the virtual image in the animated video being consistent with those of the object in the target video.
Optionally, detecting, from the target image, the action information characterizing the object includes:
detecting, from the target image, facial expression information characterizing the object.
The above are only preferred embodiments of the present invention. It should be noted that, for those of ordinary skill in the art, several improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (18)

1. a kind of method for generating virtual image, which is characterized in that the described method includes:
Obtain the target image of image capture device acquisition;
From the target image, the action message of detection characterization object;
Virtual image is generated according to the action message, the movement of the virtual image and the object is consistent.
2. the method according to claim 1, wherein described from the target image, detection characterization object Action message include:
Detection characterizes the limb action information of the object from the target image.
3. according to the method described in claim 2, it is characterized in that, the target image include an object when, institute It states from the target image detection and characterizes the limb action information of the object and include:
The target image is handled using preset first model, identifies preset artis in the target image Location information, using the location information of the preset artis as the limb action information for characterizing the object;It is described pre- If the first model obtained using preset convolutional neural networks algorithm, for characterizing preset joint in image and image The corresponding relationship of the location information of point.
4. according to the method described in claim 2, it is characterized in that, the target image includes at least two objects When, the limb action information that the detection from the target image characterizes the object includes:
The target image is handled using preset second model, identifies each preset joint in the target image The location information of point, preset second model are obtained using preset convolutional neural networks algorithm, are used for phenogram The corresponding relationship of the location information of preset artis in picture and image;
The target image is handled using preset third model, is determined each described preset in the target image Cohesion between artis, the preset third model are obtained using partial association field algorithm, are used for phenogram The corresponding relationship of cohesion in picture and image between preset artis;
According to the cohesion between each preset artis, the preset artis for belonging to each object is determined Location information, as the limb action information for characterizing the object.
5. according to method described in claim 2-4 any one, which is characterized in that described to generate void according to the action message Quasi- image, the movement of the virtual image and the object, which is consistent, includes:
Obtain the corresponding vivid role of the object;
According to the limb action information of the object, the limbs model data of the vivid role, generation and institute are determined State the virtual image that the limb action of object is consistent.
6. according to the method described in claim 5, it is characterized in that, the limb action information according to the object, really The limb action of the limbs model data of the fixed vivid role, generation and the object is consistent described virtual Image:
According to the limb action information of the object, the limbs with the object are determined from preset virtual image library The limbs model data of the corresponding vivid role of action message, the preset virtual image library includes the limbs of object The corresponding relationship of action message and the limbs model data of vivid role;
The limbs model data of the identified vivid role is combined, generates and is protected with the limb action of the object Hold the consistent virtual image.
7. method described in -6 any one according to claim 1, which is characterized in that the target image includes from target video It is described that virtual image, the virtual image and the mesh are generated according to the action message when several video images of middle acquisition The movement of mark object, which is consistent, includes:
According to the limb action information of object described in several described video images, the animation view of the virtual image is generated Frequently, the movement of virtual image described in the animated video and object described in the target video is consistent.
8. the method according to claim 1, wherein described from the target image, detection characterization object Action message include:
Detection characterizes the facial expression information of the object from the target image.
9. a kind of device for generating virtual image, which is characterized in that described device includes:
Module is obtained, for obtaining the target image of image capture device acquisition;
Detection module, for from the target image, detection to characterize the action message of object;
Generation module, for generating virtual image, the movement of the virtual image and the object according to the action message It is consistent.
10. device according to claim 9, which is characterized in that the detection module includes:
First detection unit, the limb action information for the detection characterization object from the target image.
11. device according to claim 10, which is characterized in that when the target image includes an object, The detection unit includes:
First identification subelement identifies the target for handling using preset first model the target image The location information of preset artis in image, using the location information of the preset artis as the characterization object Limb action information;Preset first model is obtained using preset convolutional neural networks algorithm, is used for phenogram The corresponding relationship of the location information of preset artis in picture and image.
12. The apparatus according to claim 10, wherein, when the target image comprises at least two objects, the detection unit comprises:
a second identification subunit, configured to process the target image using a preset second model to identify the location information of each preset joint point in the target image, wherein the preset second model is obtained using a preset convolutional neural network algorithm and is used to characterize the correspondence between an image and the location information of the preset joint points in the image;
a first determination unit, configured to process the target image using a preset third model to determine the affinity between the preset joint points in the target image, wherein the preset third model is obtained using a part affinity field algorithm and is used to characterize the correspondence between an image and the affinity between the preset joint points in the image;
a second determination unit, configured to determine, according to the affinity between the preset joint points, the location information of the preset joint points belonging to each target object, as the limb action information characterizing each object.
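As an illustrative sketch only, in the spirit of the part-affinity-field approach this patent cites (Cao et al., OpenPose): once candidate joints of two connected types (e.g. shoulders and elbows) and a pairwise affinity score are available, joints can be grouped per person by greedily keeping the highest-affinity pairs, with each joint used at most once. The `affinity` callable below stands in for the third model's output and is an assumption.

```python
# Illustrative sketch of the grouping step in claim 12: greedy matching of
# candidate joints by affinity score, so each joint is assigned to at most
# one person. `affinity(a, b)` is a placeholder for the preset third model.

def greedy_pair(joints_a, joints_b, affinity):
    """Return (index_a, index_b) pairs, strongest affinities first."""
    scored = [(affinity(a, b), i, j)
              for i, a in enumerate(joints_a)
              for j, b in enumerate(joints_b)]
    scored.sort(reverse=True)                 # try strongest connections first
    used_a, used_b, pairs = set(), set(), []
    for _, i, j in scored:
        if i not in used_a and j not in used_b:
            pairs.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return pairs
```

With two people in frame, each shoulder ends up paired with the elbow it has the strongest affinity to, which is how joints are partitioned into per-object limb action information.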
13. The apparatus according to any one of claims 9-12, wherein the generation module comprises:
an acquiring unit, configured to acquire the avatar character corresponding to the object;
a generation unit, configured to determine the limb model data of the avatar character according to the limb action information of the object, and to generate a virtual image consistent with the limb action of the object.
14. The apparatus according to claim 13, wherein the generation module further comprises:
a third determination unit, configured to determine, from a preset virtual image library according to the limb action information of the object, the limb model data of the avatar character corresponding to that limb action information, wherein the preset virtual image library comprises correspondences between limb action information of objects and limb model data of avatar characters;
a combining unit, configured to combine the determined limb model data of the avatar character to generate a virtual image consistent with the limb action of the target object.
15. The apparatus according to any one of claims 9-14, wherein, when the target image comprises several video images obtained from a target video, the generation module is specifically configured to:
generate an animated video of the virtual image according to the limb action information of the object in the several video images, wherein the motion of the virtual image in the animated video is consistent with the motion of the object in the target video.
16. The apparatus according to claim 9, wherein the detection module comprises:
a second detection unit, configured to detect, from the target image, facial expression information characterizing the object.
17. A device for generating a virtual image, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and include instructions for performing the method according to any one of claims 1 to 8.
18. A non-transitory computer-readable storage medium, wherein, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method according to any one of claims 1 to 8.
CN201810339894.5A 2018-04-16 2018-04-16 Method and device for generating virtual image Active CN110390705B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810339894.5A CN110390705B (en) 2018-04-16 2018-04-16 Method and device for generating virtual image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810339894.5A CN110390705B (en) 2018-04-16 2018-04-16 Method and device for generating virtual image

Publications (2)

Publication Number Publication Date
CN110390705A true CN110390705A (en) 2019-10-29
CN110390705B CN110390705B (en) 2023-11-10

Family

ID=68283847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810339894.5A Active CN110390705B (en) 2018-04-16 2018-04-16 Method and device for generating virtual image

Country Status (1)

Country Link
CN (1) CN110390705B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110889382A (en) * 2019-11-29 2020-03-17 深圳市商汤科技有限公司 Virtual image rendering method and device, electronic equipment and storage medium
CN111265879A (en) * 2020-01-19 2020-06-12 百度在线网络技术(北京)有限公司 Virtual image generation method, device, equipment and storage medium
CN111612876A (en) * 2020-04-27 2020-09-01 北京小米移动软件有限公司 Expression generation method and device and storage medium
CN111638794A (en) * 2020-06-04 2020-09-08 上海商汤智能科技有限公司 Display control method and device for virtual cultural relics
CN112533017A (en) * 2020-12-01 2021-03-19 广州繁星互娱信息科技有限公司 Live broadcast method, device, terminal and storage medium
CN112784622A (en) * 2019-11-01 2021-05-11 北京字节跳动网络技术有限公司 Image processing method and device, electronic equipment and storage medium
CN113033242A (en) * 2019-12-09 2021-06-25 上海幻电信息科技有限公司 Action recognition method and system
CN113112580A (en) * 2021-04-20 2021-07-13 北京字跳网络技术有限公司 Method, device, equipment and medium for generating virtual image
CN113126746A (en) * 2019-12-31 2021-07-16 中移(成都)信息通信科技有限公司 Virtual object model control method, system and computer readable storage medium
CN113325950A (en) * 2021-05-27 2021-08-31 百度在线网络技术(北京)有限公司 Function control method, device, equipment and storage medium
CN113747113A (en) * 2020-05-29 2021-12-03 北京小米移动软件有限公司 Image display method and device, electronic equipment and computer readable storage medium
WO2022205167A1 (en) * 2021-03-31 2022-10-06 深圳市大疆创新科技有限公司 Image processing method and apparatus, mobile platform, terminal device, and storage medium
CN116017021A (en) * 2022-12-30 2023-04-25 上海哔哩哔哩科技有限公司 Virtual live broadcast method and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101098241A (en) * 2006-06-26 2008-01-02 腾讯科技(深圳)有限公司 Method and system for implementing virtual image
CN102662472A (en) * 2012-04-10 2012-09-12 苏州中科启慧软件技术有限公司 Body movement based learning method and cloud service system thereof
CN105704507A (en) * 2015-10-28 2016-06-22 北京七维视觉科技有限公司 Method and device for synthesizing animation in video in real time
CN105759960A (en) * 2016-02-02 2016-07-13 上海尚镜信息科技有限公司 Augmented reality remote guidance method and system in combination with 3D camera
WO2016177290A1 (en) * 2015-05-06 2016-11-10 北京蓝犀时空科技有限公司 Method and system for generating and using expression for virtual image created through free combination
CN107219925A (en) * 2017-05-27 2017-09-29 成都通甲优博科技有限责任公司 Pose detection method, device and server
CN107341476A (en) * 2017-07-07 2017-11-10 深圳市唯特视科技有限公司 A kind of unsupervised manikin construction method based on system-computed principle
CN107340859A (en) * 2017-06-14 2017-11-10 北京光年无限科技有限公司 The multi-modal exchange method and system of multi-modal virtual robot
CN107613310A (en) * 2017-09-08 2018-01-19 广州华多网络科技有限公司 A kind of live broadcasting method, device and electronic equipment
CN107886069A (en) * 2017-11-10 2018-04-06 东北大学 A kind of multiple target human body 2D gesture real-time detection systems and detection method

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101098241A (en) * 2006-06-26 2008-01-02 腾讯科技(深圳)有限公司 Method and system for implementing virtual image
CN102662472A (en) * 2012-04-10 2012-09-12 苏州中科启慧软件技术有限公司 Body movement based learning method and cloud service system thereof
WO2016177290A1 (en) * 2015-05-06 2016-11-10 北京蓝犀时空科技有限公司 Method and system for generating and using expression for virtual image created through free combination
CN106204698A (en) * 2015-05-06 2016-12-07 北京蓝犀时空科技有限公司 Virtual image for independent assortment creation generates and uses the method and system of expression
CN105704507A (en) * 2015-10-28 2016-06-22 北京七维视觉科技有限公司 Method and device for synthesizing animation in video in real time
CN105759960A (en) * 2016-02-02 2016-07-13 上海尚镜信息科技有限公司 Augmented reality remote guidance method and system in combination with 3D camera
CN107219925A (en) * 2017-05-27 2017-09-29 成都通甲优博科技有限责任公司 Pose detection method, device and server
CN107340859A (en) * 2017-06-14 2017-11-10 北京光年无限科技有限公司 The multi-modal exchange method and system of multi-modal virtual robot
CN107341476A (en) * 2017-07-07 2017-11-10 深圳市唯特视科技有限公司 A kind of unsupervised manikin construction method based on system-computed principle
CN107613310A (en) * 2017-09-08 2018-01-19 广州华多网络科技有限公司 A kind of live broadcasting method, device and electronic equipment
CN107886069A (en) * 2017-11-10 2018-04-06 东北大学 A kind of multiple target human body 2D gesture real-time detection systems and detection method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHE CAO et al.: "OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields", IEEE, pages 3 *
ZHENG Liguo; LUO Jianglin; XU Ge: "Implementation of a Kinect-based Motion Capture System", Journal of Jilin University (Engineering and Technology Edition), no. 1 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784622A (en) * 2019-11-01 2021-05-11 北京字节跳动网络技术有限公司 Image processing method and device, electronic equipment and storage medium
CN112784622B (en) * 2019-11-01 2023-07-25 抖音视界有限公司 Image processing method and device, electronic equipment and storage medium
CN110889382A (en) * 2019-11-29 2020-03-17 深圳市商汤科技有限公司 Virtual image rendering method and device, electronic equipment and storage medium
CN113033242A (en) * 2019-12-09 2021-06-25 上海幻电信息科技有限公司 Action recognition method and system
CN113126746A (en) * 2019-12-31 2021-07-16 中移(成都)信息通信科技有限公司 Virtual object model control method, system and computer readable storage medium
CN111265879A (en) * 2020-01-19 2020-06-12 百度在线网络技术(北京)有限公司 Virtual image generation method, device, equipment and storage medium
CN111265879B (en) * 2020-01-19 2023-08-08 百度在线网络技术(北京)有限公司 Avatar generation method, apparatus, device and storage medium
CN111612876A (en) * 2020-04-27 2020-09-01 北京小米移动软件有限公司 Expression generation method and device and storage medium
CN113747113A (en) * 2020-05-29 2021-12-03 北京小米移动软件有限公司 Image display method and device, electronic equipment and computer readable storage medium
CN111638794A (en) * 2020-06-04 2020-09-08 上海商汤智能科技有限公司 Display control method and device for virtual cultural relics
CN112533017A (en) * 2020-12-01 2021-03-19 广州繁星互娱信息科技有限公司 Live broadcast method, device, terminal and storage medium
WO2022205167A1 (en) * 2021-03-31 2022-10-06 深圳市大疆创新科技有限公司 Image processing method and apparatus, mobile platform, terminal device, and storage medium
CN113112580A (en) * 2021-04-20 2021-07-13 北京字跳网络技术有限公司 Method, device, equipment and medium for generating virtual image
US12002160B2 (en) 2021-04-20 2024-06-04 Beijing Zitiao Network Technology Co., Ltd. Avatar generation method, apparatus and device, and medium
CN113325950A (en) * 2021-05-27 2021-08-31 百度在线网络技术(北京)有限公司 Function control method, device, equipment and storage medium
CN113325950B (en) * 2021-05-27 2023-08-25 百度在线网络技术(北京)有限公司 Function control method, device, equipment and storage medium
CN116017021A (en) * 2022-12-30 2023-04-25 上海哔哩哔哩科技有限公司 Virtual live broadcast method and device

Also Published As

Publication number Publication date
CN110390705B (en) 2023-11-10

Similar Documents

Publication Publication Date Title
CN110390705A (en) A kind of method and device generating virtual image
KR102581453B1 (en) Image processing for Head mounted display devices
CN105550637B (en) Profile independent positioning method and device
CN104125396B (en) Image capturing method and device
US9064344B2 (en) Image transformation systems and methods
CN107635095A (en) Shoot method, apparatus, storage medium and the capture apparatus of photo
CN108958610A (en) Special efficacy generation method, device and electronic equipment based on face
CN109325450A (en) Image processing method, device, storage medium and electronic equipment
CN111726536A (en) Video generation method and device, storage medium and computer equipment
CN108234870A (en) Image processing method, device, terminal and storage medium
CN104580886B (en) Filming control method and device
WO2022227393A1 (en) Image photographing method and apparatus, electronic device, and computer readable storage medium
CN109194879A (en) Photographic method, device, storage medium and mobile terminal
CN112396679B (en) Virtual object display method and device, electronic equipment and medium
CN109348135A (en) Photographic method, device, storage medium and terminal device
CN105279499B (en) Age recognition methods and device
CN109600550A (en) A kind of shooting reminding method and terminal device
WO2008073637A1 (en) Mute function for video applications
JP5854806B2 (en) Video processing apparatus and video processing method
CN109200576A (en) Somatic sensation television game method, apparatus, equipment and the storage medium of robot projection
JP2019103772A (en) Smart mirror display device
CN108648061A (en) image generating method and device
CN109410276A (en) Key point position determines method, apparatus and electronic equipment
CN109523461A (en) Method, apparatus, terminal and the storage medium of displaying target image
CN108921178A (en) Obtain method, apparatus, the electronic equipment of the classification of image fog-level

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant