CN109936774A - Virtual image control method, device and electronic equipment - Google Patents

Virtual image control method, device and electronic equipment

Info

Publication number
CN109936774A
CN109936774A (application CN201910252787.3A)
Authority
CN
China
Prior art keywords
interaction
main broadcaster
virtual image
training
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910252787.3A
Other languages
Chinese (zh)
Inventor
贾西亚
吴昊
徐子豪
李政
蓝永峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huya Information Technology Co Ltd
Original Assignee
Guangzhou Huya Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huya Information Technology Co Ltd filed Critical Guangzhou Huya Information Technology Co Ltd
Priority to CN201910252787.3A priority Critical patent/CN109936774A/en
Publication of CN109936774A publication Critical patent/CN109936774A/en
Priority to PCT/CN2020/081627 priority patent/WO2020200082A1/en
Priority to US17/598,733 priority patent/US20220103891A1/en
Priority to SG11202111323RA priority patent/SG11202111323RA/en
Pending legal-status Critical Current

Abstract

Embodiments of the present application provide a virtual image control method, apparatus and electronic device. Anchor video frames captured in real time by a video acquisition device are input into a pre-trained interaction action recognition model. When an anchor interaction action is recognized in a preset number of anchor video frames, preconfigured virtual image interactive content corresponding to that action is obtained. According to the interactive content, the virtual image displayed in the live interface of the live streaming provider terminal is then controlled to perform the corresponding interaction action, generating an interactive video stream of the virtual image that is sent through the live streaming server to the live streaming receiver terminal for playback. By associating the interactive content of the anchor's virtual image with the anchor's own interaction actions, the interaction effect during live streaming is improved, the manual operations required for the anchor to initiate virtual image interaction are reduced, and automatic interaction of the virtual image is achieved.

Description

Virtual image control method, device and electronic equipment
Technical field
This application relates to the field of Internet live streaming, and in particular to a virtual image control method, apparatus and electronic device.
Background
In the field of Internet live streaming, an anchor can broadcast Internet video through an electronic device to provide live programs for viewers, and viewers can watch the broadcast through their own electronic devices. The anchor and the viewers can also interact during the broadcast. However, for both anchor and viewers, what is shown in the live interface is mostly the anchor's own image; the forms of presentation and interaction are rather limited, which degrades the experience.
The inventors have found that some anchors, to protect their privacy, are unwilling to appear before viewers with their real image and instead cover their faces with masks or the like, which easily makes interaction between the anchor and the viewers inconvenient.
To enrich the forms of presentation and the interaction between anchor and viewers during live streaming, some implementations display a virtual image in the live interface so that the virtual image interacts with the viewers. In these schemes, however, the virtual image merely demonstrates a few interaction actions and is difficult to associate with the anchor's actual movements, so the resulting interaction is poor.
Summary of the invention
In view of this, the embodiments of the present application aim to provide a virtual image control method, apparatus and electronic device to solve or mitigate the above problems.
According to one aspect of the embodiments of the present application, an electronic device is provided, which may include one or more storage media and one or more processors in communication with the storage media. The one or more storage media store machine-executable instructions executable by the processors; when the electronic device runs, the processors execute these instructions to perform the virtual image control method.
According to another aspect of the embodiments of the present application, a virtual image control method is provided, applied to a live streaming provider terminal, the method comprising:
inputting anchor video frames captured in real time by a video acquisition device into a pre-trained interaction action recognition model, and identifying whether the anchor video frames contain an anchor interaction action;
when an anchor interaction action is recognized in a preset number of anchor video frames, obtaining preconfigured virtual image interactive content corresponding to that anchor interaction action;
according to the virtual image interactive content, controlling the virtual image in the live interface of the live streaming provider terminal to perform the corresponding interaction action, so as to generate an interactive video stream of the virtual image, and sending the interactive video stream through the live streaming server to the live streaming receiver terminal for playback.
According to another aspect of the embodiments of the present application, a virtual image control apparatus is provided, applied to a live streaming provider terminal, the apparatus comprising:
an interaction action recognition module, used to input anchor video frames captured in real time by a video acquisition device into a pre-trained interaction action recognition model and identify whether the anchor video frames contain an anchor interaction action;
an interaction content acquisition module, used to obtain, when an anchor interaction action is recognized in a preset number of anchor video frames, preconfigured virtual image interactive content corresponding to that anchor interaction action;
a video stream generation module, used to control, according to the virtual image interactive content, the virtual image in the live interface of the live streaming provider terminal to perform the corresponding interaction action, so as to generate an interactive video stream of the virtual image, and to send the interactive video stream through the live streaming server to the live streaming receiver terminal for playback.
According to another aspect of the embodiments of the present application, a readable storage medium is provided, on which machine-executable instructions are stored; when the machine-executable instructions are run by a processor, the steps of the above virtual image control method can be performed.
Based on any of the above aspects, the embodiments of the present application input anchor video frames captured in real time by a video acquisition device into a pre-trained interaction action recognition model; when an anchor interaction action is recognized in a preset number of anchor video frames, preconfigured virtual image interactive content corresponding to that action is obtained; then, according to the virtual image interactive content, the virtual image in the live interface of the live streaming provider terminal is controlled to perform the corresponding interaction action, generating an interactive video stream of the virtual image that is sent through the live streaming server to the live streaming receiver terminal for playback. By associating the interactive content of the anchor's virtual image with the anchor's interaction actions, the interaction effect during live streaming can be improved, the manual operations needed for the anchor to initiate virtual image interaction are reduced, and automatic interaction of the virtual image is achieved.
To make the above objects, features and advantages of the embodiments of the present application clearer and easier to understand, they are described in detail below with reference to the embodiments and the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed for the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the application and should not be regarded as limiting its scope; those of ordinary skill in the art can derive other related drawings from them without creative effort.
Fig. 1 is a schematic block diagram of an application scenario of the live broadcast system provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of the virtual image control method provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of the network structure of the neural network model provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of the live interface of the live streaming provider terminal provided by an embodiment of the present application;
Fig. 5 is a schematic flowchart of training the neural network model provided by an embodiment of the present application;
Fig. 6 is a schematic flowchart of the sub-steps included in step S110 shown in Fig. 2;
Fig. 7 is a schematic diagram of example components of the live streaming provider terminal shown in Fig. 1.
Detailed description of embodiments
To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions are described clearly and completely below with reference to the accompanying drawings. It should be understood that the drawings serve only for illustration and description and are not used to limit the scope of protection of the application; the schematic drawings are not drawn to scale. The flowcharts used herein illustrate operations implemented according to some embodiments; the operations of a flowchart may be implemented out of order, and steps without a logical contextual relation may be reversed in order or performed simultaneously. Moreover, under the guidance of this disclosure, those skilled in the art may add one or more other operations to a flowchart or remove one or more operations from it.
In addition, the described embodiments are only a part of the embodiments of the present application, not all of them. The components of the embodiments described and illustrated in the drawings can generally be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the claimed scope of the application but merely represents selected embodiments. Based on these embodiments, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of this application.
Fig. 1 is a schematic diagram of an application scenario of the live broadcast system 10 provided by an embodiment of the present application. For example, the live broadcast system 10 may be a service platform for Internet live streaming. Referring to Fig. 1, the live broadcast system 10 may include a live streaming server 200, a live streaming provider terminal 100 and a live streaming receiver terminal 300. The live streaming server 200 is communicatively connected to both the provider terminal 100 and the receiver terminal 300 and provides live streaming services for them. For example, the provider terminal 100 can send the live video stream of a live room to the live streaming server 200, and viewers can pull the live video stream from the live streaming server 200 through the receiver terminal 300 to watch the live video. The live streaming server can also push a notification message to a viewer's receiver terminal 300 when a live room the viewer subscribes to goes on air. A live video stream can be the video stream currently being broadcast on the platform or the complete video stream formed after the broadcast ends.
It can be understood that the live broadcast system 10 shown in Fig. 1 is only one feasible example; in other feasible embodiments, the live broadcast system 10 may include only some of the components shown in Fig. 1 or may include other components.
In some scenarios, the live streaming provider terminal 100 and the live streaming receiver terminal 300 can be used interchangeably. For example, the anchor of the provider terminal 100 can use it to provide live video services to viewers, or use it as a viewer to watch live video provided by other anchors. Likewise, a viewer at the receiver terminal 300 can use it to watch live video provided by an anchor of interest, or act as an anchor to provide live video services to other viewers.
In this embodiment, the live streaming provider terminal 100 and the live streaming receiver terminal 300 may be, but are not limited to, a smart phone, a personal digital assistant, a tablet computer, a personal computer, a laptop, a virtual reality terminal device, an augmented reality terminal device, etc. Internet products for providing Internet live streaming services can be installed on both terminals, for example an application (APP), web page or applet related to Internet live streaming services and used on a computer or smart phone.
In this embodiment, the live broadcast system 10 may further include a video acquisition device 400 for capturing anchor video frames of the anchor. The video acquisition device 400 may be directly mounted on or integrated into the live streaming provider terminal 100, or may be independent of it and connected to it.
Fig. 2 is a schematic flowchart of the virtual image control method provided by an embodiment of the present application; the method can be executed by the live streaming provider terminal 100 shown in Fig. 1. It should be understood that, in other embodiments, the order of some steps of this method may be exchanged according to actual needs, and some steps may be omitted or deleted. The detailed steps of the method are described below.
Step S110: input the anchor video frames captured in real time by the video acquisition device 400 into a pre-trained interaction action recognition model, and identify whether the anchor video frames contain an anchor interaction action.
In this embodiment, the pre-trained interaction action recognition model is used to identify whether an anchor video frame contains an anchor interaction action and, if so, which one. The model can be obtained by training a neural network model, for example a Yolov2 network model.
As a possible implementation, referring to Fig. 3, the interaction action recognition model may include an input layer, at least one convolution extraction layer, a fully connected layer and a classification layer. Each convolution extraction layer includes, in sequence, a first pointwise convolutional layer, a depthwise convolutional layer and a second pointwise convolutional layer; an activation function layer and a pooling layer are placed after each convolutional layer in the extraction layer. The fully connected layer is located after the last pooling layer, and the classification layer after the fully connected layer. The training process of this model is explained later.
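For illustration only (this sketch is not part of the patent disclosure), the structure above might be expressed as follows in PyTorch; every channel count, input size and class count here is an assumption, not a value taken from the patent:

```python
import torch
import torch.nn as nn

class ConvExtractLayer(nn.Module):
    """One convolution extraction layer: first pointwise conv, depthwise
    conv, second pointwise conv, each followed by an activation function
    layer and a pooling layer, as described above."""
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1),                  # first pointwise conv
            nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1, groups=mid_ch),   # depthwise conv
            nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(mid_ch, out_ch, kernel_size=1),                 # second pointwise conv
            nn.ReLU(), nn.MaxPool2d(2),
        )

    def forward(self, x):
        return self.block(x)

class InteractionActionNet(nn.Module):
    """Stacked extraction layers, then a fully connected layer and a
    classification (softmax) layer; assumes 96x96 RGB input frames."""
    def __init__(self, num_actions=10):
        super().__init__()
        self.features = nn.Sequential(
            ConvExtractLayer(3, 32, 64),      # 96x96 -> 12x12
            ConvExtractLayer(64, 128, 256),   # 12x12 -> 1x1
        )
        self.fc = nn.Linear(256, num_actions)

    def forward(self, x):
        logits = self.fc(self.features(x).flatten(1))
        return torch.softmax(logits, dim=1)   # classification layer
```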
Step S120: when an anchor interaction action is recognized in a preset number of anchor video frames, obtain the preconfigured virtual image interactive content corresponding to that anchor interaction action.
In this embodiment, to avoid misrecognition of anchor interaction actions, the preconfigured virtual image interactive content corresponding to an anchor interaction action can be obtained only when that action is recognized in a preset number of anchor video frames.
A preset interaction content library is stored in advance on the live streaming provider terminal 100. It contains the virtual image interactive content preconfigured for each anchor interaction action; the virtual image interactive content may include one or more of dialogue content, special-effect content and body-movement content. Optionally, the preset interaction content library can be configured locally on the provider terminal 100 or downloaded from the live streaming server 200; this embodiment does not specifically limit this.
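As a purely hypothetical illustration of such a preset interaction content library (all keys and contents below are invented for the example, not taken from the patent):

```python
# Hypothetical preset interaction content library: each recognized anchor
# interaction action maps to its preconfigured virtual image interactive
# content (dialogue, special effect, body movement).
PRESET_INTERACTION_LIBRARY = {
    "finger_heart": {
        "dialogue": ["Finger heart", "Love you"],
        "special_effect": "floating_hearts",
        "body_movement": "avatar_finger_heart",
    },
    "scissor_hand": {
        "dialogue": ["Peace!"],
        "special_effect": "sparkles",
        "body_movement": "avatar_scissor_hand",
    },
}

def get_interactive_content(action_label):
    """Return the preconfigured content for a recognized action, if any."""
    return PRESET_INTERACTION_LIBRARY.get(action_label)
```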
Step S130: according to the virtual image interactive content, control the virtual image in the live interface of the live streaming provider terminal 100 to perform the corresponding interaction action, so as to generate an interactive video stream of the virtual image, and send the interactive video stream through the live streaming server 200 to the live streaming receiver terminal 300 for playback.
As an example only, Fig. 4 shows a live interface of the provider terminal 100. The live interface may include a live interface display frame, an anchor video frame display box and a virtual image region. The live interface display frame shows the video stream currently being broadcast on the platform or the complete video stream formed after the broadcast ends; the anchor video frame display box shows the anchor video frames captured in real time by the video acquisition device 400; and the virtual image region shows the anchor's virtual image.
When the anchor initiates an anchor interaction action, the action can be shown in the anchor video frame display box; at the same time, the virtual image interactive content corresponding to that action can be obtained, and the virtual image in the virtual image region is controlled to perform the corresponding interaction action. For example, if the recognized anchor interaction action is a finger-heart gesture, the virtual image can be controlled to perform the corresponding finger-heart gesture and display the dialogue content "Finger heart" together with a "Love you" special effect. An interactive video stream of the virtual image can thus be generated and sent through the live streaming server 200 to the receiver terminal 300 for playback.
In this way, by associating the interactive content of the anchor's virtual image with the anchor's interaction actions, this embodiment can improve the interaction effect during live streaming, reduce the manual operations needed for the anchor to initiate virtual image interaction, and achieve automatic interaction of the virtual image.
The process of training the aforementioned neural network model to obtain the interaction action recognition model is described in detail below.
First, a neural network model is established. Optionally, the neural network model can adopt, but is not limited to, a Yolov2 network model.
Then, the neural network model is pre-trained with a public dataset to obtain a pre-trained neural network model. The public dataset can be the COCO dataset, a large-scale image dataset designed for object detection, segmentation, human keypoint detection, semantic segmentation and caption generation. Its images are mostly taken from complex everyday scenes, and the positions of the detection targets in the images are calibrated by accurate segmentation, giving the neural network model a preliminary capability for target detection, for recognizing contextual relations between targets, and for precisely locating targets in two dimensions.
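A minimal sketch of this initialization step, assuming PyTorch and a hypothetical checkpoint file of COCO-pretrained weights (the file path and the strict=False handling of a mismatched head are assumptions):

```python
import torch

def load_coco_pretrained(model, path="yolov2_coco_pretrained.pth"):
    """Initialize the network from COCO-pretrained weights before
    fine-tuning on the collected anchor interaction action dataset."""
    state = torch.load(path, map_location="cpu")
    # strict=False tolerates a classification head whose shape differs
    # from the pre-training head
    model.load_state_dict(state, strict=False)
    return model
```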
Next, the pre-trained neural network model is iteratively trained with a collected dataset to obtain the interaction action recognition model.
The collected dataset includes a training sample image set annotated with the real targets of different anchor interaction actions, a real target being the actual image region of an anchor interaction action in a training sample image. For example, the collected dataset may include, but is not limited to, anchor images of different anchor interaction actions captured during live streaming, or images uploaded by anchors after performing different interaction actions themselves. Anchor interaction actions may include interaction actions common during live streaming, such as the scissor-hand gesture, acting cute, or the finger-heart gesture; this embodiment does not specifically limit them.
Optionally, so that the interaction action recognition model can recognize anchor interaction actions in various environments, this embodiment can adjust the image parameters of each training sample image in the set to extend the samples. For example, to adapt to anchor environments at different distances from the video acquisition device 400, the initial collected dataset can be cropped at equal proportions with several different ratios, yielding an equal-proportion cropped dataset related to the initial collected dataset. To adapt to live environments under different light intensities, exposure adjustment can be applied to the initial collected dataset, yielding an exposure-adjusted dataset. To adapt to live environments with different noise levels, noise of varying degrees can be added to the initial collected dataset, yielding a noisy dataset. Extending the training sample image set in this way can effectively improve the recognition capability of the resulting interaction action recognition model in different live scenes.
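A sketch of this sample extension, assuming Pillow and NumPy; all parameter ranges below are illustrative assumptions, not values from the patent:

```python
import random
import numpy as np
from PIL import Image, ImageEnhance

def augment(image):
    """Extend a training sample: proportional crop for varying
    anchor-to-camera distance, exposure adjustment for lighting,
    and additive noise for noisy live environments."""
    # equal-proportion crop at a random scale, resized back
    w, h = image.size
    scale = random.uniform(0.6, 1.0)
    cw, ch = int(w * scale), int(h * scale)
    left, top = random.randint(0, w - cw), random.randint(0, h - ch)
    image = image.crop((left, top, left + cw, top + ch)).resize((w, h))
    # exposure (brightness) adjustment
    image = ImageEnhance.Brightness(image).enhance(random.uniform(0.5, 1.5))
    # additive Gaussian noise of a random degree
    arr = np.asarray(image).astype(np.float32)
    arr += np.random.normal(0, random.uniform(0, 15), arr.shape)
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
```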
Since the entire action recognition process takes place on the live streaming provider terminal 100, the network structure above is designed to effectively reduce the terminal's computation and increase recognition speed: each convolution extraction layer adopts a separable convolution structure, i.e. a cascade of a first pointwise convolutional layer, a depthwise convolutional layer and a second pointwise convolutional layer. Compared with a structure of three ordinary convolutional layers, this cascade has a smaller computation cost and fewer network parameters.
With reference to the neural network model shown in Fig. 3, the process of iteratively training the pre-trained neural network model with the collected dataset is explained below. Referring to Fig. 5, steps S101 through S107 precede step S110 and are introduced in turn below.
Step S101: input each training sample image of the training sample image set into the input layer of the pre-trained neural network model for preprocessing, obtaining preprocessed images. In detail, since the subsequent training uses stochastic gradient descent, each input training sample image needs to be standardized.
For example, the training sample images can be mean-centered: all training sample images are summed and averaged to obtain a mean image whose every dimension is centered at 0, and this mean image is subtracted from each training sample image to obtain the preprocessed images.
As another example, the data amplitude of each training sample image can be normalized to the same range, for example [-1, 1] for each feature, to obtain the preprocessed images.
As a further example, PCA dimensionality reduction can be applied to each training sample image so that the correlations between dimensions cancel and the features become mutually independent, after which the amplitude of each training sample image on each feature axis is normalized, obtaining the preprocessed images.
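A sketch of the first two preprocessing options (mean-centering and amplitude normalization), assuming NumPy; the PCA variant is omitted for brevity:

```python
import numpy as np

def preprocess(batch):
    """Standardize a batch of training sample images (shape N x H x W x C)
    for stochastic gradient descent: zero-center each dimension by
    subtracting the mean image, then normalize amplitudes into [-1, 1]."""
    batch = batch.astype(np.float32)
    batch -= batch.mean(axis=0)               # per-dimension mean subtraction
    max_abs = np.abs(batch).max(axis=0)
    return batch / np.maximum(max_abs, 1e-8)  # scale into [-1, 1]
```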
Step S102: for each convolution extraction layer, extract the multidimensional feature images of the preprocessed image through the first pointwise convolutional layer, the depthwise convolutional layer and the second pointwise convolutional layer of that extraction layer; input the extracted multidimensional feature images into the connected activation function layer for nonlinear mapping; then input the nonlinearly mapped multidimensional feature images into the connected pooling layer for pooling; and feed the pooled feature maps into the next convolutional layer for further feature extraction.
The function of the first pointwise, depthwise and second pointwise convolutional layers is to extract features from the input image data. Each contains multiple convolution kernels, and every element of a kernel corresponds to a weight coefficient and a bias, that is, a neuron. The multidimensional feature images of a preprocessed image exhibit local correlation: a pixel is influenced most by the pixels around it, while pixels far away have little relation to it. Each neuron therefore only needs to connect to a local region of the previous layer; each neuron effectively scans a small region, and many such neurons with shared weights together scan the global feature map, forming a one-dimensional feature map. The multidimensional feature image is obtained by extracting the multidimensional features of the preprocessed image in this way.
On this basis, the extracted multidimensional feature images are input into the connected activation function layer for nonlinear mapping, which helps express the complex features in the multidimensional feature images. Optionally, the activation function layer can use, but is not limited to, the Rectified Linear Unit (ReLU), the Sigmoid function or the hyperbolic tangent function.
The nonlinearly mapped multidimensional feature images are then input into the connected pooling layer for pooling, i.e. passed to the pooling layer for feature selection and information filtering. The pooling layer can contain a preset pooling function that replaces the value at a single point of the feature image with a statistic of the feature map over the neighboring region. The pooled feature maps are then fed into the next convolutional layer to continue feature extraction.
Step S103: input the pooled feature map output by the last pooling layer into the fully connected layer to obtain the fully connected feature output values. In detail, the neurons in the fully connected layer are fully connected with each other; after all the preceding convolutional layers (i.e. the pointwise and depthwise convolutional layers) have extracted feature images sufficient to identify the image to be processed, the fully connected layer performs the subsequent classification, yielding the fully connected feature output values.
Step S104: input the fully connected feature output values into the classification layer for predicted target classification, obtaining the predicted target of each training sample image.
Step S105: calculate the loss function value between the predicted target and the real target of each training sample image.
Step S106: perform back-propagation training according to the loss function value, and calculate the gradients of the network parameters of the pre-trained neural network model.
Optionally, in this embodiment, the interaction action recognition model may further include multiple residual network layers (not shown), each of which concatenates the output part of two adjacent layers in the model with the input part of the layer following those two layers. Different back-propagation paths for the gradient can thereby be chosen during back-propagation training, enhancing the training effect.
In detail, after the loss function value is determined, the back-propagation path of the back-propagation training can be determined from it; the residual network layers of the pre-trained neural network model then select the concatenation node corresponding to that path for back-propagation training, and when the concatenation node corresponding to the path is reached, the gradients of the network parameters of the pre-trained neural network model are calculated.
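For illustration, one way to read the residual network layer described above, in PyTorch; the channel-wise concatenation of output and input is an interpretation of the patent's wording, and spatial shapes are assumed to match:

```python
import torch
import torch.nn as nn

class ResidualConcat(nn.Module):
    """Wrap two adjacent layers and concatenate their output with the
    original input, giving back-propagation an additional path."""
    def __init__(self, layer_a, layer_b):
        super().__init__()
        self.layer_a = layer_a
        self.layer_b = layer_b

    def forward(self, x):
        out = self.layer_b(self.layer_a(x))
        return torch.cat([x, out], dim=1)   # concatenate along channels
```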
Step S107: according to the calculated gradients, update the network parameters of the pre-trained neural network model using stochastic gradient descent and continue training; when the pre-trained neural network model meets a training termination condition, output the trained interaction action recognition model.
The training termination condition may include at least one of the following conditions:
1) the number of training iterations reaches a set number; 2) the loss function value falls below a set threshold; 3) the loss function value no longer decreases.
For condition 1), to save computation, a maximum number of iterations can be set; once it is reached, the current iteration cycle can be stopped and the pre-trained neural network model obtained so far taken as the interaction action recognition model. For condition 2), a loss function value below the set threshold indicates that the current interaction action recognition model basically satisfies the requirements, so iteration can stop. For condition 3), a loss function value that no longer decreases indicates that the optimal interaction action recognition model has been formed, so iteration can stop.
Note that the above stopping conditions can be used in combination or individually: for example, iteration can stop when the loss function value no longer decreases, or when the number of iterations reaches the set number, or when the loss function value falls below the set threshold; alternatively, iteration can stop when the loss function value is below the set threshold and no longer decreasing.
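A sketch of the iterative training loop with these termination conditions, assuming PyTorch; all hyperparameter values (iteration cap, loss threshold, patience, learning rate) are illustrative assumptions:

```python
import torch

def train(model, loader, criterion, max_iters=50000,
          loss_threshold=0.01, patience=5, lr=1e-3):
    """Fine-tune with stochastic gradient descent; stop when the iteration
    count reaches max_iters (condition 1), the loss falls below
    loss_threshold (condition 2), or the loss stops decreasing for
    `patience` consecutive iterations (condition 3)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    best_loss, stale, it = float("inf"), 0, 0
    while it < max_iters:
        for images, targets in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()    # back-propagation: compute parameter gradients
            optimizer.step()   # stochastic gradient descent update
            it += 1
            if it >= max_iters or loss.item() < loss_threshold:
                return model
            if loss.item() < best_loss - 1e-6:
                best_loss, stale = loss.item(), 0
            else:
                stale += 1
                if stale >= patience:   # loss no longer decreasing
                    return model
    return model
```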
In actual implementation, the training termination condition is not limited to the above examples; those skilled in the art can design different termination conditions according to actual needs.
Based on the foregoing, in step S110 this embodiment inputs the anchor video frames into the trained interaction action recognition model to obtain a recognition result figure, and determines from the recognition result figure of an anchor video frame whether the frame contains an anchor interaction action.
The recognition result figure includes at least one target box, which is a geometric box marking the anchor interaction action in the recognition result figure. The specific process of inputting an anchor video frame into the trained interaction action recognition model and obtaining the recognition result figure is detailed below. Referring to Fig. 6, step S110 may include the following sub-steps:
Sub-step S111: the interaction action recognition model divides the anchor video frame into multiple grids.
Sub-step S112: for each grid, to adapt to the diversity of live scenes, multiple geometric prediction boxes with different attribute parameters can be generated in the grid, where each geometric prediction box corresponds to a reference box and the attribute parameters of each geometric prediction box include the center point coordinates relative to the reference box, the width, the height and the class.
Sub-step S113: calculate the confidence score of each geometric prediction box, and discard, according to the calculation results, the geometric prediction boxes whose confidence score is below a preset score threshold.
For example, for each geometric prediction box, judge whether an anchor interaction action exists in the region of the box. If none exists, the confidence score of the box is determined to be 0. If one exists, calculate the posterior probability that the region of the box belongs to the anchor interaction action, and calculate the detection evaluation value of the box, which characterizes the ratio of the intersection of the anchor interaction action with the box to the union of the anchor interaction action with the box. The confidence score of the box is then obtained as the product of the posterior probability and the detection evaluation value.
On this basis, a score threshold can be preset. A box whose confidence score is below the threshold indicates that its target is unlikely to be a predicted target of a live interaction action, while a box whose score is greater than or equal to the threshold indicates that its target may well be one. All boxes with confidence scores below the threshold can therefore be discarded, eliminating at once the large number of boxes that cannot contain a live interaction action; only the boxes that may contain one receive further processing, which greatly reduces the subsequent computation and further increases recognition speed.
Sub-step S114: sort the remaining geometric boxes in the grid in descending order of confidence score, and determine the box with the maximum confidence score as the target box according to the sorting result, obtaining the recognition result figure.
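A plain-Python sketch of sub-steps S113 and S114 — confidence scoring as posterior probability times the detection evaluation (intersection-over-union) value, thresholding, and selection of the highest-scoring box; the box representation and field names below are assumptions:

```python
def iou(box_a, box_b):
    """Detection evaluation value: intersection over union of two boxes,
    each given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_target_box(prediction_boxes, score_threshold=0.5):
    """Score each geometric prediction box, discard low-confidence boxes,
    sort the rest in descending order and return the best one.
    Each entry is assumed to carry 'box', 'posterior' and 'action_region'."""
    kept = []
    for p in prediction_boxes:
        if p["posterior"] == 0:          # no anchor interaction action here:
            continue                     # confidence score is 0
        confidence = p["posterior"] * iou(p["box"], p["action_region"])
        if confidence >= score_threshold:
            kept.append((confidence, p))
    kept.sort(key=lambda t: t[0], reverse=True)   # descending confidence
    return kept[0][1] if kept else None           # highest-scoring target box
```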
Thus, if the recognition result figure of the live video is marked with a target box of an anchor interaction action, it is determined that the anchor video frame contains an anchor interaction action, and the interaction action type of that action can be determined.
Fig. 7 is a schematic diagram of example components of the live streaming provider terminal 100 shown in Fig. 1. The provider terminal 100 may include a storage medium 110, a processor 120 and a virtual image control apparatus 500. In this embodiment, the storage medium 110 and the processor 120 are both located in the provider terminal 100 and arranged separately. However, the storage medium 110 can also be independent of the provider terminal 100 and accessed by the processor 120 through a bus interface. Alternatively, the storage medium 110 can be integrated into the processor 120, for example as a cache and/or general-purpose registers.
The virtual image control apparatus 500 can be understood as the above provider terminal 100 or its processor 120, or as a software functional module that is independent of the provider terminal 100 and the processor 120 and implements the above virtual image control method under the control of the provider terminal 100. As shown in Fig. 7, the apparatus 500 may include an interaction action recognition module 510, an interaction content acquisition module 520 and a video stream generation module 530, whose functions are described in detail below.
The interaction action recognition module 510 is used to input the anchor video frames captured in real time by the video acquisition device 400 into the pre-trained interaction action recognition model and identify whether the anchor video frames contain an anchor interaction action. The module 510 can be used to execute step S110; for its detailed implementation, refer to the description of step S110 above.
The interaction content acquisition module 520 is used to obtain, when an anchor interaction action is recognized in a preset number of anchor video frames, the preconfigured virtual image interactive content corresponding to that action. The module 520 can be used to execute step S120; for its detailed implementation, refer to the description of step S120 above.
The video stream generation module 530 is used to control, according to the virtual image interactive content, the virtual image in the live interface of the provider terminal 100 to perform the corresponding interaction action, so as to generate an interactive video stream of the virtual image, and to send the interactive video stream through the live streaming server 200 to the receiver terminal 300 for playback. The module 530 can be used to execute step S130; for its detailed implementation, refer to the description of step S130 above.
Further, an embodiment of the present application also provides a computer-readable storage medium storing machine-executable instructions which, when executed, implement the virtual image control method provided by the above embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into modules is only a division by logical function; other divisions are possible in actual implementation, e.g. multiple modules or components can be combined or integrated into another system, or some features can be ignored or not executed. Moreover, the mutual couplings, direct couplings or communication connections shown or discussed can be indirect couplings or communication connections through some communication interfaces, apparatuses or modules, and can be electrical, mechanical or in other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units; they can be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, the functional units in the embodiments of this application can be integrated into one processing unit, or each unit can exist alone physically, or two or more units can be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they can be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods of the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, ROM, RAM, a magnetic disk or an optical disc.
The above are only specific embodiments of the application, but the protection scope of the application is not limited thereto. Any change or replacement that can readily occur to those skilled in the art within the technical scope disclosed in the application shall be covered by the protection scope of the application. Therefore, the protection scope of the application shall be subject to the protection scope of the claims.

Claims (14)

1. A virtual image control method, applied to a live streaming provider terminal, the method comprising:
inputting anchor video frames captured in real time by a video acquisition device into a pre-trained interaction action recognition model, and identifying whether the anchor video frames contain an anchor interaction action;
when an anchor interaction action is recognized in a preset number of anchor video frames, obtaining preconfigured virtual image interactive content corresponding to the anchor interaction action;
according to the virtual image interactive content, controlling a virtual image in a live interface of the live streaming provider terminal to perform a corresponding interaction action, so as to generate an interactive video stream of the virtual image, and sending the interactive video stream through a live streaming server to a live streaming receiver terminal for playback.
2. The virtual image control method according to claim 1, wherein the interaction action recognition model comprises an input layer, at least one convolution extraction layer, a fully connected layer and a classification layer; each convolution extraction layer comprises, in sequence, a first pointwise convolutional layer, a depthwise convolutional layer and a second pointwise convolutional layer; an activation function layer and a pooling layer are arranged after each convolutional layer in the convolution extraction layer; the fully connected layer is located after the last pooling layer; and the classification layer is located after the fully connected layer.
3. The virtual image control method according to claim 2, wherein the interaction action recognition model further comprises multiple residual network layers, each residual network layer being used to concatenate the output part of two adjacent layers in the interaction action recognition model with the input part of the layer following those two layers.
4. The virtual image control method according to any one of claims 1-3, further comprising the step of training the interaction action recognition model in advance, which specifically comprises:
establishing a neural network model;
pre-training the neural network model with a public dataset to obtain a pre-trained neural network model;
iteratively training the pre-trained neural network model with a collected dataset to obtain the interaction action recognition model, wherein the collected dataset comprises a training sample image set annotated with real targets of different anchor interaction actions, a real target being the actual image region of an anchor interaction action in a training sample image.
5. The virtual image control method according to claim 4, wherein the step of iteratively training the pre-trained neural network model with the collected dataset to obtain the interaction action recognition model comprises:
inputting each training sample image of the training sample image set into the input layer of the pre-trained neural network model for preprocessing, obtaining preprocessed images;
for each convolution extraction layer of the pre-trained neural network model, extracting multidimensional feature images of the preprocessed image through the first pointwise convolutional layer, the depthwise convolutional layer and the second pointwise convolutional layer of that extraction layer, inputting the extracted multidimensional feature images into the connected activation function layer for nonlinear mapping, then inputting the nonlinearly mapped multidimensional feature images into the connected pooling layer for pooling, and feeding the pooled feature maps into the next convolutional layer for feature extraction;
inputting the pooled feature map output by the last pooling layer into the fully connected layer, obtaining fully connected feature output values;
inputting the fully connected feature output values into the classification layer for predicted target classification, obtaining the predicted target of each training sample image;
calculating the loss function value between the predicted target and the real target of each training sample image;
performing back-propagation training according to the loss function value, and calculating the gradients of the network parameters of the pre-trained neural network model;
according to the calculated gradients, updating the network parameters of the pre-trained neural network model using stochastic gradient descent and continuing the training, and, when the pre-trained neural network model meets a training termination condition, outputting the trained interaction action recognition model.
6. The virtual image control method according to claim 5, wherein the step of performing back-propagation training according to the loss function value and calculating the gradients of the network parameters of the pre-trained neural network model comprises:
determining the back-propagation path of the back-propagation training according to the loss function value;
selecting, through the residual network layers of the pre-trained neural network model, the concatenation node corresponding to the back-propagation path for back-propagation training, and, when the concatenation node corresponding to the back-propagation path is reached, calculating the gradients of the network parameters of the pre-trained neural network model.
7. The virtual image control method according to claim 4, wherein, before the step of iteratively training the pre-trained neural network model with the collected dataset to obtain the interaction action recognition model, the method further comprises:
adjusting the image parameters of each training sample image in the training sample image set to perform sample extension on the training sample image set.
8. The virtual image control method according to any one of claims 1-3, wherein the step of inputting the anchor video frames captured in real time by the video acquisition device into the pre-trained interaction action recognition model and identifying whether the anchor video frames contain an anchor interaction action comprises:
inputting the anchor video frames into the interaction action recognition model to obtain a recognition result figure, wherein the recognition result figure includes at least one target box, the target box being a geometric box marking the anchor interaction action in the recognition result figure;
determining, according to the recognition result figure of the anchor video frames, whether the anchor video frames contain an anchor interaction action.
9. The virtual image control method according to claim 8, wherein the step of inputting the anchor video frames into the interaction action recognition model to obtain the recognition result figure comprises:
dividing the anchor video frame into multiple grids by the interaction action recognition model;
for each grid, generating multiple geometric prediction boxes with different attribute parameters in the grid, wherein each geometric prediction box corresponds to a reference box, and the attribute parameters of each geometric prediction box include the center point coordinates relative to the reference box, the width, the height and the class;
calculating the confidence score of each geometric prediction box, and discarding, according to the calculation results, the geometric prediction boxes whose confidence score is below a preset score threshold;
sorting the remaining geometric boxes in the grid in descending order of confidence score, and determining the geometric box with the maximum confidence score as the target box according to the sorting result, to obtain the recognition result figure.
10. The virtual image control method according to claim 9, wherein the step of calculating the confidence score of each geometric prediction box comprises:
for each geometric prediction box, judging whether an anchor interaction action exists in the region of the geometric prediction box;
if no anchor interaction action exists, determining the confidence score of the geometric prediction box to be 0;
if an anchor interaction action exists, calculating the posterior probability that the region of the geometric prediction box belongs to the anchor interaction action, and calculating the detection evaluation value of the geometric prediction box, wherein the detection evaluation value characterizes the ratio of the intersection of the anchor interaction action with the geometric prediction box to the union of the anchor interaction action with the geometric prediction box;
obtaining the confidence score of the geometric prediction box according to the posterior probability and the detection evaluation value.
11. The virtual image control method according to claim 1, wherein a preset interaction content library is pre-stored in the live streaming providing terminal, the preset interaction content library comprises the preconfigured virtual image interaction content corresponding to each main broadcaster interaction action, and the virtual image interaction content comprises one or a combination of dialogue interaction content, special effect interaction content and body movement interaction content;
The step of obtaining the preconfigured virtual image interaction content corresponding to the main broadcaster interaction action when the main broadcaster interaction action is recognized in a preset number of main broadcaster video frames comprises:
The corresponding virtual image interaction content is obtained from the preset interaction content library according to the recognized main broadcaster interaction action.
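
Editor's note: as a minimal illustration of the preset interaction content library of claim 11, the following Python sketch models it as an in-memory mapping from recognized actions to the three kinds of interaction content. The action names and content values are invented placeholders; the claim does not prescribe a storage format.

PRESET_INTERACTION_CONTENT = {
    "wave": {
        "dialogue": "Hello everyone!",        # dialogue interaction content
        "special_effect": "sparkle_overlay",  # special effect interaction content
        "body_movement": "wave_right_hand",   # body movement interaction content
    },
    "heart_gesture": {
        "dialogue": "Thanks for the support!",
        "special_effect": "floating_hearts",
        "body_movement": "hands_form_heart",
    },
}

def get_interaction_content(recognized_action):
    """Return the virtual image interaction content configured for the
    recognized main broadcaster interaction action (empty dict if none)."""
    return PRESET_INTERACTION_CONTENT.get(recognized_action, {})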
12. A virtual image control device, applied to a live streaming providing terminal, the device comprising:
an interaction action recognition module, configured to input the main broadcaster video frames collected in real time by a video collection device into a pre-trained interaction action recognition model, and to recognize whether the main broadcaster video frames contain a main broadcaster interaction action;
an interaction content obtaining module, configured to obtain the preconfigured virtual image interaction content corresponding to the main broadcaster interaction action when the main broadcaster interaction action is recognized in a preset number of main broadcaster video frames;
a video stream generation module, configured to control, according to the virtual image interaction content, the virtual image in the live streaming interface of the live streaming providing terminal to execute the corresponding interaction action, so as to generate an interaction video stream of the virtual image, and to send the interaction video stream to a live streaming receiving terminal through a live streaming server for playback.
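
Editor's note: to show how the three modules of claim 12 could be chained, the following Python sketch traces the per-frame data flow. The model, renderer and server objects and their recognize/perform/push methods are hypothetical stand-ins, the get_interaction_content lookup is the one sketched under claim 11, and treating the preset number as consecutive frames is an assumption.

def process_frame(frame, model, renderer, server, state, preset_count=5):
    """One pass of the device pipeline: recognition -> content lookup ->
    avatar control and stream push. `state` persists across calls and is
    initialized as {"run": 0, "last": None}."""
    action = model.recognize(frame)             # interaction action recognition module
    if action is None:
        state["run"], state["last"] = 0, None   # streak broken, reset the counter
        return
    state["run"] = state["run"] + 1 if action == state["last"] else 1
    state["last"] = action
    if state["run"] >= preset_count:
        content = get_interaction_content(action)  # interaction content obtaining module
        stream = renderer.perform(content)         # video stream generation module
        server.push(stream)    # forwarded to the live streaming receiving terminal
        state["run"] = 0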
13. An electronic device, comprising one or more storage media and one or more processors in communication with the storage media, wherein the storage media store machine-executable instructions executable by the processors, and when the electronic device runs, the processors execute the machine-executable instructions to perform the virtual image control method according to any one of claims 1-11.
14. A computer-readable storage medium, wherein the computer-readable storage medium stores machine-executable instructions, and the machine-executable instructions, when executed, implement the virtual image control method according to any one of claims 1-11.
CN201910252787.3A 2019-03-29 2019-03-29 Virtual image control method, device and electronic equipment Pending CN109936774A (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201910252787.3A CN109936774A (en) 2019-03-29 2019-03-29 Virtual image control method, device and electronic equipment
PCT/CN2020/081627 WO2020200082A1 (en) 2019-03-29 2020-03-27 Live broadcast interaction method and apparatus, live broadcast system and electronic device
US17/598,733 US20220103891A1 (en) 2019-03-29 2020-03-27 Live broadcast interaction method and apparatus, live broadcast system and electronic device
SG11202111323RA SG11202111323RA (en) 2019-03-29 2020-03-27 Live broadcast interaction method and apparatus, live broadcast system and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910252787.3A CN109936774A (en) 2019-03-29 2019-03-29 Virtual image control method, device and electronic equipment

Publications (1)

Publication Number Publication Date
CN109936774A (en) 2019-06-25

Family

ID=66988795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910252787.3A Pending CN109936774A (en) 2019-03-29 2019-03-29 Virtual image control method, device and electronic equipment

Country Status (1)

Country Link
CN (1) CN109936774A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407889A * 2016-08-26 2017-02-15 上海交通大学 Video human body interaction motion recognition method based on an optical-flow-graph deep learning model
CN106804007A * 2017-03-20 2017-06-06 合网络技术(北京)有限公司 Method, system and device for automatically matching special effects in network live streaming
CN107423721A (en) * 2017-08-08 2017-12-01 珠海习悦信息技术有限公司 Interactive action detection method, device, storage medium and processor
US20190070500A1 (en) * 2017-09-07 2019-03-07 Line Corporation Method and system for providing game based on video call and object recognition
CN107613310A * 2017-09-08 2018-01-19 广州华多网络科技有限公司 Live streaming method, device and electronic equipment
CN107750014A * 2017-09-25 2018-03-02 迈吉客科技(北京)有限公司 Co-hosting ("lianmai") live streaming method and system
CN108960185A (en) * 2018-07-20 2018-12-07 泰华智慧产业集团股份有限公司 Vehicle target detection method and system based on YOLOv2

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020200082A1 (en) * 2019-03-29 2020-10-08 广州虎牙信息科技有限公司 Live broadcast interaction method and apparatus, live broadcast system and electronic device
CN110688008A (en) * 2019-09-27 2020-01-14 贵州小爱机器人科技有限公司 Virtual image interaction method and device
US11503377B2 (en) 2019-09-30 2022-11-15 Beijing Dajia Internet Information Technology Co., Ltd. Method and electronic device for processing data
CN110662083B (en) * 2019-09-30 2022-04-22 北京达佳互联信息技术有限公司 Data processing method and device, electronic equipment and storage medium
CN110662083A (en) * 2019-09-30 2020-01-07 北京达佳互联信息技术有限公司 Data processing method and device, electronic equipment and storage medium
CN111028339A (en) * 2019-12-06 2020-04-17 国网浙江省电力有限公司培训中心 Behavior action modeling method and device, electronic equipment and storage medium
CN111028339B (en) * 2019-12-06 2024-03-29 国网浙江省电力有限公司培训中心 Behavior modeling method and device, electronic equipment and storage medium
WO2021147480A1 (en) * 2020-01-22 2021-07-29 北京达佳互联信息技术有限公司 Live broadcast assistance method and electronic device
CN111325851A (en) * 2020-02-28 2020-06-23 腾讯科技(深圳)有限公司 Image processing method and device, electronic equipment and computer readable storage medium
CN111383313A (en) * 2020-03-31 2020-07-07 歌尔股份有限公司 Virtual model rendering method, device and equipment and readable storage medium
CN113539218B (en) * 2020-04-16 2023-11-17 福建凯米网络科技有限公司 Real-time interaction method and terminal for virtual images
CN113539218A (en) * 2020-04-16 2021-10-22 福建凯米网络科技有限公司 Real-time interaction method and terminal for virtual image
CN111918073B (en) * 2020-06-30 2022-11-04 北京百度网讯科技有限公司 Live broadcast room management method and device
CN111918073A (en) * 2020-06-30 2020-11-10 北京百度网讯科技有限公司 Management method and device of live broadcast room
CN111970570B (en) * 2020-07-17 2022-01-25 北京奇艺世纪科技有限公司 Method and device for prompting video content interaction position
CN111970570A (en) * 2020-07-17 2020-11-20 北京奇艺世纪科技有限公司 Method and device for prompting video content interaction position
CN114793286A (en) * 2021-01-25 2022-07-26 上海哔哩哔哩科技有限公司 Video editing method and system based on virtual image
CN113242440A (en) * 2021-04-30 2021-08-10 广州虎牙科技有限公司 Live broadcast method, client, system, computer equipment and storage medium
WO2023279704A1 (en) * 2021-07-07 2023-01-12 上海商汤智能科技有限公司 Live broadcast method and apparatus, and computer device, storage medium and program
WO2023279713A1 (en) * 2021-07-07 2023-01-12 上海商汤智能科技有限公司 Special effect display method and apparatus, computer device, storage medium, computer program, and computer program product
WO2023016167A1 (en) * 2021-08-09 2023-02-16 惠州Tcl云创科技有限公司 Virtual image video call method, terminal device, and storage medium
CN113435431A * 2021-08-27 2021-09-24 北京市商汤科技开发有限公司 Posture detection method, and neural network model training method, device and equipment
CN114527877A (en) * 2022-02-22 2022-05-24 广州虎牙科技有限公司 Virtual image driving method and device and server
CN114527877B (en) * 2022-02-22 2024-04-09 广州虎牙科技有限公司 Virtual image driving method, device and server
CN115661942A (en) * 2022-12-15 2023-01-31 广州卓远虚拟现实科技有限公司 Action data processing method and system based on virtual reality and cloud platform

Similar Documents

Publication Publication Date Title
CN109936774A (en) Virtual image control method, device and electronic equipment
Matern et al. Exploiting visual artifacts to expose deepfakes and face manipulations
CN110598610B Target saliency detection method based on neural selective attention
CN110148120B (en) Intelligent disease identification method and system based on CNN and transfer learning
Xie et al. Scut-fbp: A benchmark dataset for facial beauty perception
CN105472434B Method and system for implanting content into film and television video
CN109376603A Video recognition method, device, computer equipment and storage medium
WO2022156640A1 (en) Gaze correction method and apparatus for image, electronic device, computer-readable storage medium, and computer program product
CN108198130B (en) Image processing method, image processing device, storage medium and electronic equipment
CN113609896B (en) Object-level remote sensing change detection method and system based on dual-related attention
CN109214366A Local target re-identification method, apparatus and system
Mocanu et al. Deep learning for objective quality assessment of 3d images
CN113192132B (en) Eye catch method and device, storage medium and terminal
Liu et al. Enhanced 3D human pose estimation from videos by using attention-based neural network with dilated convolutions
CN107808376A Hand-raising detection method based on deep learning
CN111143617A (en) Automatic generation method and system for picture or video text description
CN114677754A (en) Behavior recognition method and device, electronic equipment and computer readable storage medium
WO2022148248A1 (en) Image processing model training method, image processing method and apparatus, electronic device, and computer program product
CN110188179B (en) Voice directional recognition interaction method, device, equipment and medium
CN111260687A (en) Aerial video target tracking method based on semantic perception network and related filtering
Abawi et al. GASP: gated attention for saliency prediction
CN108492275A No-reference stereo image quality assessment method based on deep neural network
CN112070181A (en) Image stream-based cooperative detection method and device and storage medium
Chan et al. To start automatic commentary of soccer game with mixed spatial and temporal attention
WO2020200082A1 (en) Live broadcast interaction method and apparatus, live broadcast system and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190625