CN114025219B - Rendering method, device, medium and equipment for augmented reality special effects


Info

Publication number
CN114025219B
CN114025219B
Authority
CN
China
Prior art keywords
data
special effect
rendering
augmented reality
video stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111283354.8A
Other languages
Chinese (zh)
Other versions
CN114025219A (en)
Inventor
黄业龙
杨爽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Boguan Information Technology Co Ltd
Original Assignee
Guangzhou Boguan Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Boguan Information Technology Co Ltd
Priority to CN202111283354.8A
Publication of CN114025219A
Application granted
Publication of CN114025219B
Legal status: Active
Anticipated expiration


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/431 - Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N 21/4312 - Generation of visual interfaces involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N 21/4316 - Generation of visual interfaces for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • H04N 21/439 - Processing of audio elementary streams
    • H04N 21/4398 - Processing of audio elementary streams involving reformatting operations of audio signals
    • H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/44012 - Processing of video elementary streams involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs
    • H04N 21/4402 - Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/440218 - Processing of video elementary streams involving reformatting operations of video signals by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The embodiments of the present disclosure provide a rendering method and device for augmented reality special effects, a computer-readable medium, and a terminal device, and relate to the field of computer technology. The method includes: acquiring a live video stream of an anchor, identifying human body feature data of the anchor from the live video stream, and extracting video stream picture parameters; obtaining trigger scene data based on the human body feature data and special effect data generated when the augmented reality special effect is triggered, wherein the special effect data include at least a special effect type and a special effect trigger time; predicting a rendering position of the augmented reality special effect in the live video stream according to the trigger scene data and the human body feature data; and rendering the augmented reality special effect on the display interface according to the human body feature data, the video stream picture parameters, and the rendering position. By implementing the technical solution of the embodiments of the present disclosure, the accuracy and efficiency of augmented reality rendering can be improved.

Description

Rendering method, device, medium and equipment for augmented reality special effects
Technical Field
The present disclosure relates to the field of computer technology, and in particular to a rendering method for augmented reality special effects, a rendering device for augmented reality special effects, a computer-readable medium, and a terminal device.
Background
With the development of computer technology, augmented reality has come into common use; by overlaying virtual objects on a real picture, video picture information can be made richer.
Taking live streaming as an example, a large number of augmented reality special effects are used in live broadcasts, such as the giving of a virtual gift. When an augmented reality effect is superimposed, it should be rendered at an appropriate position on the picture. In the current scheme, the live video is acquired, the current video picture is visually identified to obtain a rendering position suitable for that picture, and the special effect is then rendered. Because of the recognition response delay, the rendering time is not synchronized and the rendering position is inaccurate, which affects the rendering quality and efficiency of the special effect.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
An object of the embodiments of the present disclosure is to provide a rendering method for augmented reality special effects, a rendering device for augmented reality special effects, a computer-readable medium, and a terminal device, which can improve the accuracy and efficiency of augmented reality rendering.
A first aspect of an embodiment of the present disclosure provides a method for rendering an augmented reality effect, including:
acquiring a live video stream of an anchor, identifying human body feature data of the anchor from the live video stream, and extracting video stream picture parameters;
obtaining trigger scene data based on the human body feature data and special effect data generated when the augmented reality special effect is triggered, wherein the special effect data include at least a special effect type and a special effect trigger time;
predicting a rendering position of the augmented reality special effect in the live video stream according to the trigger scene data and the human body feature data;
and rendering the augmented reality special effect on the display interface according to the human body feature data, the video stream picture parameters, and the rendering position.
In one exemplary embodiment of the present disclosure, predicting a rendering position of the augmented reality effect in the live video stream according to the trigger scene data and the human feature data includes:
inputting the trigger scene data and the human body feature data into a preset algorithm to obtain a predicted rendering position weight of the augmented reality special effect, and determining the rendering position in the live video stream according to the weight.
In an exemplary embodiment of the present disclosure, after the step of inputting the trigger scene data and the human feature data into a preset algorithm to obtain the predicted rendering position weight of the augmented reality special effect, the method further includes:
and adjusting the predicted rendering position weight according to the special effect type.
In an exemplary embodiment of the present disclosure, acquiring the live video stream of the anchor, identifying human body feature data of the anchor from the live video stream, and extracting video stream picture parameters at the same time includes:
performing face recognition and human body recognition on the live video stream picture to obtain human body feature data including face data and mask data, and reading the live video stream to obtain the video stream picture parameters.
In an exemplary embodiment of the present disclosure, inputting the trigger scene data and the human feature data into a preset algorithm to obtain a predicted rendering position weight of the augmented reality special effect includes:
inputting the trigger scene data and the human body feature data generated when the augmented reality special effect is triggered into an evaluation function, and calculating the corresponding predicted rendering position weight;
and updating the predicted rendering position weight through a reinforcement learning algorithm, and outputting the updated predicted rendering position weight.
In one exemplary embodiment of the present disclosure, after the step of rendering the augmented reality special effect on the display interface according to the human body feature data, the video stream picture parameters, and the rendering position, the method further includes:
taking the human body feature data, the video stream picture parameters, and the rendering position as special effect rendering data, packaging the special effect rendering data as data frames, and writing them into a streaming media file together with the audio frames and video frames of the live video stream;
and sending the streaming media file to a server, so that a viewer end, after receiving the streaming media file through the server, plays the live video and renders the augmented reality special effect.
In an exemplary embodiment of the present disclosure, the audience terminal plays live video and renders augmented reality special effects after receiving the streaming media file through the server, including:
decoding the streaming media file to obtain audio frames, video frames, and data frames;
acquiring the anchor's live video picture from the video frames, and identifying the anchor's live video picture to obtain live scene data;
predicting a rendering position of the augmented reality special effect at the viewer end based on the live scene data and the special effect type, wherein the special effect type is extracted from the data frames;
and playing the live video on the display interface according to the audio frames and the video frames, and rendering the augmented reality special effect according to the rendering position weight.
In an exemplary embodiment of the present disclosure, after the step of playing the live video on the display interface according to the audio frames and the video frames and rendering the augmented reality special effect at the rendering position of the viewer end, the method further includes:
generating new view and projection matrices according to the playback setting parameters of the viewer end and the video stream picture parameters, and correcting the special effect rendering position by applying the new view and projection matrices, wherein the video stream picture parameters are extracted from the data frames.
According to a second aspect of embodiments of the present disclosure, there is provided a rendering apparatus of an augmented reality effect, including:
The video analysis module is used for acquiring a live video stream of the anchor, identifying and obtaining human body characteristic data of the anchor according to the live video stream, and extracting picture parameters of the video stream;
The data acquisition module is used for acquiring triggering scene data based on the human body characteristic data and special effect data generated when the augmented reality special effect is triggered, wherein the special effect data at least comprises a special effect type and special effect triggering time;
The rendering position prediction module is used for predicting the rendering position of the augmented reality special effect in the live video stream according to the trigger scene data and the human body characteristic data;
And the interface rendering module is used for rendering the augmented reality special effect on the display interface according to the human body characteristic data, the video stream picture parameters and the rendering position.
In an exemplary embodiment of the present disclosure, a rendering position prediction module is configured to input the trigger scene data and the human feature data into a preset algorithm to obtain a predicted rendering position weight of the augmented reality special effect, and determine a rendering position in the live video stream according to the weight.
In an exemplary embodiment of the present disclosure, the rendering device for augmented reality special effects further includes: a weight adjustment module, which is used for adjusting the predicted rendering position weight according to the special effect type after the trigger scene data and the human body feature data are input into the preset algorithm to obtain the predicted rendering position weight of the augmented reality special effect.
In an exemplary embodiment of the present disclosure, a video analysis module is configured to perform face recognition and human body recognition on a live video stream frame to obtain human body feature data including face data and mask data, and read the live video stream to obtain video stream frame parameters.
In an exemplary embodiment of the present disclosure, the data obtaining module is configured to obtain human feature data and picture parameters according to the video stream data analysis, where the obtaining includes: performing face recognition and human body recognition on the image of the video stream to obtain human body characteristic data comprising face data and mask data; and reading the live video stream to obtain video stream picture parameters.
In an exemplary embodiment of the present disclosure, the rendering position prediction module is used for inputting the trigger scene data and the human body feature data generated when the augmented reality special effect is triggered into an evaluation function and calculating the corresponding predicted rendering position weight;
and for updating the predicted rendering position weight through a reinforcement learning algorithm and outputting the updated predicted rendering position weight.
In an exemplary embodiment of the present disclosure, the apparatus for rendering an augmented reality effect further includes:
a streaming media generation module, which is used for, after the augmented reality special effect is rendered on the display interface according to the human body feature data, the video stream picture parameters, and the rendering position, taking the human body feature data, the video stream picture parameters, and the rendering position as special effect rendering data, packaging the special effect rendering data as data frames, and writing them into a streaming media file together with the audio frames and video frames of the live video stream;
and for sending the streaming media file to a server, so that a viewer end, after receiving the streaming media file through the server, plays the live video and renders the augmented reality special effect.
In an exemplary embodiment of the present disclosure, the interface rendering module is further configured to decode the streaming media file to obtain an audio frame, a video frame, and a data frame;
acquiring the anchor's live video picture from the video frames, and identifying the anchor's live video picture to obtain live scene data;
predicting a rendering position of the augmented reality special effect at the viewer end based on the live scene data and the special effect type, wherein the special effect type is extracted from the data frames;
and playing the live video on the display interface according to the audio frames and the video frames, and rendering the augmented reality special effect at the rendering position of the viewer end.
In an exemplary embodiment of the present disclosure, the rendering device for augmented reality special effects further includes: a rendering correction module, which is used for, after the live video is played on the display interface according to the audio frames and the video frames and the augmented reality special effect is rendered according to the rendering position weight, generating new view and projection matrices according to the playback setting parameters of the viewer end and the video stream picture parameters in the special effect rendering data of the data frames, and correcting the special effect rendering position by applying the new view and projection matrices, wherein the video stream picture parameters are extracted from the data frames.
According to a third aspect of embodiments of the present disclosure, there is provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method of rendering an augmented reality effect as described in the first aspect of the above embodiments.
According to a fourth aspect of embodiments of the present disclosure, there is provided a terminal device, including: one or more processors; and a storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for rendering augmented reality effects as described in the first aspect of the above embodiments.
According to a fifth aspect of the present disclosure, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from the computer-readable storage medium by a processor of a computer device, and executed by the processor, cause the computer device to perform the methods provided in the various alternative implementations described above.
The technical scheme provided by the embodiment of the disclosure can comprise the following beneficial effects:
According to the technical solutions provided by some embodiments of the present disclosure, after the anchor's live video stream is acquired, the human body feature data and the video stream picture parameters can be obtained from the live video stream. Trigger scene data are then acquired, and the rendering position of the augmented reality special effect in the live video stream is predicted according to the trigger scene data and the human body feature data. Finally, the augmented reality special effect is rendered on the display interface according to the human body feature data, the video stream picture parameters, and the rendering position. On the one hand, by analyzing the anchor's human body features and the trigger scene in which the augmented reality special effect is displayed, the special effect rendering position is predicted in advance, which avoids the high delay caused by determining the rendering position of the current video picture through visual recognition before rendering, and improves the efficiency of augmented reality rendering. On the other hand, a rendering position confirmed on the basis of the anchor's human body features and the trigger scene is more accurate, which optimizes the display effect of augmented reality and improves the display quality.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
FIG. 1 schematically illustrates a schematic diagram of an exemplary system architecture of a rendering method of an augmented reality effect and a rendering device of an augmented reality effect to which embodiments of the present disclosure may be applied;
FIG. 2 schematically illustrates a structural diagram of a computer system suitable for use in implementing embodiments of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of a method of rendering augmented reality special effects according to one embodiment of the disclosure;
FIG. 4 schematically illustrates a display interface area division diagram for augmented reality special effects according to one embodiment of the present disclosure;
FIG. 5 schematically illustrates a flow diagram of a method of rendering augmented reality special effects in accordance with one embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow diagram of a method of rendering augmented reality special effects in accordance with one embodiment of the present disclosure;
FIG. 7 schematically illustrates a flow diagram of a method of rendering augmented reality special effects in accordance with one embodiment of the present disclosure;
FIG. 8 schematically illustrates a functional unit diagram according to one embodiment of the present disclosure;
fig. 9 schematically illustrates a block diagram of a structure of a rendering apparatus of an augmented reality effect according to an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the present disclosure. One skilled in the relevant art will recognize, however, that the aspects of the disclosure may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
FIG. 1 illustrates a schematic diagram of a system architecture of an exemplary application environment in which the rendering method for augmented reality special effects and the rendering device for augmented reality special effects of embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include one or more of terminal devices 101, 102, a network 103, and a server 104. The network 103 is the medium used to provide communication links between the terminal devices 101, 102 and the server 104. The network 103 may include various connection types, such as wired or wireless communication links, or fiber optic cables, among others. The terminal devices 101, 102 may be a variety of electronic devices with display screens, including but not limited to desktop computers, portable computers, smart phones, tablet computers, and the like. It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative; there may be any number of terminal devices, networks, and servers, as required by the implementation. For example, the server 104 may be a server cluster formed by a plurality of servers. The terminal device is configured to perform: acquiring the anchor's live video stream, and obtaining the anchor's human body feature data and the video stream picture parameters from the live video stream; obtaining trigger scene data based on the human body feature data and special effect data generated when the augmented reality special effect is triggered, wherein the special effect data include at least a special effect type and a special effect trigger time; predicting a rendering position of the augmented reality special effect in the live video stream according to the trigger scene data and the human body feature data; and rendering the augmented reality special effect on the display interface according to the human body feature data, the video stream picture parameters, and the rendering position.
Fig. 2 shows a schematic diagram of a computer system architecture of a terminal device suitable for use in implementing embodiments of the present disclosure.
It should be noted that the terminal device 200 shown in fig. 2 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 2, the terminal device 200 includes a central processing unit (CPU) 201, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage section 208 into a random access memory (RAM) 203. Various programs and data required for system operation are also stored in the RAM 203. The CPU 201, ROM 202, and RAM 203 are connected to each other through a bus 204. An input/output (I/O) interface 205 is also connected to the bus 204.
The following components are connected to the I/O interface 205: an input section 206 including a keyboard, a mouse, and the like; an output section 207 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 208 including a hard disk or the like; and a communication section 209 including a network interface card such as a LAN card or a modem. The communication section 209 performs communication processing via a network such as the Internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is installed on the drive 210 as needed, so that a computer program read out from it can be installed into the storage section 208 as needed.
In particular, according to embodiments of the present disclosure, the processes described below with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 209, and/or installed from the removable medium 211. The computer program, when executed by a Central Processing Unit (CPU) 201, performs the various functions defined in the methods and apparatus of the present disclosure.
It should be noted that the computer readable medium shown in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware, and the described units may also be provided in a processor. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.
As another aspect, the present disclosure also provides a computer-readable medium that may be contained in the terminal device described in the above embodiment; or may exist alone without being fitted into the terminal device. The computer-readable medium carries one or more programs which, when executed by one of the terminal devices, cause the terminal device to implement the method as described in the following embodiments. For example, the terminal device may implement the steps shown in fig. 3, and so on.
To facilitate an understanding of the embodiments of the present disclosure, the following description explains some of the terms to which the present disclosure pertains:
Augmented reality: augmented reality (Augmented Reality, abbreviated as AR) brings real-world information and virtual-world content together. Entity information that would otherwise be difficult to experience within the spatial range of the real world is simulated on the basis of computer and related technology, and the virtual information content is overlaid onto the real world; after the real environment and the virtual objects are superimposed, they exist in the same picture and space at the same time. Because this process can be perceived by the human senses, a sensory experience beyond reality is achieved.
Referring to fig. 3, fig. 3 schematically illustrates a flowchart of a method of rendering augmented reality special effects according to one embodiment of the disclosure. As shown in fig. 3, the method for rendering the augmented reality special effect is applied to a terminal device with a display interface, and may include:
Step S310: acquiring the anchor's live video stream, and obtaining the anchor's human body feature data and the video stream picture parameters from the live video stream.
Step S320: obtaining trigger scene data based on the human body feature data and special effect data generated when the augmented reality special effect is triggered, wherein the special effect data include at least a special effect type and a special effect trigger time.
Step S330: predicting a rendering position of the augmented reality special effect in the live video stream according to the trigger scene data and the human body feature data.
Step S340: rendering the augmented reality special effect on the display interface according to the human body feature data, the video stream picture parameters, and the rendering position.
By implementing the rendering method for augmented reality special effects shown in fig. 3, the augmented reality special effect can be rendered on the display interface according to the human body feature data, the video stream picture parameters, and the rendering position. On the one hand, by analyzing the anchor's human body features and the trigger scene in which the augmented reality special effect is displayed and predicting the rendering position, the high delay of determining the rendering position of the current picture through visual recognition can be avoided, which improves the efficiency of augmented reality rendering. On the other hand, a rendering position based on the anchor's human body features and the trigger scene is more accurate, which optimizes the display effect of the augmented reality special effect and improves the quality and efficiency of special effect rendering.
Next, the above steps of the present exemplary embodiment will be described in more detail.
In step S310, the anchor's live video stream is acquired, the anchor's human body feature data are obtained by identifying the live video stream, and the video stream picture parameters are extracted.
In the embodiments of the present disclosure, the augmented reality special effect is displayed jointly on multiple terminal devices and may be triggered by any one of them. For example, in a live broadcast scene, both the anchor and the viewers can trigger the augmented reality special effect through augmented reality entries preset by the live broadcast application in their respective terminal devices; these entries may lie within the display interface range of the application or outside it. If the anchor selects an augmented reality template, the augmented reality special effect corresponding to that template can be triggered. Optionally, the anchor may select the augmented reality template in various ways, for example by touching, clicking, double-clicking, long-pressing, or otherwise operating on the template on the display interface of a smart mobile terminal device, or by operating with a mouse on a computer terminal device; the augmented reality special effect to be selected may also be determined by recognizing the anchor's voice; the selection operation may also be completed through a physical key of the terminal device. The embodiments of the present disclosure do not specifically limit how the augmented reality special effect is triggered.
The anchor's live video stream provides the picture of the whole live broadcast room and may include at least two parts: one part is the camera picture, and the other part is the picture displaying other functions of the live broadcast room, such as the anchor's fan ranking list area and the anchor's personal information area. The camera picture is captured by a camera installed on or connected to the anchor's terminal device, and the video stream records the live broadcast picture. Human body feature data can be obtained by identifying the live broadcast picture in the video stream data. The human body feature data may be data obtained by detecting the actions, face, and skeleton feature points of the human body, or other data used to characterize the human body. The object of the human body feature data may be not only the anchor but also other people appearing in the live broadcast picture. The video stream picture parameters are parameters describing the anchor's live video picture and may include the picture proportion, the relative position of the camera picture on the live video picture, the relative size of the camera picture with respect to the live video picture, the picture frame rate, and the like. The embodiments of the present disclosure do not specifically limit the human body feature data and the video stream picture parameters.
In step S320, trigger scene data is obtained based on the human body feature data and special effect data generated when the augmented reality special effect is triggered, wherein the special effect data at least includes a special effect type and a special effect trigger time.
The trigger scene data are data generated during the live broadcast and when the augmented reality special effect is triggered, and may include the real live broadcast environment in which the special effect is triggered, the trigger time point, the live broadcast type, and the like. In the embodiments of the present disclosure, the trigger scene data may include the live broadcast type, the type of the augmented reality special effect, and the special effect trigger time; the data contained in the trigger scene data describe, from multiple angles, the specific circumstances under which the special effect is triggered. The live broadcast type, the special effect type, and the trigger time together indicate what special effect is triggered at what time point under what live broadcast type, for example, a first special effect triggered at two in the afternoon during an outdoor running live broadcast.
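By way of a non-limiting illustration, the trigger scene data could be organized as a simple structure like the following Python sketch; the field names and values are assumptions made for this example and are not prescribed by the present disclosure.

    # Illustrative sketch only: field names and values are assumptions, not a fixed schema.
    trigger_scene_data = {
        "live_type": "outdoor_sports",          # live broadcast type inferred from human body feature data
        "effect_type": "first_special_effect",  # type of the triggered augmented reality special effect
        "trigger_time": "14:00",                # special effect trigger time (two in the afternoon)
    }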
The live broadcast type can be divided according to different classification standards, and it can be determined from the human body feature data obtained from the anchor's live video stream. If the human body feature data indicate that the anchor often broadcasts while seated, the broadcast can be regarded as a stationary live broadcast; if they indicate that the anchor moves frequently, it can be regarded as a sports live broadcast. The broadcast may also be classified according to the anchor's position in the picture; for example, if the anchor often broadcasts on the right side of the picture, it may be classified as a right-side type. The embodiments of the present disclosure do not specifically limit the live broadcast types, and different classification criteria may be set for specific situations.
Based on the different presentation modes and characteristics of augmented reality special effects, different classification systems can be used, for example classification by color, or classification by the object represented by the special effect, such as flower-type, digital-type, vehicle-type, or aircraft-type augmented reality special effects, or types generated on the basis of other classification criteria. The embodiments of the present disclosure do not specifically limit the types of augmented reality special effects.
In step S330, according to the trigger scene data and the human feature data, predicting a rendering position of the augmented reality special effect in the live video stream;
In the embodiments of the present disclosure, because augmented reality special effects often depict objects that actually exist in life, such as airplanes and necklaces, people have fixed expectations about these objects. If the augmented reality special effect is not rendered in a suitable place, the intended effect is not achieved and people find it jarring. For example, rendering an airplane below the display interface contradicts the general understanding that an airplane does not belong below; rendering a necklace effect above the display interface, for example above the anchor's head rather than in the middle of the display interface at the neck, also violates people's consistent understanding of a necklace. In addition, the position of the special effect rendering should not conflict with the subject elements displayed on the display interface, for example a special effect rendered on the anchor's face, or on the introduced object of a product-introduction live broadcast. Therefore, the rendering position of the special effect directly affects the effect that augmented reality can achieve.
It can be understood that the trigger scene data are the scene data at the moment a special effect is triggered in the live broadcast and can describe the features of the live broadcast scene, while the human body feature data describe the features of the anchor or other people related to the live broadcast scene. Together, the trigger scene data and the human body feature data can cover the factors that influence the rendering position of the augmented reality special effect. For example, in a game live broadcast, if the trigger scene data indicate that the anchor broadcasts while seated and that the special effect type is a sports car, and the human body feature data show that the anchor is located in the middle of the picture, the rendering position of the special effect is predicted to lie in the lower middle part of the live video picture.
In step S340, an augmented reality special effect is rendered on the display interface according to the human feature data, the video stream picture parameters and the rendering position.
In the embodiments of the present disclosure, one or more kinds of human body feature data may be used. Face feature point data may be used, with 21, 68, 77, or another number of feature points; the more feature points, the higher the accuracy. The face offset coordinates (x, y, z) may be used for face pose estimation, and human body mask data may be used, possibly containing only the extracted contour in order to reduce the data amount. The present disclosure does not specifically limit the selection of particular human body feature data and picture parameters. Furthermore, since the face feature data are obtained by identifying the anchor or other people in the camera picture, the face feature data can also be corrected according to the video stream picture parameters, such as the relative position of the camera picture on the live video picture and the relative size of the camera picture with respect to the live video picture.
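As a hedged illustration of the correction just mentioned, the following Python sketch maps a face feature point detected in the camera picture onto the full live video picture using the camera picture's relative position and size; the function and parameter names are assumptions of this example.

    def map_to_live_picture(x, y, cam_rect, live_size):
        """Map a normalized face feature point (x, y) from the camera picture to
        normalized coordinates of the full live video picture (sketch; names are assumed)."""
        cam_x, cam_y, cam_w, cam_h = cam_rect   # camera picture rectangle inside the live picture, in pixels
        live_w, live_h = live_size              # full live video picture size, in pixels
        abs_x = cam_x + x * cam_w               # absolute pixel position on the live picture
        abs_y = cam_y + y * cam_h
        return abs_x / live_w, abs_y / live_h   # normalized with respect to the full live picture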
In addition, a uniform resource locator (Uniform Resource Locator, abbreviated as URL) can be used to indicate the download path of the augmented reality special effect, so as to obtain the special effect file corresponding to the special effect type.
In the embodiments of the present disclosure, as shown in fig. 4, the content of the terminal display interface can generally be divided as follows: a software main window 401, which is the window range of the whole live broadcast application, is part of the terminal display interface and can be scaled to adjust its size; the area in which the augmented reality special effect can be displayed is consistent with the size of the whole software main window; a video display area 402 for displaying the live picture is part of the software main window and can likewise be scaled. Part of the video display area 402 is the camera picture area 403. The embodiments of the present disclosure do not impose particular restrictions on the division of the terminal display interface.
In the embodiments of the present disclosure, the augmented reality special effect may be rendered on the display interface by creating an augmented reality window at the rendering position on the display interface; the augmented reality window may be a semi-transparent window and lies within the window range of the live broadcast application. The rendering position is determined by prediction, and an augmented reality window is created at that position for rendering. Using the rendering data, the augmented reality special effect is rendered by an augmented reality player. The augmented reality player is a special player that can play the video and also play the augmented reality special effect on top of it, whereas with an ordinary video player only the video picture can be seen and the sound heard, not the augmented reality special effect outside the video. In addition, a countdown prompt may be displayed on the display interface before the augmented reality special effect is rendered, to remind the anchor and the users that the special effect will be rendered after a certain period of time.
In one embodiment of the present disclosure, there is further provided an implementation method for predicting a rendering position of the augmented reality effect in the live video stream according to the trigger scene data and the human feature data, including:
inputting the trigger scene data and the human body feature data into a preset algorithm to obtain a predicted rendering position weight of the augmented reality special effect, and determining the rendering position in the live video stream according to the weight.
In the embodiments of the present disclosure, the preset algorithm is set in advance; the trigger scene data and human body feature data of each triggering of the augmented reality special effect are input into the algorithm so that the triggered augmented reality special effect gradually adapts to the current live broadcast scene. Each time the special effect is triggered, the trigger scene data and human body feature data are collected once and used as the input for calculating the predicted rendering position weight. The preset algorithm may belong to one of three categories: selecting behaviors by value, selecting behaviors directly, or modeling the environment and learning from it. For example, for selecting behaviors by value, the Q-learning algorithm, the Sarsa algorithm, or the Deep Q Network (DQN) algorithm may be chosen. The embodiments of the present disclosure do not specifically limit the preset algorithm.
Further, in an embodiment of the present disclosure, after the step of inputting the trigger scene data and the human feature data into a preset algorithm to obtain the predicted rendering position weight of the augmented reality special effect, the method further includes:
and adjusting the predicted rendering position weight according to the special effect type.
It will be appreciated that many of the effects rendered by augmented reality come from objects in real life, such as airplanes, flowers, and yachts. Based on the characteristics of these real-life objects, the rendering position of the corresponding augmented reality special effect should lie in a reasonable range, for example an airplane above the picture and a flower below it. A mapping relationship can therefore be established between the type of the augmented reality special effect and the different display interface regions of the picture, and a larger weight is given to the region of the display interface corresponding to the special effect type, so that the effect is more likely to be displayed in the corresponding region; in other words, the predicted rendering position weight is adjusted differently according to the type of the augmented reality special effect.
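As one hedged sketch of this adjustment (the region names, the type-to-region mapping, and the boost factor below are assumptions for illustration only), the predicted rendering position weights could be scaled according to the special effect type as follows:

    # Illustrative mapping from special effect type to a preferred display interface region.
    REGION_PRIOR = {
        "aircraft": "top",     # aircraft-type effects favour the upper part of the picture
        "flower": "bottom",    # flower-type effects favour the lower part of the picture
        "necklace": "middle",
    }

    def adjust_weights_by_type(position_weights, effect_type, boost=1.5):
        """Give a larger weight to the region mapped to the effect type, so that the
        special effect is more likely to be displayed in the corresponding region."""
        preferred = REGION_PRIOR.get(effect_type)
        if preferred in position_weights:
            position_weights[preferred] *= boost
        return position_weights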
In the embodiment of the disclosure, the output predicted rendering position weight is further adjusted through the type of the augmented reality special effect, so that the accuracy of the final output predicted rendering position weight can be improved, and the rendering quality of the augmented reality special effect is improved.
In an embodiment of the present disclosure, an implementation is further provided for acquiring the anchor's live video stream and obtaining the anchor's human body feature data and the video stream picture parameters from the live video stream, including:
performing face recognition and human body recognition on the live video stream picture to obtain human body feature data including face data and mask data, and reading the live video stream to obtain the video stream picture parameters.
In the embodiments of the present disclosure, the face data may include face feature points and a face pose estimation. Optionally, obtaining the face feature points is performed on the basis of face detection, locating feature points on the face such as the corners of the mouth and the corners of the eyes. Cascaded-regression CNN face feature point detection, dlib face feature point detection, libfacedetect face feature point detection, or SeetaFace face feature point detection may be used. For example, when dlib is used for face feature point detection, a model trained on the basis of dlib is required; 68 points are then marked with the model, OpenCV is used for image processing, the 68 points are drawn on the face, and their serial numbers are marked. Other face feature point detection methods may also be used; the embodiments of the present disclosure do not specifically limit how the face feature points are obtained.
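A minimal Python sketch of the dlib-based 68-point detection described above might look as follows; the predictor file name and input image are assumptions, and the standard pre-trained model must be obtained separately.

    import cv2
    import dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed model path

    frame = cv2.imread("live_frame.jpg")            # one picture from the live video stream (assumed file)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    for face in detector(gray):                     # face detection first
        shape = predictor(gray, face)               # then the 68 feature points on that face
        for i in range(68):
            pt = (shape.part(i).x, shape.part(i).y)
            cv2.circle(frame, pt, 2, (0, 255, 0), -1)                      # draw the point with OpenCV
            cv2.putText(frame, str(i), pt, cv2.FONT_HERSHEY_SIMPLEX,
                        0.3, (0, 0, 255), 1)                               # mark its serial number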
Face pose estimation determines the face pose through three Euler angles: pitch, yaw, and roll. Pitch denotes the pitch angle (rotation about the x-axis), yaw the yaw angle (rotation about the y-axis), and roll the roll angle (rotation about the z-axis), respectively representing up-down rotation, left-right rotation, and in-plane rotation. Model-based, appearance-based, and classification-based approaches may be used for face pose estimation. For example, with a model-based approach, a 3D face model with n key points is first defined, n being chosen according to the required accuracy: the more key points, the higher the accuracy. For example, a 3D face model with six key points (left eye corner, right eye corner, nose tip, left mouth corner, right mouth corner, chin) is selected; face detection and face key point detection are used to obtain the 2D face key points corresponding to the 3D face model; the rotation vector is solved with the solvePnP function of OpenCV; and the rotation vector is converted into Euler angles. Other face pose estimation methods may also be used; the present embodiment is not specifically limited in this respect.
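The model-based estimation just described could be sketched in Python with OpenCV's solvePnP as follows; the 3D key-point coordinates, the focal-length guess, and the Euler-angle convention are assumptions of this sketch rather than values fixed by the present disclosure.

    import cv2
    import numpy as np

    # Approximate 3D coordinates of the six key points (nose tip, chin, eye corners, mouth corners).
    MODEL_3D = np.array([
        (0.0, 0.0, 0.0),        # nose tip
        (0.0, -63.6, -12.5),    # chin
        (-43.3, 32.7, -26.0),   # left eye corner
        (43.3, 32.7, -26.0),    # right eye corner
        (-28.9, -28.9, -24.1),  # left mouth corner
        (28.9, -28.9, -24.1),   # right mouth corner
    ], dtype=np.float64)

    def estimate_pose(image_points, frame_size):
        """image_points: 6x2 array of the corresponding 2D face key points; returns pitch, yaw, roll."""
        h, w = frame_size
        camera_matrix = np.array([[w, 0, w / 2],
                                  [0, w, h / 2],
                                  [0, 0, 1]], dtype=np.float64)   # rough focal-length assumption
        dist_coeffs = np.zeros((4, 1))                            # assume no lens distortion
        _, rvec, _ = cv2.solvePnP(MODEL_3D, image_points, camera_matrix, dist_coeffs)
        rot, _ = cv2.Rodrigues(rvec)                              # rotation vector to rotation matrix
        pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))      # rotation about the x-axis
        yaw = np.degrees(np.arctan2(-rot[2, 0], np.hypot(rot[2, 1], rot[2, 2])))  # about the y-axis
        roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))       # in-plane rotation about the z-axis
        return pitch, yaw, roll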
For human body recognition, a threshold-based, region-based or edge-based segmentation method can be adopted. For example, threshold segmentation transforms the input image f into the output image g with a threshold T: g(i, j) = 1 for pixels of the object and g(i, j) = 0 for pixels of the background. The key of a threshold segmentation algorithm is to determine a suitable threshold; the threshold is compared with the gray value of each pixel, the pixels are segmented in parallel, and the segmentation result directly yields the image region. Human body mask data is then obtained from the segmented region, the contour of the mask is extracted, and the contour coordinates are sampled as points. The present embodiment does not limit how the human mask data is acquired.
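The following is a minimal sketch of the threshold segmentation and contour sampling described above; the fixed threshold value and file name are assumptions, and an adaptive threshold (e.g. Otsu's method) could be used instead:

```python
import cv2

frame = cv2.imread("live_frame.jpg")                 # assumed input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

T = 127                                              # segmentation threshold (assumed)
_, mask = cv2.threshold(gray, T, 255, cv2.THRESH_BINARY)   # object -> 255, background -> 0

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Keep the largest contour as the human outline and sample its coordinate points.
outline = max(contours, key=cv2.contourArea).reshape(-1, 2)   # N x 2 array of (x, y) points
print(outline[:10])                                  # first few sampled contour points
```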
In the disclosed embodiment, the video stream picture parameters may include the original size of the camera picture (camera_size). Since the camera picture is actually only a portion of the complete live video picture, the video stream picture parameters also include the actual position of the camera picture within the video picture. The video stream picture parameters may further include the resolution of the video.
Optionally, the acquired face data and video stream picture parameters are encapsulated as JSON; the contour coordinate points of the human body mask data are stored in binary form, base64-encoded, and placed into the JSON structure to facilitate data transmission.
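A minimal sketch of this packaging is shown below; the JSON field names are illustrative assumptions rather than a defined wire format:

```python
import base64
import json
import numpy as np

def pack_render_data(face_points, pose, camera_size, camera_rect, outline):
    """Package face data, picture parameters and the mask contour into one JSON string."""
    # Store the contour points as raw binary (int32 x,y pairs), then base64-encode them.
    mask_blob = base64.b64encode(np.asarray(outline, dtype=np.int32).tobytes()).decode("ascii")
    payload = {
        "face": {"points": face_points, "pose": pose},      # landmarks + (pitch, yaw, roll)
        "picture": {"camera_size": camera_size,             # original camera picture size
                    "camera_rect": camera_rect},             # camera picture position in the video
        "mask": mask_blob,                                   # base64-encoded binary contour
    }
    return json.dumps(payload)
```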
As shown in fig. 5, fig. 5 schematically illustrates a flowchart of a method of rendering augmented reality special effects according to one embodiment of the present disclosure. In one embodiment of the present disclosure, inputting the trigger scene data and the human feature data into a preset algorithm to obtain a predicted rendering position weight of the augmented reality special effect, including:
Step S510, inputting the trigger scene data and the human body feature data at the moment the augmented reality special effect is triggered into an evaluation function, and calculating the predicted rendering position weight corresponding to that trigger scene data and human body feature data.
The main task of the evaluation function is to estimate the importance of each candidate rendering position so as to determine its priority. Various evaluation indexes may be used in the evaluation function, such as root mean square error (RMSE), R-squared (R2), mean absolute percentage error (MAPE), mean absolute error (MAE), the Theil inequality coefficient (TIC), and the like. The trigger scene data and the human body feature data are the factors influencing the rendering position; after each augmented reality trigger, the specific trigger scene data and human body feature data at the moment of triggering are input into the evaluation function to obtain the predicted rendering position weight for the live broadcast conditions at that moment. An illustrative sketch of such an evaluation function is given below.
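As a purely illustrative sketch (the feature construction, the candidate regions and the weight vector below are assumptions, not part of this disclosure), such an evaluation function might score candidate rendering regions as follows:

```python
import numpy as np

def evaluate_positions(candidate_regions, scene_vec, body_vec, weights):
    """Score each candidate rendering region; a higher score means a higher priority.

    candidate_regions -- mapping of region id -> numeric feature vector for that region
    scene_vec         -- numeric features derived from the trigger scene data
    body_vec          -- numeric features derived from the human body feature data
    weights           -- weight vector matching the concatenated feature length
    """
    scores = {}
    for region_id, region_vec in candidate_regions.items():
        features = np.concatenate([scene_vec, body_vec, region_vec])
        scores[region_id] = float(np.dot(weights, features))
    return scores
```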
Step S520, updating the predicted rendering position weight through a reinforcement learning algorithm, and outputting the updated predicted rendering position weight.
In an embodiment of the present disclosure, the predicted rendering position weight is updated with a reinforcement learning algorithm: a value-based approach (action-value function), a policy-based approach that computes the policy function directly, a model-based approach that estimates a model of the environment, or another algorithm may be employed. For example, the Bellman update NewQ = Q + α[R + γ·maxQ' − Q] is used to iterate the weight, where NewQ is the weight after the iterative update and Q is the previous weight of a given trigger scene; α is a weight bias parameter that is tuned manually; R is the previously calculated reward value; γ is the weight given to each reward/penalty value; and γ·maxQ' is the best weight that may be predicted for the next step. The result is the iteratively updated predicted rendering position weight. By continuously updating the calculation through the reinforcement learning algorithm, trigger scenes of many types of augmented reality special effects can be accommodated. A minimal sketch of this update follows.
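A minimal sketch of the quoted update rule, with illustrative parameter values, is shown here:

```python
def update_weight(Q, reward, max_next_Q, alpha=0.1, gamma=0.9):
    """One iteration of NewQ = Q + alpha * (R + gamma * maxQ' - Q).

    Q          -- previous weight of this trigger scene
    reward     -- reward/penalty value R computed for the last rendering
    max_next_Q -- best weight predicted for the next step (maxQ')
    alpha      -- weight bias parameter, tuned manually (assumed value)
    gamma      -- weight applied to the predicted next-step value (assumed value)
    """
    return Q + alpha * (reward + gamma * max_next_Q - Q)

# Example: Q = 0.5, R = 1.0, maxQ' = 0.8  ->  0.5 + 0.1 * (1.0 + 0.72 - 0.5) = 0.622
new_q = update_weight(0.5, 1.0, 0.8)
```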
In the embodiment of the disclosure, the predicted rendering position weight of each augmented reality trigger is calculated through the evaluation function and iteratively updated through the reinforcement learning algorithm, so that the prediction of the rendering position is more accurate, the process of determining the rendering position is more efficient, and the rendering quality of the augmented reality special effect is ensured.
As shown in fig. 6, fig. 6 schematically illustrates a flowchart of a method of rendering augmented reality special effects according to one embodiment of the present disclosure. In one embodiment of the present disclosure, after the step of rendering the augmented reality special effects on the display interface according to the human feature data, the video stream picture parameters, and the rendering position, the method comprises:
Step S610, taking the human body feature data, the video stream picture parameters and the rendering position as special effect rendering data, taking the special effect rendering data as a data frame, and writing the special effect rendering data into a streaming media file together with an audio frame and a video frame in the live video stream.
In the embodiment of the disclosure, in a live broadcast service for example, the viewer end is the terminal device that receives the live video distributed by the live broadcast server. The viewer end needs the complete data, including audio data, video data and special effect rendering data, to perform augmented reality rendering: the real live picture is reconstructed from the audio and video data, and the augmented reality special effect is then overlaid and rendered on top of that real picture. The special effect rendering data can therefore be written, as data frames, into the streaming media file together with the audio frames and video frames.
Alternatively, the Real-Time Messaging Protocol (RTMP), the Real-Time Streaming Protocol (RTSP) or other protocols may be used for transmitting the streaming media file. For example, when RTMP is employed, the RTMP stream consists of video frames, audio frames and data frames. The video frames, audio frames and data frames are given the same timestamp when written into the RTMP stream, i.e. each frame of video, each frame of audio and each frame of special effect rendering data carries the same timestamp. This ensures that no temporal misalignment occurs when the video is played and the augmented reality special effect is rendered, avoiding the problem of the effect rendering being out of sync with the video picture. A minimal sketch of this timestamp tagging follows.
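The sketch below only illustrates the idea of giving one video frame, one audio frame and one special effect data frame the same timestamp before they are written to the stream; the frame structure is an assumption and no real RTMP library is used:

```python
import time

def make_synchronized_frames(video_payload, audio_payload, effect_render_json):
    """Tag one video, audio and special-effect data frame with a shared timestamp."""
    ts = int(time.time() * 1000)          # shared timestamp in milliseconds
    return [
        {"type": "video", "timestamp": ts, "payload": video_payload},
        {"type": "audio", "timestamp": ts, "payload": audio_payload},
        {"type": "data",  "timestamp": ts, "payload": effect_render_json},
    ]
```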
Step S620, sending the streaming media file to a server; after receiving the streaming media file through the server, the viewer end plays the live video and renders the augmented reality special effect.
In many cases, augmented reality special effects need to be rendered on multiple devices, for example in a live broadcast service or in a video call. In a live broadcast, if a viewer gives a gift to the anchor, the gift effect needs to be rendered on the viewer's own terminal device, on the anchor's terminal device, and on the terminal device of every other viewer watching the live broadcast. The live video is generated at the anchor end, the anchor end pushes it to the server, and the viewer end pulls it from the server for display on its terminal. Because the live video picture is pushed to the live broadcast server by the anchor end, the anchor end does not need to pull the video stream data back from the server; there is no transmission of the video stream and therefore no delay caused by it, and the rendering data used at the anchor end can be uncompressed special effect rendering data.
When the viewer end receives the streaming media file, it detects whether the file contains data frames, i.e. whether it contains special effect rendering data for rendering the augmented reality special effect. When special effect rendering data exists, the viewer end creates an augmented reality player on the display interface while playing the video, and renders the augmented reality special effect in that player using the special effect rendering data.
As shown in fig. 7, fig. 7 schematically illustrates a flowchart of a method of rendering augmented reality special effects according to one embodiment of the present disclosure. In one embodiment of the present disclosure, the audience terminal plays live video and renders the augmented reality special effect after receiving the streaming media file through the server, including:
In step S710, the streaming media file is decoded to obtain audio frames, video frames and data frames.
Because live broadcasting is a low-delay, real-time picture transmission, the received RTMP stream needs to be decoded continuously, and the video frames, audio frames and data frames are decoded according to the standard RTMP format.
Step S720, acquiring the anchor's live video picture according to the video frames, and identifying the anchor's live video picture to obtain live scene data.
The live picture displayed by the video frames is identified to obtain the live scene data of the currently triggered augmented reality special effect, including the live scene and the live broadcast type. The live scene identifies the real environment in which the anchor is located.
Step S730, inputting the live scene data and the special effect type into a preset algorithm, and predicting the rendering position of the augmented reality special effect at the audience end.
The embodiment of the disclosure does not repeat the description of the live scene data and the type of the augmented reality special effect here; the viewer end outputs the predicted rendering position weight based on the same calculation process and determines the rendering position from that weight.
Step S740, playing the live video on the display interface according to the audio frames and video frames, and rendering the augmented reality special effect according to the rendering position at the viewer end.
In one embodiment of the present disclosure, after the step of playing the live video on the display interface according to the audio frames and video frames and rendering the augmented reality special effect according to the rendering position weight, the method further includes:
generating a new view and projection matrix according to the play setting parameters of the viewer end and the video stream picture parameters in the special effect rendering data of the data frame, and correcting the special effect rendering position by applying the new view and projection matrix.
According to the play setting parameters of the viewer end, including the window size of the augmented reality player (canvas_size) and the position Rect1 of the video picture relative to the main window of the video player, together with the original video size (video_size), the face pose estimate in the rendering data, and the position Rect0 of the camera picture relative to the original video, a new view matrix V and projection matrix P are generated. At the time point corresponding to the timestamp of the augmented reality special effect, the new V and P matrices are applied when playing the effect, thereby correcting the preset rendering position of the augmented reality special effect. An illustrative sketch of this coordinate correction follows.
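As a purely illustrative sketch of the coordinate correction (the rectangle format (x, y, w, h) and the example values are assumptions; a full implementation would also account for Rect0 and the face pose when building V and P), mapping an effect position from original-video coordinates into the AR player window might look like this:

```python
import numpy as np

def correction_matrix(video_size, rect1):
    """3x3 transform from original-video pixel coordinates to AR player canvas coordinates."""
    vx, vy = video_size
    x1, y1, w1, h1 = rect1               # where the video picture sits in the player window
    sx, sy = w1 / vx, h1 / vy            # scale from video pixels to canvas pixels
    return np.array([[sx, 0, x1],
                     [0, sy, y1],
                     [0,  0,  1]], dtype=np.float64)

def correct_point(matrix, point):
    p = matrix @ np.array([point[0], point[1], 1.0])
    return float(p[0]), float(p[1])

# Example with assumed values: a 1920x1080 source shown in a 960x540 region
# offset by (20, 60) inside the player window.
M = correction_matrix((1920, 1080), (20, 60, 960, 540))
print(correct_point(M, (960, 540)))      # centre of the video -> (500.0, 330.0)
```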
In the embodiment of the disclosure, after the predicted rendering position weight is obtained at the viewer end, the rendering position of the augmented reality special effect is further adjusted based on the window size of the augmented reality player, the original video size and the several relative position relationships, which improves the accuracy of the rendering position.
In a specific scenario, the main functions of the anchor end and the viewer end can be illustrated by fig. 8. As shown in fig. 8, the anchor end has an audio/video writing unit through which audio frames and video frames are written into the streaming media file; a human body data recognition module which performs face recognition and human body recognition to obtain human feature data including face data and mask data; a special effect rendering data compression unit which compresses the special effect rendering data generated at the anchor end, reducing the transmission load of the streaming media file and improving transmission efficiency; and a rendering position learning prediction module which predicts the rendering position of the augmented reality special effect in advance to improve the accuracy and efficiency of rendering.
The streaming media file is forwarded between the anchor end and the viewer end through the server so that the viewer end can obtain the live video. The streaming media file contains three types of data: audio frames, video frames, and special effect rendering data frames.
The viewer end is provided with an audio/video playing unit for extracting the audio frames and video frames from the streaming media file and playing the ordinary live video; a special effect rendering data extraction unit for decompressing and processing the special effect rendering data; a special effect rendering position processing module for predicting the rendering position of the augmented reality special effect at the viewer end according to the live scene data and the special effect type; and finally a special effect display module for completing the rendering of the augmented reality special effect at the viewer end's rendering position.
Further, in this example embodiment, a rendering apparatus for an augmented reality special effect is also provided. Referring to fig. 9, the apparatus 900 for rendering an augmented reality effect may include:
the video analysis module 901 is used for acquiring the live video stream of the anchor, and acquiring the anchor's human body feature data and the video stream picture parameters according to the live video stream;
The data obtaining module 902 is configured to obtain trigger scene data based on the human feature data and special effect data generated when the augmented reality special effect is triggered, where the special effect data at least includes a special effect type and a special effect trigger time;
A rendering position prediction module 903, configured to predict a rendering position of the augmented reality effect in the live video stream according to the trigger scene data and the human feature data;
and an interface rendering module 904, configured to render an augmented reality special effect on the display interface according to the human feature data, the video stream picture parameter and the rendering position.
In an exemplary embodiment of the present disclosure, a rendering position prediction module 903 is configured to input the trigger scene data and the human feature data into a preset algorithm to obtain a predicted rendering position weight of the augmented reality special effect, and determine a rendering position in the live video stream according to the weight.
In an exemplary embodiment of the present disclosure, the apparatus for rendering an augmented reality special effect further includes: a weight adjusting module, configured to adjust the predicted rendering position weight according to the special effect type after the trigger scene data and the human body feature data have been input into the preset algorithm and the predicted rendering position weight of the augmented reality special effect has been obtained.
In an exemplary embodiment of the present disclosure, a video analysis module 901 is configured to perform face recognition and human body recognition on a live video stream frame to obtain human body feature data including face data and mask data, and read the live video stream to obtain video stream frame parameters.
In an exemplary embodiment of the present disclosure, the data obtaining module 902 is configured to obtain the human body feature data and the picture parameters by analyzing the video stream data, where the steps include: performing face recognition and human body recognition on the video stream picture to obtain human body feature data including face data and mask data; and analyzing the video stream data, where the picture parameters read include shooting picture parameters and video stream picture parameters.
In an exemplary embodiment of the present disclosure, the rendering position prediction module 903 is configured to input the trigger scene data and the human body feature data at the moment the augmented reality special effect is triggered into an evaluation function, and calculate the corresponding predicted rendering position weight;
And iteratively updating the predicted rendering position weight through a reinforcement learning algorithm, and outputting the iteratively updated predicted rendering position weight.
In an exemplary embodiment of the present disclosure, the apparatus for rendering an augmented reality effect further includes:
a streaming media generation module, configured to, after the augmented reality special effect has been rendered on the display interface according to the human body feature data, the video stream picture parameters and the rendering position, take the human body feature data, the video stream picture parameters and the rendering position as special effect rendering data, take the special effect rendering data as data frames, and write the special effect rendering data into a streaming media file together with the audio frames and video frames in the live video stream;
and sending the streaming media file to a server, and enabling a spectator to play the live video and render the augmented reality special effect after receiving the streaming media file through the server.
In an exemplary embodiment of the present disclosure, the interface rendering module 904 is further configured to decode the streaming media file to obtain an audio frame, a video frame, and a data frame;
Acquiring the anchor's live video picture according to the video frames, and identifying the anchor's live video picture to obtain live scene data;
Inputting the live scene data and the special effect type into a preset algorithm, and predicting the rendering position of the augmented reality special effect at the audience side;
and playing live video on the display interface according to the audio frames and the video frames, and rendering the augmented reality special effect according to the rendering position weight.
In an exemplary embodiment of the present disclosure, the apparatus for rendering an augmented reality special effect further includes: a rendering correction module, configured to, after the live video has been played on the display interface according to the audio frames and video frames and the augmented reality special effect has been rendered according to the rendering position weight, generate a new view and projection matrix according to the play setting parameters of the viewer end and the video stream picture parameters in the special effect rendering data of the data frames, and correct the special effect rendering position by applying the new view and projection matrix.
The foregoing apparatus is used for executing the method provided in the foregoing embodiment, and its implementation principle and technical effects are similar, and are not described herein again.
The above modules may be one or more integrated circuits configured to implement the above methods, for example one or more Application Specific Integrated Circuits (ASIC), one or more Digital Signal Processors (DSP), or one or more Field Programmable Gate Arrays (FPGA), and the like. For another example, when a module above is implemented in the form of a processing element scheduling program code, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor that can invoke the program code. For another example, the modules may be integrated together and implemented in the form of a System-on-a-Chip (SoC).
Optionally, the present invention also provides a program product, such as a computer readable storage medium, comprising a program for performing the above-described method embodiments when being executed by a processor.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The integrated units implemented in the form of software functional units described above may be stored in a computer readable storage medium. Such a software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods according to the embodiments of the invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or the like.
The foregoing is merely illustrative of embodiments of the present application, and the present application is not limited thereto. Any changes or substitutions that can be readily conceived by those skilled in the art within the technical scope disclosed by the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims (9)

1. The method for rendering the augmented reality special effect is characterized by being applied to a terminal device with a display interface and comprising the following steps:
acquiring a live video stream of a main broadcasting, identifying and obtaining human body characteristic data of the main broadcasting according to the live video stream, and extracting picture parameters of the video stream;
obtaining trigger scene data based on the human body characteristic data and the special effect data generated when the augmented reality special effect is triggered, wherein the special effect data at least comprises a special effect type and a special effect trigger time, and the trigger scene data is scene data at the moment the special effect is triggered in the live broadcast;
Inputting the trigger scene data and the human body characteristic data into a preset algorithm to obtain a predicted rendering position weight of the augmented reality special effect;
according to the special effect type, adjusting the predicted rendering position weight;
determining a rendering position in the live video stream according to the weight;
And rendering the augmented reality special effect on the display interface according to the human body characteristic data, the video stream picture parameters and the rendering position.
2. The method according to claim 1, wherein the obtaining the live video stream of the anchor, identifying the human feature data of the anchor according to the live video stream, and extracting the video stream picture parameters at the same time, includes:
performing face recognition and human body recognition on the live video stream picture to obtain human body characteristic data including face data and mask data, and reading the live video stream to obtain the video stream picture parameters.
3. The method according to claim 1, wherein the inputting the trigger scene data and the human feature data into a preset algorithm to obtain the predicted rendering position weight of the augmented reality effect comprises:
inputting the trigger scene data and the human body characteristic data at the moment the augmented reality special effect is triggered into an evaluation function, and calculating the corresponding predicted rendering position weight;
And updating the predicted rendering position weight through a reinforcement learning algorithm, and outputting the updated predicted rendering position weight.
4. The method of claim 1, further comprising, after the step of rendering the augmented reality effect on the display interface according to the human body feature data, the video stream picture parameters, and the rendering location:
Taking the human body characteristic data, the video stream picture parameters and the rendering positions as special effect rendering data, taking the special effect rendering data as data frames, and writing the special effect rendering data into a streaming media file together with audio frames and video frames in the live video stream;
and sending the streaming media file to a server, and enabling a spectator to play the live video and render the augmented reality special effect after receiving the streaming media file through the server.
5. The method of claim 4, wherein the viewer-side playing live video and rendering augmented reality effects after receiving the streaming media file via the server comprises:
Decoding the streaming media file to obtain an audio frame, a video frame and a data frame;
Acquiring the anchor's live video picture according to the video frames, and identifying the anchor's live video picture to obtain live scene data;
Predicting a rendering position of the augmented reality effect at a viewer end based on the live scene data and the effect type, wherein the effect type is extracted from the data frame;
And playing live video on the display interface according to the audio frames and the video frames, and rendering the augmented reality special effect at the rendering position of the audience terminal.
6. The method of claim 5, further comprising, after the step of playing the live video on the display interface according to the audio frames and video frames and rendering the augmented reality special effect at the rendering position of the viewer end:
Generating new view and projection matrix according to the play setting parameters of the audience and the video stream picture parameters, and correcting special effect rendering positions by applying the new view and projection matrix, wherein the video stream picture parameters are extracted from the data frames.
7. An augmented reality special effect rendering device, which is applied to a terminal device with a display interface, comprising:
The video analysis module is used for acquiring a live video stream of the anchor, identifying and obtaining human body characteristic data of the anchor according to the live video stream, and extracting picture parameters of the video stream;
The data acquisition module is used for acquiring triggering scene data based on the human body characteristic data and special effect data generated when the augmented reality special effect is triggered, wherein the special effect data at least comprises a special effect type and special effect triggering time, and the triggering scene data is scene data when the special effect is triggered in live broadcast;
The rendering position prediction module is used for inputting the trigger scene data and the human body characteristic data into a preset algorithm to obtain a predicted rendering position weight of the augmented reality special effect; according to the special effect type, adjusting the predicted rendering position weight; determining a rendering position in the live video stream according to the weight;
And the interface rendering module is used for rendering the augmented reality special effect on the display interface according to the human body characteristic data, the video stream picture parameters and the rendering position.
8. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any one of claims 1-6.
9. A terminal device, comprising:
One or more processors;
Storage means for storing one or more programs which when executed by the one or more processors cause the one or more processors to implement the method of any of claims 1-6.
CN202111283354.8A 2021-11-01 2021-11-01 Rendering method, device, medium and equipment for augmented reality special effects Active CN114025219B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111283354.8A CN114025219B (en) 2021-11-01 2021-11-01 Rendering method, device, medium and equipment for augmented reality special effects

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111283354.8A CN114025219B (en) 2021-11-01 2021-11-01 Rendering method, device, medium and equipment for augmented reality special effects

Publications (2)

Publication Number Publication Date
CN114025219A CN114025219A (en) 2022-02-08
CN114025219B true CN114025219B (en) 2024-06-04

Family

ID=80059314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111283354.8A Active CN114025219B (en) 2021-11-01 2021-11-01 Rendering method, device, medium and equipment for augmented reality special effects

Country Status (1)

Country Link
CN (1) CN114025219B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114173155B (en) * 2022-02-09 2022-05-10 檀沐信息科技(深圳)有限公司 Virtual live image processing method
CN117994472A (en) * 2022-03-18 2024-05-07 郑州泽正技术服务有限公司 Method and system for carrying out real social contact by utilizing virtual scene and AR glasses
CN114760492B (en) * 2022-04-22 2023-10-20 咪咕视讯科技有限公司 Live special effect generation method, device and system and computer readable storage medium
CN115499678B (en) * 2022-09-20 2024-04-09 广州虎牙科技有限公司 Video live broadcast method and device and live broadcast server
CN115396698A (en) * 2022-10-26 2022-11-25 讯飞幻境(北京)科技有限公司 Video stream display and processing method, client and cloud server
CN115720279B (en) * 2022-11-18 2023-09-15 杭州面朝信息科技有限公司 Method and device for showing arbitrary special effects in live broadcast scene

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109040766A (en) * 2018-08-27 2018-12-18 百度在线网络技术(北京)有限公司 live video processing method, device and storage medium
CN110475150A (en) * 2019-09-11 2019-11-19 广州华多网络科技有限公司 The rendering method and device of virtual present special efficacy, live broadcast system
CN110557649A (en) * 2019-09-12 2019-12-10 广州华多网络科技有限公司 Live broadcast interaction method, live broadcast system, electronic equipment and storage medium
CN112544070A (en) * 2020-03-02 2021-03-23 深圳市大疆创新科技有限公司 Video processing method and device
CN113395533A (en) * 2021-05-24 2021-09-14 广州博冠信息科技有限公司 Virtual gift special effect display method and device, computer equipment and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109040766A (en) * 2018-08-27 2018-12-18 百度在线网络技术(北京)有限公司 live video processing method, device and storage medium
CN110475150A (en) * 2019-09-11 2019-11-19 广州华多网络科技有限公司 The rendering method and device of virtual present special efficacy, live broadcast system
WO2021047420A1 (en) * 2019-09-11 2021-03-18 广州华多网络科技有限公司 Virtual gift special effect rendering method and apparatus, and live streaming system
CN110557649A (en) * 2019-09-12 2019-12-10 广州华多网络科技有限公司 Live broadcast interaction method, live broadcast system, electronic equipment and storage medium
CN112544070A (en) * 2020-03-02 2021-03-23 深圳市大疆创新科技有限公司 Video processing method and device
CN113395533A (en) * 2021-05-24 2021-09-14 广州博冠信息科技有限公司 Virtual gift special effect display method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN114025219A (en) 2022-02-08

Similar Documents

Publication Publication Date Title
CN114025219B (en) Rendering method, device, medium and equipment for augmented reality special effects
US11538229B2 (en) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN109345556B (en) Neural network foreground separation for mixed reality
CN110503703B (en) Method and apparatus for generating image
CN110557625A (en) live virtual image broadcasting method, terminal, computer equipment and storage medium
CN111464833B (en) Target image generation method, target image generation device, medium and electronic device
CN111294665B (en) Video generation method and device, electronic equipment and readable storage medium
CN108010037B (en) Image processing method, device and storage medium
CN113099298B (en) Method and device for changing virtual image and terminal equipment
CN110047119B (en) Animation generation method and device comprising dynamic background and electronic equipment
CN113946402A (en) Cloud mobile phone acceleration method, system, equipment and storage medium based on rendering separation
CN109271929B (en) Detection method and device
US11846783B2 (en) Information processing apparatus, information processing method, and program
US9628672B2 (en) Content processing apparatus, content processing method, and storage medium
CN112714337A (en) Video processing method and device, electronic equipment and storage medium
CN116363245A (en) Virtual face generation method, virtual face live broadcast method and device
KR102561903B1 (en) AI-based XR content service method using cloud server
CN110197459A (en) Image stylization generation method, device and electronic equipment
CN113269782B (en) Data generation method and device and electronic equipment
CN112508772B (en) Image generation method, device and storage medium
CN113408452A (en) Expression redirection training method and device, electronic equipment and readable storage medium
CN113486787A (en) Face driving and live broadcasting method and device, computer equipment and storage medium
CN108805951B (en) Projection image processing method, device, terminal and storage medium
CN110662099B (en) Method and device for displaying bullet screen
CN116527956B (en) Virtual object live broadcast method, device and system based on target event triggering

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant