WO2020221186A1 - Virtual image control method and apparatus, electronic device, and storage medium - Google Patents


Info

Publication number
WO2020221186A1
WO2020221186A1, PCT/CN2020/087139, CN2020087139W
Authority
WO
WIPO (PCT)
Prior art keywords
virtual machine
avatar
host
control instruction
machine position
Prior art date
Application number
PCT/CN2020/087139
Other languages
English (en)
French (fr)
Inventor
徐子豪
吴施祈
Original Assignee
广州虎牙信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201910358384.7A external-priority patent/CN110087121B/zh
Priority claimed from CN201910358491.XA external-priority patent/CN110119700B/zh
Application filed by 广州虎牙信息科技有限公司 filed Critical 广州虎牙信息科技有限公司
Priority to SG11202111640RA priority Critical patent/SG11202111640RA/en
Priority to US17/605,476 priority patent/US20220214797A1/en
Publication of WO2020221186A1 publication Critical patent/WO2020221186A1/zh


Classifications

    • H04N 21/2187: Live feed (servers specifically adapted for the distribution of live content)
    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/04815: Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
    • G06T 7/75: Determining position or orientation of objects or cameras using feature-based methods involving models
    • H04N 21/21805: Source of audio or video content enabling multiple viewpoints, e.g. using a plurality of cameras
    • H04N 21/23418: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • G06T 2207/30244: Camera pose (indexing scheme for image analysis)

Definitions

  • This application relates to the field of live broadcast technology and specifically provides a virtual image (avatar) control method, apparatus, electronic device, and storage medium.
  • In the live screen, a virtual image (avatar) can be used in place of the actual image of the host.
  • However, the control accuracy of the avatar is generally low, which makes live broadcast schemes that incorporate an avatar insufficiently engaging.
  • The purpose of this application is to provide an avatar control method, apparatus, electronic device, and storage medium that can display the avatar from different camera positions, thereby creating the effect of a stage performance and improving the user experience when an avatar is used in a live broadcast.
  • An embodiment of the application provides a method for controlling an avatar, the method including:
  • the avatar is controlled according to the virtual machine position (i.e., virtual camera position) control instruction and the action control instruction.
  • the step of determining whether to obtain a virtual machine position control instruction corresponding to the host includes:
  • the step of judging whether a virtual machine position control instruction corresponding to the host sent by the live broadcast receiving end is received includes:
  • upon receiving the virtual machine position operation instruction sent by the live broadcast receiving end, determining whether the virtual machine position operation instruction meets a first preset condition, where the first preset condition is determined based on the user history data corresponding to the live broadcast receiving end;
  • the step of determining whether to obtain a virtual machine position control instruction corresponding to the host includes:
  • the step of determining whether to obtain the virtual machine position control instruction generated based on the information corresponding to the host includes:
  • the step of determining whether to obtain a virtual machine position control instruction generated based on operation information corresponding to the host includes:
  • the first preset information includes keyword information and/or melody feature information.
  • the step of determining whether to obtain a virtual machine position control instruction generated based on the information corresponding to the host includes:
  • the step of judging whether to obtain a virtual machine position control instruction generated based on information corresponding to the host based on a result obtained by analyzing the host video frame includes:
  • the second preset information includes action information, depth information, identification object information, and/or identification color information.
  • the step of determining whether to obtain a virtual machine position control instruction corresponding to the host includes:
  • the step of controlling the avatar according to the virtual machine position control instruction and the action control instruction includes:
  • the display size and/or display angle of the virtual image in the live screen are controlled according to the virtual camera position control instruction.
  • the virtual machine position operation instruction includes an angle parameter
  • the step of controlling the display size and/or display angle of the avatar in the live screen according to the virtual machine position control instruction includes:
  • Control the live screen to stop displaying the host video frame, and obtain the partial three-dimensional view-angle data corresponding to the angle parameter from the three-dimensional image data constructed in advance for the avatar.
  • the virtual machine position operation instruction includes an angle parameter
  • the step of controlling the display size and/or display angle of the avatar in the live screen according to the virtual machine position control instruction includes:
  • Control the live screen to stop displaying the host video frame, adjust the three-dimensional image data constructed in advance for the avatar according to the host video frame, and obtain the partial three-dimensional view-angle data corresponding to the angle parameter from the adjusted three-dimensional image data.
  • the step of adjusting the three-dimensional image data constructed in advance for the avatar according to the host video frame includes:
  • the three-dimensional image data constructed in advance for the virtual image is adjusted according to the coordinate information.
  • the virtual machine position operation instruction includes a zoom parameter
  • the step of controlling the display size and/or display angle of the avatar in the live screen according to the virtual machine position control instruction includes:
  • the display size of the avatar in the live screen is determined according to the zoom parameter and the initial size of the avatar.
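As an illustrative sketch (not part of the claimed method), the zoom-parameter step above can be read as a simple multiplicative scaling of the avatar's initial size; the function name and the (width, height) representation are assumptions:

```python
def display_size(initial_size, zoom):
    """Scale the avatar's initial (width, height) in the live screen
    by the zoom parameter, as the claim describes."""
    width, height = initial_size
    return (width * zoom, height * zoom)
```

For example, with an initial size of (100, 200) and a zoom parameter of 2.0, the avatar would be displayed at (200.0, 400.0).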
  • the amount of data used when displaying the avatar at a given display angle is determined according to the number of display times corresponding to that display angle.
  • the step of analyzing the host video frame sent by the live broadcast initiator to generate an action control instruction includes:
  • Extract the current video frame from the host video frames sent by the live broadcast initiator every preset period, perform image analysis on the current video frame, and generate an action control instruction according to the image analysis result of the current video frame.
  • An embodiment of the application also provides a virtual image control device, the device includes:
  • the control instruction generation module is configured to analyze the host video frame sent by the live broadcast initiator to generate an action control instruction, where the host video frame is obtained by the live broadcast initiator shooting the host, and the action control instruction is configured to control the avatar in the live screen of the live broadcast receiving end;
  • a control instruction judgment module configured to judge whether a virtual machine position control instruction corresponding to the host is obtained;
  • the avatar control module is configured to control the avatar according to the virtual machine position control instruction and the action control instruction when the virtual machine position control instruction is obtained.
  • An embodiment of the present application also provides an electronic device including a memory, a processor, and a computer program stored in the memory and capable of running on the processor; the computer program, when run by the processor, implements the aforementioned avatar control method.
  • the embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed, the aforementioned avatar control method is implemented.
  • FIG. 1 is a schematic block diagram of an electronic device provided by an embodiment of the application.
  • FIG. 2 is a schematic flowchart of a method for controlling an avatar provided by an embodiment of the application.
  • Fig. 3 is a system block diagram of a live broadcast system provided by an embodiment of the application.
  • FIG. 4 is a schematic diagram of the effect of controlling an avatar based on zoom parameters provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of another effect of controlling an avatar based on zoom parameters according to an embodiment of the application.
  • FIG. 6 is a schematic diagram of the effect of controlling an avatar based on an angle parameter provided by an embodiment of the application.
  • FIG. 7 is a schematic diagram of controlling an avatar based on feature points according to an embodiment of the application.
  • FIG. 8 is a schematic diagram of the correspondence between the number of feature points and the number of display times provided by an embodiment of the application.
  • FIG. 9 is a block diagram of the functional modules included in the avatar control device provided by an embodiment of the application.
  • Reference numerals: 100 - electronic device; 102 - memory; 104 - processor; 106 - avatar control device; 106a - control instruction generation module; 106b - control instruction judgment module; 106c - avatar control module.
  • an embodiment of the present application provides an electronic device 100.
  • the electronic device 100 may be used as a live broadcast device.
  • the electronic device 100 may be a background server that communicates with a terminal device used by the host during the live broadcast.
  • the electronic device 100 may include a memory 102, a processor 104, and an avatar control device 106.
  • the memory 102 and the processor 104 may be directly or indirectly electrically connected to implement data transmission or interaction.
  • the memory 102 and the processor 104 may be electrically connected through one or more communication buses or signal lines.
  • the avatar control device 106 may include at least one software function module that may be stored in the memory 102 in the form of software or firmware.
  • the processor 104 may be configured to execute executable computer programs stored in the memory 102, for example, the software function modules and computer programs included in the avatar control device 106, so as to achieve high-precision control of the avatar in the live screen.
  • the memory 102 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), etc.
  • the processor 104 may be an integrated circuit chip with signal processing capability.
  • the aforementioned processor 104 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a system on chip (SoC), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the structure shown in FIG. 1 is only for illustration; the electronic device 100 may include more or fewer components than shown in FIG. 1, or have a different configuration, for example, it may also include a communication unit configured to exchange information with other live broadcast equipment (such as the terminal devices used by the host and by the audience).
  • an embodiment of the present application also provides a virtual image control method applicable to the above-mentioned electronic device 100.
  • the method steps defined in the flow related to the avatar control method can be implemented by the electronic device 100; the specific flow shown in FIG. 2 is described in detail below.
  • Step 201 Analyze the host video frame sent by the live broadcast initiator to generate an action control instruction.
  • the live broadcast initiating end may shoot the host who is currently live streaming to obtain the host video frame corresponding to the host, and send the host video frame to the electronic device 100.
  • the electronic device 100 can receive the host video frame sent by the live broadcast initiator, perform analysis processing (such as image analysis) on the host video frame, and generate an action control instruction based on the analysis result.
  • the action control instruction may be configured to control the avatar in the live screen of the live broadcast receiving end.
  • Step 203 Determine whether a virtual machine position control instruction corresponding to the host is obtained.
  • after the electronic device 100 generates the action control instruction in step 201, it can also determine whether a virtual machine position control instruction corresponding to the host is obtained; when it determines that the virtual machine position control instruction is obtained, step 205 may be executed.
  • Step 205 Control the avatar according to the virtual machine position control instruction and the action control instruction.
  • when the electronic device 100 determines through step 203 that the virtual machine position control instruction corresponding to the host is obtained, it may control the avatar based on both the virtual machine position control instruction and the action control instruction; in other words, the electronic device 100 controls the avatar with the action control instruction combined with the virtual machine position control instruction, thereby improving the accuracy of control.
  • the display of the avatar can also present the state of different camera positions, thereby creating a stage performance effect in the live broadcast room, making the live broadcast more expressive, increasing the fun of the avatar display, and enhancing the user experience.
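The combination of the two instruction types in steps 201-205 can be sketched as follows; this is an illustrative reading, not the claimed implementation, and the state dictionary with `pose`, `zoom`, and `angle` keys is an assumed representation:

```python
def control_avatar(avatar_state, action_instruction, camera_instruction=None):
    """Apply the action control instruction and, when one was obtained,
    the virtual camera position control instruction to the avatar state."""
    state = dict(avatar_state)
    # The action instruction, derived from the host video frame (step 201),
    # always drives the avatar's pose.
    state["pose"] = action_instruction["pose"]
    # A camera position instruction, when obtained (step 203), additionally
    # changes the avatar's display size and/or display angle (step 205).
    if camera_instruction is not None:
        state["zoom"] = camera_instruction.get("zoom", state["zoom"])
        state["angle"] = camera_instruction.get("angle", state["angle"])
    return state
```

When no camera instruction is obtained, the avatar is still animated by the action instruction alone, which matches the judgment branch in step 203.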
  • the embodiment of the present application does not limit the manner in which the electronic device 100 obtains the host video frame.
  • the electronic device 100 may be a back-end server, the back-end server is communicatively connected to a first terminal, and the first terminal may also be communicatively connected to an image acquisition device (such as a camera) .
  • the first terminal may be a terminal device used by the host during the live broadcast (such as a mobile phone, tablet, or computer), and the image acquisition device may be configured to collect images of the host during the live broadcast, so as to obtain the host video frame and send the host video frame to the background server through the first terminal.
  • the above-mentioned image acquisition device may be a separate device or integrated with the first terminal; for example, in some possible implementations, the image acquisition device may be the camera carried by a terminal device such as a mobile phone, tablet, or computer.
  • the embodiment of the present application does not limit the manner in which the electronic device 100 performs step 201 to analyze the host video frame.
  • the electronic device 100 may randomly extract video frames from the host video frame, and generate corresponding action control instructions based on the extracted video frames.
  • the electronic device may extract the current video frame from the host video frames sent by the live broadcast initiator every preset period, perform image analysis on the current video frame, and generate the action control instruction based on the image analysis result of the current video frame.
  • the electronic device 100 may extract one video frame from the host video frames (that is, the current host video frame) every preset period; then perform image analysis processing (such as feature extraction) on the extracted video frame; finally, the corresponding action control instruction can be generated based on the result of the analysis processing.
  • since the electronic device extracts video frames at a certain period, controlling the avatar's actions based on the action control instructions generated from the extracted video frames can largely reflect the host's actual actions, while also reducing the amount of data processing, easing the load on the corresponding processor, and improving the real-time performance of the live broadcast.
  • the embodiment of the present application does not limit the execution strategy of the foregoing preset period.
  • the preset period may be a preset duration (such as 0.1 s, 0.2 s, or 0.3 s), that is, a video frame extraction operation is performed after each interval of the preset duration to obtain one video frame; it may also be a preset number of frames (such as 1, 2, or 3 frames), that is, a video frame extraction operation is performed after each interval of the preset number of frames to obtain one video frame.
  • the electronic device 100 may also perform image analysis on every host video frame sent by the live broadcast initiator and generate action control instructions according to the image analysis result of each host video frame.
  • the electronic device 100 can extract every host video frame; then perform image analysis processing (such as feature extraction) on each extracted host video frame; finally, corresponding action control instructions can be generated based on the image analysis result of each host video frame.
  • since the electronic device generates a corresponding action control instruction from every host video frame, when the avatar is controlled based on these action control instructions its actions can fully reflect the host's real actions, so the avatar appears more lifelike and the transitions between actions are smoother, improving the viewer's viewing experience.
  • a trained neural network can be used to recognize the host video frame to obtain the host's action posture in the host video frame, and an action control instruction is generated based on this action posture.
  • when the electronic device 100 performs step 203, it can determine whether a virtual machine position control instruction corresponding to the host sent by the live broadcast receiving end has been received; that is, the live broadcast receiving end may directly send the virtual camera position control instruction to the electronic device 100, so that the electronic device 100 executes step 205 according to the received instruction, allowing the live broadcast receiving end to control the avatar corresponding to the host.
  • the electronic device 100 may determine, according to the first preset condition determined based on the user history data corresponding to the live broadcast receiving end, whether the virtual machine position operation instruction meets the first preset condition; if it does, the electronic device 100 may determine that the virtual machine position operation instruction is obtained.
  • the electronic device 100 first detects whether a virtual machine position operation instruction sent by the live broadcast receiving end is received; then, when such an instruction is received, it determines whether the instruction meets the first preset condition determined from the user history data; finally, only when the instruction meets the first preset condition is the instruction determined to be obtained.
  • the specific content of the aforementioned user history data may include, but is not limited to, the user's level, the duration of watching the live broadcast, the number of barrage (bullet-screen) comments sent, the number or value of gifts given, etc.
  • when determining whether a virtual machine position operation instruction is obtained, the electronic device 100 can also make finer-grained determinations based on user history data; for example, different user history data may determine different types of obtainable virtual machine position operation instructions.
  • taking the user level as an example of user history data, assume the virtual machine position operation instructions include five types: the first, second, third, fourth, and fifth operation instructions. If the user's level falls in the interval [0, 5], only a received first operation instruction is determined to be obtained; if the user's level falls in (5, 10], only a received first or second operation instruction is determined to be obtained; and so on; if the user's level falls in (20, +∞), any one of the five virtual machine position operation instructions, when received, is determined to be obtained.
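The level-based gating in this example can be sketched as below. The intermediate bounds 15 and 20 are assumptions (the text only gives the first two intervals and the last), and all names are illustrative:

```python
# Hypothetical tier thresholds: levels in [0, 5] unlock only instruction
# type 1, (5, 10] types 1-2, and (20, +inf) all five types.
TIER_BOUNDS = [5, 10, 15, 20]

def allowed_instruction_types(user_level):
    """Return the set of virtual camera operation instruction types (1..5)
    that a viewer of this level may trigger."""
    unlocked = 1 + sum(1 for bound in TIER_BOUNDS if user_level > bound)
    return set(range(1, unlocked + 1))

def instruction_obtained(user_level, instruction_type):
    """First preset condition: a received instruction counts as obtained
    only when the viewer's level permits that instruction type."""
    return instruction_type in allowed_instruction_types(user_level)
```

For instance, a level-7 viewer may trigger instruction types 1 and 2 but not type 3, matching the interval (5, 10] in the example.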
  • when the electronic device 100 performs step 203, it may also use methods such as information extraction to determine whether a virtual machine position control instruction generated based on the information corresponding to the host is obtained; that is, the live broadcast initiating end may not directly send the virtual machine position control instruction to the electronic device 100; instead, the electronic device extracts the information corresponding to the host and generates the instruction from it, thereby executing step 205 according to the generated virtual machine position control instruction.
  • the method of determining may also be different according to the generation method of the virtual machine position control instruction.
  • the virtual machine position control instruction may be generated based on operation information corresponding to the host.
  • the aforementioned first terminal may generate a corresponding virtual machine position control instruction in response to an operation of the anchor, and send the virtual machine position control instruction to the aforementioned background server.
  • the background server may determine to obtain the virtual machine position control instruction when receiving the virtual machine position control instruction.
  • the manner in which the host operates the first terminal is not limited, and may include, but is not limited to, operations on input devices of the first terminal, such as buttons (physical buttons or virtual on-screen buttons), the keyboard, the mouse, or the microphone.
  • the host can input a text message through the keyboard or a voice message through the microphone (such as "enlarge 2 times" or "show back"), or some simple numbers or words, such as "1".
  • when the electronic device 100 receives the voice information generated based on the operation information corresponding to the host (the host operating the first terminal through the microphone), it can determine whether the voice information contains the first preset information, and when it does, determine that a virtual machine position control instruction generated based on the operation information corresponding to the host is obtained.
  • the aforementioned first preset information may be keyword information or other information.
  • when the voice information is a song (for example, played by a device or sung by the host), the aforementioned first preset information may also be melody feature information. That is, the electronic device 100 can use a trained neural network to recognize the melody features of the voice information sent by the first terminal and determine the virtual machine position control instruction according to the recognized melody features. For example, during a soft melody the electronic device 100 can generate a control instruction that pulls the camera back; during the climax or chorus, it can generate a control instruction that zooms in on the face.
  • the virtual machine position control instruction may also be generated by the electronic device 100 based on the result obtained by analyzing the host video frame during step 201.
  • the electronic device 100 may also determine whether to obtain the virtual machine position control instruction generated based on the information corresponding to the host based on the result of analyzing the host video frame sent by the live broadcast initiator.
  • the electronic device 100 may extract information from the host video frame to determine whether the obtained image information contains the second preset information; when it does, the electronic device 100 may generate a corresponding virtual machine position control instruction based on the second preset information and determine that the virtual machine position control instruction is obtained.
  • the embodiment of the present application does not limit the specific content of the aforementioned second preset information.
  • the second preset information may include, but is not limited to, action information, depth information, or other information.
  • the aforementioned second preset information may be action information.
  • the electronic device 100 can generate corresponding virtual machine position control instructions based on specific actions of the host; for example, when the host extends the left hand, a control instruction to display the left side of the avatar can be generated; when the host extends the right hand, a control instruction to display the right side of the avatar; when the host touches the left hand with the right hand, a control instruction to display the back of the avatar; and when the host squats, a control instruction to display the top of the avatar's head.
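The action-to-camera examples above amount to a lookup table. In the sketch below, the action labels are assumed outputs of a pose recognizer run on the host video frame, and the instruction names are illustrative:

```python
# Mapping of recognized host actions to virtual camera position
# instructions, following the examples given in the text.
ACTION_TO_CAMERA = {
    "extend_left_hand": "show_left_side",
    "extend_right_hand": "show_right_side",
    "right_hand_touches_left_hand": "show_back",
    "squat": "show_top_of_head",
}

def camera_instruction_for(action_label):
    """Map a recognized host action to a camera position instruction;
    returns None when no rule matches (no instruction is obtained)."""
    return ACTION_TO_CAMERA.get(action_label)
```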
  • the above-mentioned other information may be information such as an identification object or an identification color.
  • the anchor can carry an identification object or wear clothing or accessories with an identification color, so that when the electronic device 100 performs step 203, it can obtain a virtual machine position control instruction by identifying the identification object or the identification color.
  • a control instruction that gradually moves the camera position closer can be generated. That is, the host may carry marker objects of different sizes on different parts of the body, or wear clothing or accessories of different colors, so that the host presents different cues at different times.
  • when performing step 203, the electronic device 100 can then, according to the recognized marker objects or marker colors, control the avatar to present a stage effect that transitions from a long shot to a close-up or from a close-up to a long shot.
  • when performing step 203, the electronic device 100 may also determine whether a virtual camera position control instruction has been obtained based on the historical live broadcast data corresponding to the host.
  • the electronic device 100 may also determine whether the virtual camera position control instruction meets a second preset condition, where the second preset condition is determined based on the user history data corresponding to the host. Only when the virtual camera position control instruction meets the second preset condition is it determined that the instruction has been obtained.
  • the historical live broadcast data corresponding to the host may be the host's level: the higher the level, the more virtual camera position control instructions can be obtained. For example, if the host's level is below level 5, it may be determined that no virtual camera position control instruction can be obtained; if the host's level is from level 5 to level 10 inclusive, it may be determined that some virtual camera position control instructions can be obtained; and if the host's level is above level 10, it may be determined that any virtual camera position control instruction can be obtained.
  • in the example above, whether a virtual camera position control instruction is obtained is determined according to level ranges. In some other examples, a different set of obtainable virtual camera position control instructions may be defined for each individual level.
  • the historical live broadcast data corresponding to the host may also include the number or value of gifts received by the host during live broadcasts, the volume of bullet comments from the audience, and the maximum number of viewers during live broadcasts. For example, the greater the number or value of gifts received, the larger the bullet comment volume, or the larger the maximum audience, the more virtual camera position control instructions may be obtained.
  • on the one hand, when it is determined that the virtual camera position control instruction is obtained, step 205 may be executed.
  • on the other hand, when it is determined that the instruction is not obtained, the specific handling is not limited; for example, in a possible implementation, the electronic device 100 can control the avatar according to the action control instruction alone.
  • that is, during the host's live broadcast, if the electronic device 100 obtains the virtual camera position control instruction, the avatar is controlled according to both the virtual camera position control instruction and the action control instruction; if the electronic device 100 does not obtain the instruction, the avatar is controlled only according to the action control instruction.
  • the embodiments of the present application do not limit the manner in which the electronic device 100 executes step 205; it can be selected according to actual application requirements, such as the performance of the processor 104 and the required control accuracy of the avatar.
  • in a possible implementation, the electronic device 100 performs step 205 as follows: it controls the display posture of the avatar in the live screen according to the action control instruction, and controls the display size of the avatar in the live screen, the display angle of the avatar in the live screen, or both, according to the virtual camera position control instruction.
  • in other words, the electronic device 100 controls the display posture of the avatar according to the action control instruction; in addition, on the basis of that posture control, the electronic device 100 can also use the obtained virtual camera position control instruction to control the display size of the avatar in that posture, the display angle of the avatar in that posture, or both.
  • for example, if the host is currently dancing, the electronic device 100 may control the avatar to dance based on the action control instruction.
  • if the electronic device 100 also obtains a virtual camera position control instruction, it can control the display size of the dancing avatar, the display angle of the dancing avatar, or both, according to that instruction.
  • the display postures mentioned above may include, but are not limited to, actions such as kicking, clapping, bending over, shrugging the shoulders, and shaking the head, as well as expressions such as frowning, laughing, smiling, and glaring.
  • the manner of controlling the avatar in the embodiments of the present application is likewise not limited.
  • in a possible implementation, the electronic device 100 may control the avatar based on predetermined feature points.
  • to improve the user experience, the electronic device 100 may also control the avatar according to the information carried by a virtual camera position operation instruction.
  • that is, a user can perform different operations on the live broadcast receiver, so that the live broadcast receiver generates virtual camera position operation instructions carrying different information based on those operations.
  • the manner in which the user operates the live broadcast receiver is not limited.
  • the operations may include the user's input on devices such as a touch screen, a mouse, a keyboard, or a microphone.
  • the embodiments of the present application do not limit the information carried in the virtual camera position operation instruction; it can be selected according to actual application requirements.
  • for example, the virtual camera position operation instruction may include a scaling parameter. When the electronic device 100 performs step 205, the display size of the avatar in the live screen of the live broadcast receiver can be controlled according to the scaling parameter and the initial size of the avatar in the host video frame.
  • depending on the form of the scaling parameter, the electronic device 100 may control the display size of the avatar in different ways.
  • in one possible implementation, the electronic device 100 controls the avatar to be enlarged by a specific factor (such as 2x, 3x, or 5x) or reduced by a specific factor (such as 0.2x, 0.5x, or 0.8x) relative to its initial size.
  • in another possible implementation, the electronic device 100 can control the avatar to be enlarged or reduced by different factors relative to its initial size according to the specific value of the scaling parameter in the virtual camera position operation instruction.
  • for example, as shown in Figure 4, when the scaling parameter is 2, the avatar can be enlarged to 2 times its initial size; as shown in Figure 5, when the scaling parameter is 0.5, the avatar can be reduced to 0.5 times its initial size.
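  • the scaling behavior described above can be sketched as follows; this is a minimal illustration, and the function name and the width/height tuple representation are assumptions, not part of the original method.

```python
def scaled_display_size(initial_size, zoom):
    """Compute the avatar's display size from its initial size in the
    host video frame and the scaling parameter carried by the virtual
    camera position operation instruction (hypothetical helper)."""
    if zoom <= 0:
        raise ValueError("scaling parameter must be positive")
    w, h = initial_size
    # Uniform scaling: zoom=2 doubles the avatar, zoom=0.5 halves it.
    return (w * zoom, h * zoom)
```

  • for instance, an avatar rendered at 200x400 pixels with a scaling parameter of 2 would be displayed at 400x800 pixels.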
  • the virtual camera position operation instruction may also include an angle parameter.
  • when the electronic device 100 performs step 205, it can control the display angle of the avatar in the live screen of the live broadcast receiver according to the angle parameter.
  • depending on the form of the angle parameter, the electronic device 100 may likewise control the display angle of the avatar in different ways.
  • in one possible implementation, the electronic device 100 can control the avatar to be displayed from a specific angle (such as the back, left side, or right side).
  • in another possible implementation, the electronic device 100 may control the avatar to be displayed at the angle corresponding to the specific value of the angle parameter in the virtual camera position operation instruction. As shown in Figure 6, when the angle parameter is 180°, the electronic device 100 can display the back of the avatar; when the angle parameter is 90°, the left side of the avatar; and when the angle parameter is 270°, the right side of the avatar.
  • the operations performed by the electronic device 100 when controlling the avatar based on the angle parameter may also differ between implementations.
  • in one possible implementation, when controlling the avatar according to the angle parameter, the electronic device 100 can control the live screen of the live broadcast receiver to stop displaying the host video frame, and obtain the partial 3D view data corresponding to the angle parameter from the 3D image data constructed in advance for the avatar.
  • that is, after the user performs the corresponding operation, the live broadcast receiver generates a corresponding virtual camera position operation instruction and sends it to the electronic device 100 (the back-end server).
  • based on the virtual camera position operation instruction, the electronic device 100 can stop sending the host video frame to the live broadcast receiver, so that the live screen of the live broadcast receiver stops displaying the host video frame.
  • the electronic device 100 may then obtain, according to the angle parameter in the virtual camera position operation instruction, the corresponding partial 3D view data from the 3D image data constructed in advance for the avatar. For example, if the angle parameter is 90°, the electronic device 100 can obtain the partial 3D view data corresponding to the left side; if the angle parameter is 180°, the partial 3D view data corresponding to the back. Finally, the electronic device 100 sends the obtained partial 3D view data to the live broadcast receiver for rendering, thereby completing the control of the avatar. In this way, the partial 3D view data corresponding to the angle parameter can be obtained quickly, the amount of data processing is small, and the live broadcast can remain highly real-time.
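  • selecting the precomputed view slice for a requested angle can be sketched as follows; the dictionary of view slices, the nearest-angle lookup, and all names are illustrative assumptions about one way such a lookup could work.

```python
# Hypothetical precomputed partial 3D view slices keyed by display angle.
VIEW_DATA = {
    0: "front_view_data",
    90: "left_view_data",
    180: "back_view_data",
    270: "right_view_data",
}

def partial_view_for_angle(angle):
    """Return the precomputed partial 3D view data whose stored angle is
    closest to the requested angle parameter (measured circularly)."""
    angle = angle % 360

    def circular_distance(stored):
        d = abs(stored - angle) % 360
        return min(d, 360 - d)

    # Pick the stored slice with the smallest circular distance.
    best = min(VIEW_DATA, key=circular_distance)
    return VIEW_DATA[best]
```

  • because only a precomputed slice is looked up, the per-request cost is a constant-time scan over a handful of stored angles, which matches the real-time motivation described above.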
  • in another possible implementation, when controlling the avatar according to the angle parameter, the electronic device 100 can control the live screen of the live broadcast receiver to stop displaying the host video frame, adjust the 3D image data constructed for the avatar according to the host video frame, and obtain the partial 3D view data corresponding to the angle parameter from the adjusted 3D image data.
  • that is, after the user performs the corresponding operation, the live broadcast receiver generates a corresponding virtual camera position operation instruction and sends it to the electronic device 100 (the back-end server).
  • based on the virtual camera position operation instruction, the electronic device 100 may stop sending the host video frame to the live broadcast receiver, so that the live screen of the live broadcast receiver stops displaying the host video frame.
  • the electronic device 100 may then adjust the 3D image data constructed in advance for the avatar according to the host video frame to obtain new 3D image data, and obtain the partial 3D view data corresponding to the angle parameter from the new 3D image data. For example, if the angle parameter is 90°, the electronic device 100 can obtain the partial 3D view data corresponding to the left side of the new 3D image data; if the angle parameter is 180°, the partial 3D view data corresponding to the back. Finally, the electronic device 100 may send the obtained partial 3D view data to the live broadcast receiver for rendering, thereby completing the control of the avatar. In this way, the obtained partial 3D view data better reflects the host's actual movements, so the avatar appears more lifelike when displayed from different angles, improving the user experience.
  • the embodiments of the present application do not limit the manner in which the electronic device 100 adjusts the 3D image data according to the host video frame.
  • in a possible implementation, the electronic device 100 can adjust the 3D image data as follows: obtain the coordinate information of the target feature points in the host video frame, calculate the coordinate information of the avatar's other feature points based on that coordinate information, and then adjust the 3D image data constructed in advance for the avatar according to the calculated coordinate information.
  • that is, the electronic device 100 can obtain the coordinate information (3D coordinates with depth information) of each target feature point in the host video frame (the feature points on the front of the avatar, such as those corresponding to the eyes, nose, mouth, and ears); the electronic device 100 can then calculate, based on the obtained coordinate information, the coordinate information of the avatar's other feature points (feature points in the avatar's 3D model other than the target feature points, such as those visible only from the back); finally, the electronic device 100 can adjust the portion of the pre-built 3D image data corresponding to those other feature points to obtain the new 3D image data.
  • the algorithm for calculating the coordinate information of the other feature points from the coordinate information of the target feature points may be an inverse kinematics algorithm.
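  • the document names an inverse kinematics algorithm but does not specify it. As a deliberately simplified stand-in for that step, the sketch below estimates an unseen (back) feature point by mirroring the corresponding front point through an assumed body mid-plane; the mid-plane depth, point format, and function name are all assumptions, not the actual algorithm.

```python
def estimate_back_point(front_point, midplane_z):
    """Toy stand-in for the inverse-kinematics step: estimate a feature
    point visible only from the back by reflecting the corresponding
    front point (x, y, z) through the plane z = midplane_z."""
    x, y, z = front_point
    # Reflection through z = midplane_z keeps x and y, flips depth.
    return (x, y, 2 * midplane_z - z)
```

  • a real implementation would instead solve for the hidden joints of the avatar's skeleton subject to the constraints imposed by the observed target feature points.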
  • before the user performs the operation, the video frames played by the live broadcast receiver are already data adjusted based on the partial 3D view data of the 3D image data (the front portion); that is, the adjustment of the data corresponding to the target feature points has already been completed. Therefore, in the above solution provided by the embodiments of the present application, only the portion of the data corresponding to the other feature points needs to be adjusted.
  • the amount of data used when displaying the avatar can also be controlled according to actual application requirements. For example, when real-time requirements are high, less data can be used; when higher control accuracy is required, more data can be used.
  • in a possible implementation, in order to maintain high control accuracy where it matters for the user experience while keeping the data processing load low, so that the avatar live broadcast remains responsive, the electronic device 100 may determine the amount of data used when displaying the avatar through the following steps: obtain the number of times the avatar has been displayed at the live broadcast receiver from each display angle, and determine the amount of data used when displaying the avatar from a given display angle according to the display count corresponding to that angle.
  • that is, the electronic device 100 can obtain the number of times the avatar has been displayed at each display angle, either over all live broadcasts or over a recent period. For example, suppose that in the most recent month the 90° display angle (left side) corresponds to 3,000 displays, the 180° display angle (back) to 7,000 displays, and the 270° display angle (right side) to 2,000 displays.
  • the electronic device 100 can then determine the amount of data for each display angle based on the obtained display counts: the greater the display count, the greater the amount of data used when displaying from that angle. In the example above, since the 180° display angle (back) has the largest display count (7,000), the electronic device 100 uses the largest amount of data when displaying from that angle; since the 270° display angle (right side) has the smallest display count (2,000), the amount of data used for that angle is the smallest.
  • the amount of data mentioned above may refer to the number of feature points.
  • that is, the number of feature points used when displaying the avatar from a given display angle can be determined according to the display count corresponding to each display angle (as shown in FIG. 7).
  • for example, when the display angle is 180° (back) with 7,000 displays, the number of feature points used may be 300; when the display angle is 90° (left side) with 3,000 displays, the number of feature points may be 200; and when the display angle is 270° (right side) with 2,000 displays, the number of feature points may be 150.
  • in a possible implementation, the electronic device 100 may establish the correspondence between the number of feature points and the display count in advance, so that after the display count is obtained, the number of feature points can be looked up directly from the correspondence.
  • the correspondence may be: the greater the display count, the greater the corresponding number of feature points.
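  • the correspondence between display counts and feature-point counts can be sketched as follows; the tier thresholds simply reproduce the worked example above (2,000 → 150, 3,000 → 200, 7,000 → 300), and the floor value for rarely requested angles is an assumption.

```python
def feature_point_count(display_count):
    """Map an angle's display count to the number of feature points used
    when rendering the avatar from that angle: more frequently viewed
    angles get more feature points (higher detail)."""
    tiers = [(7000, 300), (3000, 200), (2000, 150)]
    for min_count, points in tiers:
        if display_count >= min_count:
            return points
    return 100  # assumed floor for rarely requested angles

# Per-angle display counts from the example above.
counts = {90: 3000, 180: 7000, 270: 2000}
detail = {angle: feature_point_count(n) for angle, n in counts.items()}
```

  • this keeps the rendering budget concentrated on the angles viewers actually request, trading detail on rare angles for lower overall data processing.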
  • as shown in FIG. 9, an embodiment of the present application also provides an avatar control apparatus 106 that can be applied to the above electronic device 100.
  • the avatar control apparatus 106 may include a control instruction generation module 106a, a control instruction judgment module 106b, and an avatar control module 106c.
  • the control instruction generation module 106a can be configured to analyze the host video frame sent by the live broadcast initiator and generate an action control instruction, where the host video frame is obtained by the live broadcast initiator filming the host, and the action control instruction is configured to control the avatar in the live screen of the live broadcast receiver.
  • in one embodiment, the control instruction generation module 106a can execute step 201 shown in FIG. 2; for related content of the control instruction generation module 106a, refer to the foregoing description of step 201 in the embodiments of the present application.
  • the control instruction judgment module 106b can be configured to determine whether the virtual camera position control instruction corresponding to the host has been obtained. In one embodiment, the control instruction judgment module 106b can execute step 203 shown in FIG. 2; for related content of the control instruction judgment module 106b, refer to the foregoing description of step 203 in the embodiments of the present application.
  • the avatar control module 106c can be configured to, when the virtual camera position control instruction is obtained, control the avatar according to the virtual camera position control instruction and the action control instruction. In one embodiment, the avatar control module 106c can execute step 205 shown in FIG. 2; for related content of the avatar control module 106c, refer to the foregoing description of step 205 in the embodiments of the present application.
  • when the virtual camera position control instruction is not obtained, the avatar control module 106c may also be configured to control the avatar according to the action control instruction alone.
  • an embodiment of the present application also provides a computer-readable storage medium storing a computer program that, when run, executes the steps of the avatar control method described above.
  • in summary, the avatar control method, apparatus, electronic device, and storage medium provided by the present application control the avatar based on the host video frame sent by the live broadcast initiator. If the virtual camera position control instruction corresponding to the host is also obtained, the avatar can additionally be controlled in combination with that instruction so as to display the avatar from different camera positions, creating the effect of a stage performance, making the avatar display more engaging, and improving the user experience during avatar live broadcasts.


Abstract

Embodiments of the present application provide an avatar control method, apparatus, electronic device, and storage medium, relating to the field of live streaming technology. The avatar control method includes: analyzing a host video frame sent by a live broadcast initiator to generate an action control instruction, where the host video frame is obtained by the live broadcast initiator filming the host, and the action control instruction is configured to control an avatar in the live screen of a live broadcast receiver; determining whether a virtual camera position control instruction corresponding to the host is obtained; and if the virtual camera position control instruction is obtained, controlling the avatar according to both the virtual camera position control instruction and the action control instruction. In this way, the avatar can be displayed from different camera positions, creating the effect of a stage performance, making the avatar display more engaging and improving the user experience during avatar live streaming.

Description

Avatar control method and apparatus, electronic device, and storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 201910358491X, filed with the Chinese Patent Office on April 30, 2019 and entitled "Avatar Control Method, Avatar Control Apparatus and Electronic Device", and to Chinese patent application No. 2019103583847, filed with the Chinese Patent Office on April 30, 2019 and entitled "Avatar Display Method, Avatar Display Apparatus and Electronic Device", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of live streaming technology, and in particular provides an avatar control method and apparatus, an electronic device, and a storage medium.
Background
In scenarios such as online live streaming, to make the broadcast more engaging, an avatar may be displayed in the live screen in place of the host's actual appearance. However, in some common live streaming techniques the control accuracy of the avatar is generally low, so that live streaming with an avatar lacks appeal.
Summary
The purpose of this application is to provide an avatar control method and apparatus, an electronic device, and a storage medium capable of displaying the avatar from different camera positions, thereby creating the effect of a stage performance and improving the user experience of live streaming with an avatar.
To achieve at least one of the above purposes, this application adopts the following technical solutions:
An embodiment of this application provides an avatar control method, the method including:
analyzing a host video frame sent by a live broadcast initiator to generate an action control instruction, where the host video frame is obtained by the live broadcast initiator filming a host, and the action control instruction is configured to control an avatar in a live screen of a live broadcast receiver;
determining whether a virtual camera position control instruction corresponding to the host is obtained; and
if the virtual camera position control instruction is obtained, controlling the avatar according to the virtual camera position control instruction and the action control instruction.
Optionally, in a possible implementation, the step of determining whether the virtual camera position control instruction corresponding to the host is obtained includes:
determining whether a virtual camera position control instruction corresponding to the host sent by the live broadcast receiver is received.
Optionally, in a possible implementation, the step of determining whether the virtual camera position control instruction corresponding to the host sent by the live broadcast receiver is received includes:
upon receiving a virtual camera position operation instruction sent by the live broadcast receiver, determining whether the virtual camera position operation instruction meets a first preset condition, where the first preset condition is determined based on user history data corresponding to the live broadcast receiver; and
if the virtual camera position operation instruction meets the first preset condition, determining that the virtual camera position operation instruction is obtained.
Optionally, in a possible implementation, the step of determining whether the virtual camera position control instruction corresponding to the host is obtained includes:
determining whether a virtual camera position control instruction generated based on information corresponding to the host is obtained.
Optionally, in a possible implementation, the step of determining whether the virtual camera position control instruction generated based on the information corresponding to the host is obtained includes:
determining whether a virtual camera position control instruction generated based on operation information corresponding to the host is obtained.
Optionally, in a possible implementation, the step of determining whether the virtual camera position control instruction generated based on the operation information corresponding to the host is obtained includes:
upon receiving voice information generated based on the operation information corresponding to the host, determining whether the voice information contains first preset information, and if it does, determining that the virtual camera position control instruction generated based on the operation information corresponding to the host is obtained.
Optionally, in a possible implementation, the first preset information includes keyword information and/or melody feature information.
Optionally, in a possible implementation, the step of determining whether the virtual camera position control instruction generated based on the information corresponding to the host is obtained includes:
determining, based on a result of analyzing the host video frame, whether a virtual camera position control instruction generated based on the information corresponding to the host is obtained.
Optionally, in a possible implementation, the step of determining, based on the result of analyzing the host video frame, whether the virtual camera position control instruction generated based on the information corresponding to the host is obtained includes:
based on image information obtained by extracting information from the host video frame, determining whether the image information contains second preset information, and if it does, determining that the virtual camera position control instruction generated based on the information corresponding to the host is obtained.
Optionally, in a possible implementation, the second preset information includes action information, depth information, marker object information, and/or marker color information.
Optionally, in a possible implementation, the step of determining whether the virtual camera position control instruction corresponding to the host is obtained includes:
upon receiving a virtual camera position operation instruction sent by the live broadcast receiver, determining whether the virtual camera position operation instruction meets a second preset condition, where the second preset condition is determined based on user history data corresponding to the host; and
if the virtual camera position operation instruction meets the second preset condition, determining that the virtual camera position operation instruction is obtained.
Optionally, in a possible implementation, the step of controlling the avatar according to the virtual camera position control instruction and the action control instruction includes:
controlling a display posture of the avatar in the live screen according to the action control instruction; and
controlling a display size and/or a display angle of the avatar in the live screen according to the virtual camera position control instruction.
Optionally, in a possible implementation, the virtual camera position operation instruction includes an angle parameter; and
the step of controlling the display size and/or display angle of the avatar in the live screen according to the virtual camera position control instruction includes:
controlling the live screen to stop displaying the host video frame, and obtaining partial 3D view data corresponding to the angle parameter from 3D image data constructed in advance for the avatar.
Optionally, in a possible implementation, the virtual camera position operation instruction includes angle information; and
the step of controlling the display size and/or display angle of the avatar in the live screen according to the virtual camera position control instruction includes:
controlling the live screen to stop displaying the host video frame, adjusting the 3D image data constructed in advance for the avatar according to the host video frame, and obtaining partial 3D view data corresponding to the angle parameter from the adjusted 3D image data.
Optionally, in a possible implementation, the step of adjusting the 3D image data constructed in advance for the avatar according to the host video frame includes:
obtaining coordinate information of target feature points in the host video frame, and calculating coordinate information of other feature points of the avatar based on that coordinate information; and
adjusting the 3D image data constructed in advance for the avatar according to the coordinate information.
Optionally, in a possible implementation, the virtual camera position operation instruction includes a scaling parameter; and
the step of controlling the display size and/or display angle of the avatar in the live screen according to the virtual camera position control instruction includes:
determining the display size of the avatar in the live screen according to the scaling parameter and an initial size of the avatar.
Optionally, in a possible implementation, the method further includes:
obtaining the number of times the avatar has been displayed at the live broadcast receiver from each display angle; and
determining, according to the display count corresponding to each display angle, an amount of data used when displaying the avatar from that display angle.
Optionally, in a possible implementation, the step of analyzing the host video frame sent by the live broadcast initiator to generate the action control instruction includes:
performing image analysis on each host video frame sent by the live broadcast initiator, and generating the action control instruction according to the image analysis result of each host video frame; or
extracting a current video frame from the host video frames sent by the live broadcast initiator at preset intervals, performing image analysis on the current video frame, and generating the action control instruction according to the image analysis result of the current video frame.
An embodiment of this application further provides an avatar control apparatus, the apparatus including:
a control instruction generation module configured to analyze a host video frame sent by a live broadcast initiator to generate an action control instruction, where the host video frame is obtained by the live broadcast initiator filming a host, and the action control instruction is configured to control an avatar in a live screen of a live broadcast receiver;
a control instruction judgment module configured to determine whether a virtual camera position control instruction corresponding to the host is obtained; and
an avatar control module configured to, when the virtual camera position control instruction is obtained, control the avatar according to the virtual camera position control instruction and the action control instruction.
An embodiment of this application further provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the computer program, when run on the processor, implements the avatar control method described above.
An embodiment of this application further provides a computer-readable storage medium storing a computer program that, when executed, implements the avatar control method described above.
Brief Description of the Drawings
FIG. 1 is a block diagram of an electronic device provided by an embodiment of this application.
FIG. 2 is a flowchart of an avatar control method provided by an embodiment of this application.
FIG. 3 is a system block diagram of a live streaming system provided by an embodiment of this application.
FIG. 4 is a schematic diagram of the effect of controlling an avatar based on a scaling parameter provided by an embodiment of this application.
FIG. 5 is a schematic diagram of another effect of controlling an avatar based on a scaling parameter provided by an embodiment of this application.
FIG. 6 is a schematic diagram of the effect of controlling an avatar based on an angle parameter provided by an embodiment of this application.
FIG. 7 is a schematic diagram of controlling an avatar based on feature points provided by an embodiment of this application.
FIG. 8 is a schematic diagram of the correspondence between the number of feature points and the display count provided by an embodiment of this application.
FIG. 9 is a block diagram of the functional modules included in an avatar control apparatus provided by an embodiment of this application.
Reference numerals: 100 - electronic device; 102 - memory; 104 - processor; 106 - avatar control apparatus; 106a - control instruction generation module; 106b - control instruction judgment module; 106c - avatar control module.
Detailed Description of the Embodiments
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments of this application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. The components of the embodiments of this application, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of different configurations.
Therefore, the following detailed description of the embodiments of this application provided in the drawings is not intended to limit the claimed scope of this application, but merely represents selected embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings. In the description of this application, the terms "first", "second", and the like are used only to distinguish between descriptions and are not to be understood as indicating or implying relative importance.
As shown in FIG. 1, an embodiment of this application provides an electronic device 100. The electronic device 100 can serve as a live streaming device; for example, it may be a back-end server communicatively connected to the terminal device used by a host during a live broadcast.
By way of example, the electronic device 100 may include a memory 102, a processor 104, and an avatar control apparatus 106. The memory 102 and the processor 104 may be electrically connected to each other directly or indirectly to enable data transmission or interaction; for example, they may be electrically connected via one or more communication buses or signal lines. The avatar control apparatus 106 may include at least one software functional module that can be stored in the memory 102 in the form of software or firmware. The processor 104 may be configured to execute executable computer programs stored in the memory 102, such as the software functional modules and computer programs included in the avatar control apparatus 106, so as to control the avatar in the live screen with high accuracy.
In some possible implementations, the memory 102 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or the like.
The processor 104 may be an integrated circuit chip with signal processing capability. The processor 104 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a system on chip (SoC), or the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It can be understood that the structure shown in FIG. 1 is merely illustrative; the electronic device 100 may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1. For example, it may further include a communication unit configured to exchange information with other live streaming devices (such as the terminal device used by the host or the terminal devices used by the audience).
With reference to FIG. 2, an embodiment of this application further provides an avatar control method applicable to the above electronic device 100, where the method steps defined by the flow of the avatar control method can be implemented by the electronic device 100. The specific flow shown in FIG. 2 is described in detail below.
Step 201: analyze a host video frame sent by a live broadcast initiator to generate an action control instruction.
In a possible embodiment, the live broadcast initiator can film the host who is live streaming to obtain the host video frame corresponding to the host, and send the host video frame to the electronic device 100.
In this way, the electronic device 100 can receive the host video frame sent by the live broadcast initiator, analyze the host video frame (for example, by image analysis), and generate the action control instruction based on the analysis result, where the action control instruction can be configured to control the avatar in the live screen of the live broadcast receiver.
Step 203: determine whether a virtual camera position control instruction corresponding to the host is obtained.
In a possible embodiment, after generating the action control instruction in step 201, the electronic device 100 can further determine whether the virtual camera position control instruction corresponding to the host is obtained, and when it is determined that the virtual camera position control instruction is obtained, step 205 can be executed.
Step 205: control the avatar according to the virtual camera position control instruction and the action control instruction.
In a possible embodiment, when the electronic device 100 determines in step 203 that the virtual camera position control instruction corresponding to the host is obtained, it can control the avatar based on the virtual camera position control instruction and the action control instruction. That is, on the basis of controlling the avatar with the action control instruction, the electronic device 100 can also control the avatar in combination with the virtual camera position control instruction, thereby improving the control accuracy.
Moreover, because a virtual camera position control instruction is used, the avatar can be displayed as it would appear from different camera positions, creating the effect of a stage performance in the live room, making the broadcast more expressive, making the avatar display more engaging, and improving the user experience.
It can be understood that, for the host video frame analyzed when the electronic device 100 performs step 201, the embodiments of this application do not limit the manner in which the electronic device 100 obtains the host video frame.
For example, in a possible implementation, with reference to FIG. 3, the electronic device 100 may be a back-end server communicatively connected to a first terminal, and the first terminal may in turn be communicatively connected to an image capture device (such as a camera). The first terminal may be the terminal device used by the host during the live broadcast (such as a mobile phone, tablet, or computer), and the image capture device may be configured to capture images of the host during the broadcast, so as to obtain host video frames and send them to the back-end server via the first terminal.
It should be noted that the image capture device may be a standalone device or integrated with the first terminal; for example, in some possible implementations, the image capture device may be the camera built into a terminal device such as a mobile phone, tablet, or computer.
Moreover, the embodiments of this application do not limit the manner in which the electronic device 100 analyzes the host video frames in step 201. For example, in a possible implementation, when performing step 201, the electronic device 100 may randomly extract video frames from the host video frames and generate the corresponding action control instruction based on the extracted frames.
As another example, in another possible implementation, when performing step 201, the electronic device may extract the current video frame from the host video frames sent by the live broadcast initiator at preset intervals, perform image analysis on the current video frame, and generate the action control instruction according to the image analysis result of the current video frame.
That is, after obtaining the host video frames sent by the live broadcast initiator, the electronic device 100 can extract one video frame (the current host video frame) at every preset interval; then perform image analysis (such as feature extraction) on the extracted frame; and finally generate the corresponding action control instruction based on the analysis result.
In this way, because the electronic device extracts video frames at a certain interval, the action control instruction generated from the extracted frames can largely reflect the host's real movements when controlling the avatar's actions, while also reducing the amount of data processing, easing the load on the processor, and improving the real-time performance of the broadcast.
It should be noted that the embodiments of this application do not limit the policy for the preset interval. For example, the preset interval may be a preset duration (such as 0.1 s, 0.2 s, or 0.3 s), meaning that one video frame is extracted every such duration; it may also be a preset number of frames (such as 1, 2, or 3 frames), meaning that one video frame is extracted every such number of frames.
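The frame-count variant of the preset interval policy can be sketched as follows; the function name and the list-based frame representation are illustrative assumptions.

```python
def sample_frames(frames, every_n):
    """Extract one "current" video frame every preset number of frames,
    keeping the first frame and then every every_n-th frame after it."""
    if every_n < 1:
        raise ValueError("every_n must be >= 1")
    return frames[::every_n]
```

With `every_n=1` every frame is analyzed (the per-frame variant described below), while larger values trade motion fidelity for a lighter processing load.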
As another example, in another possible implementation, in step 201, the electronic device 100 may perform image analysis on every host video frame sent by the live broadcast initiator and generate the action control instruction according to the image analysis result of each host video frame.
That is, for all host video frames sent by the live broadcast initiator, the electronic device 100 can extract every host video frame, perform image analysis (such as feature extraction) on each one, and finally generate the corresponding action control instruction based on the image analysis result of each host video frame.
In this way, because the electronic device generates a corresponding action control instruction for each host video frame, when the avatar is controlled based on these instructions its movements can fully reflect the host's real movements, making the avatar display more lively and the transitions between movements smoother, improving the viewing experience.
It should be noted that when the electronic device 100 performs step 201 and carries out processing such as image analysis and feature extraction, it can use a trained neural network to recognize the host video frame, obtain the host's action posture in the frame, and generate the action control instruction based on that posture.
In addition, in some possible implementations of the embodiments of this application, when performing step 203, the electronic device 100 can determine whether a virtual camera position control instruction corresponding to the host sent by the live broadcast receiver is received; that is, the live broadcast receiver can send the virtual camera position control instruction directly to the electronic device 100, and the electronic device 100 then executes step 205 according to the received instruction, so that the live broadcast receiver can control the avatar corresponding to the host.
In some possible implementations, upon receiving a virtual camera position operation instruction sent by the live broadcast receiver, the electronic device 100 can determine, according to a first preset condition determined based on user history data corresponding to the live broadcast receiver, whether the virtual camera position operation instruction meets the first preset condition; if it does, the electronic device 100 can determine that the virtual camera position operation instruction is obtained.
That is, the electronic device 100 first detects whether a virtual camera position operation instruction sent by the live broadcast receiver is received; then, upon receiving one, it determines whether the instruction meets the first preset condition determined based on the user history data; finally, only when the instruction meets the first preset condition is it determined that the virtual camera position operation instruction is obtained.
In this way, only users with particular user history data can control the display of the avatar, which increases users' enthusiasm for watching the broadcast.
In some possible implementations, the user history data may include, but is not limited to, the user's level, the duration of watching the broadcast, the number of bullet comments sent, and the number or value of gifts given. For example, only when the user's level reaches a certain level (such as level 10 or level 15) can the electronic device 100, upon receiving a virtual camera position control instruction, determine that the instruction is obtained.
It should be noted that when determining, based on the user history data, whether the virtual camera position operation instruction is obtained, the electronic device 100 can also make a more fine-grained determination. For example, depending on the user history data, the types of virtual camera position operation instructions that can be obtained may also differ.
In a possible implementation, taking the user level as the user history data as an example: suppose there are five kinds of virtual camera position operation instructions, namely a first operation instruction, a second operation instruction, a third operation instruction, a fourth operation instruction, and a fifth operation instruction. If the user's level falls within the interval [0, 5], only the first operation instruction, when received, can be determined as obtained; if the user's level falls within the interval (5, 10], only the first or second operation instruction, when received, can be determined as obtained; and so on, so that if the user's level falls within the interval (20, +∞), any of the five virtual camera position operation instructions, when received, can be determined as obtained.
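The level-interval scheme in the example above can be sketched as follows. The intervals [0, 5] and (5, 10] and (20, +∞) come from the example; the intermediate intervals (10, 15] and (15, 20] are an assumed extrapolation of the same pattern, and the integer instruction labels are illustrative.

```python
def allowed_instructions(user_level):
    """Return the set of virtual camera position operation instruction
    types (labeled 1..5) a viewer may use, based on which level
    interval the viewer falls into."""
    thresholds = [5, 10, 15, 20]  # upper bounds of the first four tiers
    tier = 1
    for bound in thresholds:
        if user_level > bound:
            tier += 1
    # Tier k unlocks instructions 1 through k.
    return set(range(1, tier + 1))
```

A received operation instruction would then be determined as obtained only if its type is in the set returned for the sending user's level.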
另外,在本申请实施例其他一些可能的实现方式,电子设备100在执行步骤203时,还可以采用例如信息提取等方式,判断是否获得基于主播对应的信息生成的虚拟机位控制指令;也就是说,直播发起端还可以不将虚拟机位控制指令直接发送给电子设备100,而是由电子设备基于主播对应的信息进行提取生成,从而根据提取生成的虚拟机位控制指令执行步骤205。
并且,电子设备100在判断是否获得基于主播的信息生成的虚拟机位控制指令时,根据虚拟机位控制指令的生成方式不同,判断的方式也可以不同。
例如,在一种可能的实现方式中(示例一),该虚拟机位控制指令可以基于主播对应的操作信息生成。示例性地,上述的第一终端可以响应主播的操作生成对应的虚拟机位控制指令,并将该虚拟机位控制指令发送至上述的后台服务器。并且,该后台服务器可以在接收到虚拟机位控制指令时,判定获得虚拟机位控制指令。
其中,在本申请实施例提供的方案中,主播对第一终端的操作的方式不受限制,可以包括,但不限于主播对第一终端上的按键(如实体按键或屏幕虚拟按键)、键盘、鼠标以及麦克风等输入设备的操作。例如,主播既可以通过键盘输入一段文字信息或通过麦克风输入一段语音信息(如“放大2倍”或“展示背面”等,或者,也可以是一些简单的数字或字词,如“1”就代表放大1倍,“2”就代表放大2倍,只需要预先建立对应关系即可),也可以通过鼠标执行特定的动作(如点击第一终端展示的虚拟形象之后,往左边、右边等方向移动鼠标,当第一终端识别到该动作之后,可以基于预先建立的对应关系生成对应的虚拟机位控制指令)。
也就是说,在一种可能的实现方式中,电子设备100在接收到基于主播对应的操作信息(通过麦克风对第一终端设备进行操作)生成的语音信息时,可以判断该语音信息中是否具有第一预设信息,并在具有该第一预设信息时,判定获取基于主播对应的操作信息生成的虚拟机位控制指令。
其中,示例性地,上述的第一预设信息可以是关键词信息或其它信息。例如,在语音信息为歌曲(如设备播放或主播唱的)时,上述的第一预设信息还可以是旋律特征信息。也就是说,电子设备100可以利用训练完成的神经网络识别第一终端发送的语音信息的旋律特征,并根据识别获得的旋律特征确定虚拟机位控制指令。例如,在轻柔的旋律中,电 子设备100可以生成头顶机位渐远的控制指令。在高潮或者副歌的旋律中,电子设备100可以生成脸部机位放大的控制指令。
又例如,在另一种可能的实现方式中(示例二),虚拟机位控制指令也可以电子设备100基于执行步骤201时对主播视频帧进行分析得到的结果生成。
也就是说,电子设备100还可以基于对直播发起端发送的主播视频帧进行分析得到的结果,判断是否获得基于主播对应的信息生成的虚拟机位控制指令。
示例性地,电子设备100可以对主播视频帧进行信息提取,以判断得到的图像信息中是否具有第二预设信息,并且,在具有该第二预设信息时,电子设备100可以基于该第二预设信息生成对应的虚拟机位控制指令,并判定获取到虚拟机位控制指令。
其中,本申请实施例对于上述的第二预设信息的具体内容不进行限制,例如,该第二预设信息可以包括,但不限于动作信息、深度信息或其它信息等。比如,示例性地,在一种可能的实现方式中,上述的第二预设信息可以为动作信息。
也就是说,在一些可能的实现方式中,电子设备100可以基于主播的特定动作生成对应的虚拟机位控制指令,例如,主播在伸出左手时,可以生成展示虚拟形象的左侧面的控制指令;主播在伸出右手时,可以生成展示虚拟形象的右侧面的控制指令;主播在左手与右手接触时,可以生成展示虚拟形象的背面的控制指令;主播在蹲下时,可以生成展示虚拟形象的头顶部的控制指令。
在一种可能的实现方式中,上述的其它信息可以是标识物件或标识颜色等信息。也就是说,主播可以携带标识物件或者穿戴具有标识颜色的衣物或配饰,使得电子设备100在执行步骤203时,可以通过识别该标识物件或者该标识颜色的方式获得虚拟机位控制指令。
例如，在一些可能的实现方式中，按照识别到的物件由大到小，或识别到的颜色依红、橙、黄、绿、青、蓝、紫的顺序，可以生成机位渐近的控制指令。也就是说，主播可以在不同部位携带多种大小不同的标识物件，或穿戴多种颜色不同的衣物或配饰，并在不同的时刻做出不同的动作；如此，电子设备100在执行步骤203时，可以根据识别到的标识物件或标识颜色的不同，控制虚拟形象展示出由远景到近景或由近景到远景的舞台效果。
另外,在一些可能的实现方式中,为了提高主播进行直播的积极性,电子设备100在执行步骤203时,还可以基于主播对应的历史直播数据判断是否获得虚拟机位控制指令。
示例性地，在上述的示例一中，电子设备100在接收到第一终端发送的虚拟机位控制指令之后，或者在上述的示例二中，在基于第一预设信息或者第二预设信息生成对应的虚拟机位控制指令之后，电子设备100还可以判断该虚拟机位控制指令是否符合第二预设条件；其中，该第二预设条件基于该主播对应的历史直播数据确定。并且，只有在该虚拟机位控制指令符合第二预设条件时，才能判定获得该虚拟机位控制指令。
其中,在一种可能的实现方式中,上述的主播对应的历史直播数据可以为主播的等级,并且,等级越高,能够判定获得的虚拟机位控制指令的数量就越多。例如,若主播的等级小于5级,可以判定不能获得任何的虚拟机位控制指令;若主播的等级大于或等于5级、小于或等于10级,可以判定能够获得部分的虚拟机位控制指令;若主播的等级大于10级,可以判定能够获得任何的虚拟机位控制指令。
需要说明的是,在上述示例中,是按照一定的等级范围对是否获得虚拟机位控制指令进行判断,在其它的一些示例中,也可以是针对每一个等级确定可以获得不同的虚拟机位控制指令。
另外,在本申请实施例其他一些可能的实现方式中,上述的主播对应的历史直播数据还可以包括主播在直播时收到的礼物的数量或价值、主播在直播时的观众的弹幕量,以及主播在直播时观看直播的最大观众数量等。例如,收到的礼物的数量越多或价值越高,弹幕量越大,或者最大观众数量越大,判定能够获得的虚拟机位控制指令可以越多。
并且，电子设备100在执行步骤203对是否获得虚拟机位控制指令进行判断之后，一方面，在判定获得虚拟机位控制指令时，可以执行步骤205。另一方面，在判定未获得虚拟机位控制指令时，具体的处理方式不受限制；比如，在一种可能的实现方式中，电子设备100可以根据动作控制指令对虚拟形象进行控制。
也就是说,在主播进行直播时,若电子设备100有获得虚拟机位控制指令,则根据虚拟机位控制指令和动作控制指令对虚拟形象进行控制;若电子设备100未获得虚拟机位控制指令,则仅根据动作控制指令对虚拟形象进行控制。
另外,本申请实施例对于电子设备100执行步骤205的方式也不进行限制,可以根据实际应用需求进行选择,如处理器104的性能、虚拟形象的控制精度等。
例如,在一种可能的实现方式中,电子设备100执行步骤205的方式可以如下:根据动作控制指令对虚拟形象在直播画面中的展示姿态进行控制;根据虚拟机位控制指令对虚拟形象在直播画面中的展示大小进行控制,或者是对虚拟形象在直播画面中的展示角度进行控制,或者是对虚拟形象在直播画面中的展示大小和展示角度进行控制。
也就是说,一方面,电子设备100可以根据动作控制指令对虚拟形象的展示姿态进行控制;另一方面,在对虚拟形象的展示姿态进行控制的基础上,电子设备100还可以基于获得的虚拟机位控制指令,对虚拟形象在该展示姿态时的展示大小进行控制,或者是对虚拟形象在该展示姿态时的展示角度进行控制,又或者是对虚拟形象在该展示姿态时的展示大小和展示角度进行控制。
例如,若主播当前在跳舞,则电子设备100可以基于动作控制指令控制虚拟形象进行跳舞。此时,若电子设备100获得虚拟机位控制指令,则可以根据该虚拟机位控制指令,对虚拟形象在跳舞状态下不同的展示大小进行控制,或者是对虚拟形象在跳舞状态下不同的展示角度进行控制,又或者是对虚拟形象在跳舞状态下不同的展示大小和不同的展示角度进行控制。
其中,在一些可能的实现方式中,上述的展示姿态可以包括,但不限于,踢脚、拍手、弯腰、抖肩、摇头等动作,以及皱眉、大笑、微笑、怒目等表情。并且,本申请实施例对虚拟形象进行控制的方式也不进行限制,在一种可能的实现方式中,电子设备100可以基于预先确定的特征点进行控制。
另外,作为一种可能的实现方式,为提高用户的体验,电子设备100还可以基于虚拟机位操作指令携带的信息对虚拟形象进行相应的控制。也就是说,用户可以对直播接收端进行不同的操作,使得直播接收端可以基于不同的操作生成携带不同信息的虚拟机位操作指令。
其中,本申请实施例中用户对直播接收端进行操作的方式不进行限制,例如,该操作方式可以包括用户对触摸屏、鼠标、键盘、麦克风等输入设备进行操作。并且,本申请实施例对于虚拟机位操作指令中携带的信息也不进行限制,可以根据实际应用需求进行选择。
例如,在一种可能的实现方式中,虚拟机位操作指令中可以包括缩放参数。也就是说,电子设备100在执行步骤205时,可以根据缩放参数和主播视频帧中虚拟形象的初始大小控制在直播接收端的直播画面中显示的虚拟形象的展示大小。
其中,根据控制精度的需求不同,电子设备100根据缩放参数对虚拟形象的展示大小进行控制的方式也可以不同。
例如,在控制精度的需求较低时,若电子设备100获得的虚拟机位操作指令中包括缩放参数,电子设备100则控制虚拟形象在初始大小的基础上放大特定倍数(如2倍、3倍或5倍等)或缩小特定倍数(如0.2倍、0.5倍或0.8倍等)。
又例如，在控制精度的需求较高时，电子设备100可以根据虚拟机位操作指令中缩放参数的具体数值，控制虚拟形象在初始大小的基础上放大或缩小不同的倍数。如图4所示，在缩放参数为2时，可以控制虚拟形象在初始大小的基础上放大2倍；如图5所示，在缩放参数为0.5时，可以控制虚拟形象缩小至初始大小的0.5倍。
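上述根据缩放参数与初始大小计算展示大小的过程，可以用如下 Python 草图示意（其中对缩放参数做上下限夹取属于示例假设）：

```python
# 示意性草图：根据虚拟机位操作指令中的缩放参数与虚拟形象初始大小，
# 计算直播画面中虚拟形象的展示大小。夹取区间为示例假设。

def display_size(initial_w: float, initial_h: float, scale: float,
                 min_scale: float = 0.1, max_scale: float = 10.0):
    """按缩放参数在初始大小基础上放大或缩小，并限制在合理区间内。"""
    s = max(min_scale, min(max_scale, scale))
    return initial_w * s, initial_h * s
```

例如，初始大小为 100×200、缩放参数为 2 时，展示大小为 200×400；缩放参数为 0.5 时，展示大小为 50×100。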
又例如，在另一种可能的实现方式中，虚拟机位操作指令中可以包括角度参数。如此，电子设备100在执行步骤205时，可以根据该角度参数控制在直播接收端的直播画面中显示的虚拟形象的展示角度。
同理,根据控制精度的需求不同,电子设备100在根据角度参数对虚拟形象的展示角度进行控制的方式也可以不同。
例如,在控制精度的需求较低时,若电子设备100获得的虚拟机位操作指令中包括角度参数,电子设备100则可以控制虚拟形象在特定角度(如背面、左侧面或右侧面)下进行展示。
又例如，在控制精度的需求较高时，电子设备100可以根据虚拟机位操作指令中角度参数的具体数值，控制虚拟形象在相应的角度下进行展示。如图6所示，在角度参数为180°时，电子设备100可以控制虚拟形象展示背面；在角度参数为90°时，电子设备100可以控制虚拟形象展示左侧面；在角度参数为270°时，电子设备100可以控制虚拟形象展示右侧面。
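上述角度参数到展示角度（正面、左侧面、背面、右侧面）的对应关系，可以用如下 Python 草图示意（以 45° 为界的象限划分方式为示例假设）：

```python
# 示意性草图：将虚拟机位操作指令中的角度参数映射到展示角度。
# 90° 对应左侧面、180° 对应背面、270° 对应右侧面，划分边界为示例假设。

def view_for_angle(angle_deg: float) -> str:
    a = angle_deg % 360
    # 以 45° 为界划分四个朝向
    if a < 45 or a >= 315:
        return "front"
    if a < 135:
        return "left"    # 90° → 左侧面
    if a < 225:
        return "back"    # 180° → 背面
    return "right"       # 270° → 右侧面
```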
需要说明的是,根据实际的应用需求不同,电子设备100在基于角度参数对虚拟形象进行控制时的操作方式也可以不同。
例如,在一种可能的实现方式中,电子设备100在根据角度参数对虚拟形象进行控制时,可以控制直播接收端的直播画面停止显示主播视频帧,并获取预先针对虚拟形象构建的三维图像数据中该角度参数对应的部分三维视角数据。
也就是说，在直播接收端的直播画面显示主播视频帧的过程中，若用户对直播接收端进行操作，使得该直播接收端生成对应的虚拟机位操作指令并发送至电子设备100（后台服务器），则电子设备100可以基于该虚拟机位操作指令停止向直播接收端发送主播视频帧，以控制直播接收端的直播画面停止显示主播视频帧。
并且,电子设备100可以根据虚拟机位指令中的角度参数,在预先针对虚拟形象构建的三维图像数据中获取对应的部分三维视角数据。例如,若角度参数为90°,则电子设备100可以获取三维图像数据中左侧面对应的部分三维视角数据;若角度参数为180°,则电子设备100可以获取三维图像数据中背面对应的部分三维视角数据。最后,电子设备100将获取的部分三维视角数据发送给直播接收端进行可视化处理,以完成对虚拟形象的控制。如此,可以较快的获取到与角度参数相对应的部分三维视角数据,使得数据的处理量较小,可以有效地保证直播具有较高的实时性。
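上述"根据角度参数获取部分三维视角数据"的过程，可以用如下 Python 草图示意：以各特征点在 x-y 平面内的方位角近似判断其是否落在虚拟机位朝向的视角窗口内（点的数据结构与 90° 视角窗口均为示例假设）：

```python
# 示意性草图：根据角度参数从预先构建的三维图像数据中抽取对应的
# 部分三维视角数据。此处以方位角是否落入虚拟机位朝向 ±fov/2 的
# 窗口作近似筛选，坐标系约定与窗口大小均为示例假设。
import math

def partial_view(points, camera_angle_deg: float, fov_deg: float = 90.0):
    """points: [(x, y, z), ...]，方位角在 x-y 平面内度量。
    返回方位角落在虚拟机位朝向 ±fov/2 范围内的特征点子集。"""
    half = fov_deg / 2.0
    out = []
    for x, y, z in points:
        azimuth = math.degrees(math.atan2(y, x)) % 360
        # 计算与机位朝向的最小角差（处理 360° 环绕）
        diff = abs((azimuth - camera_angle_deg + 180) % 360 - 180)
        if diff <= half:
            out.append((x, y, z))
    return out
```

例如，角度参数为 180° 时，仅保留朝向背面方向的特征点数据，数据处理量因此较小，有利于保证直播的实时性。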
又例如，在另一种可能的实现方式中，电子设备100在根据角度参数对虚拟形象进行控制时，可以控制直播接收端的直播画面停止显示主播视频帧，根据该主播视频帧对预先针对虚拟形象构建的三维图像数据进行调整，并获取在调整后的三维图像数据中角度参数对应的部分三维视角数据。
也就是说,在直播接收端的直播画面显示主播视频帧的过程中,若用户对直播接收端进行操作,使得该直播接收端生成对应的虚拟机位操作指令并发送至电子设备100(后台服务器),电子设备100可以基于该虚拟机位操作指令停止向直播接收端发送主播视频帧,以控制直播接收端的直播画面停止显示主播视频帧。
并且,电子设备100可以根据主播视频帧对预先针对虚拟形象构建的三维图像数据进行调整,以得到新的三维图像数据。然后,再从新的三维图像数据中获取与角度参数对应的部分三维视角数据。例如,若角度参数为90°,则电子设备100可以获取新的三维图像数据中左侧面对应的部分三维视角数据;若角度参数为180°,则电子设备100可以获取新的三维图像数据中背面对应的部分三维视角数据。最后,电子设备100可以将获取的部分三维视角数据发送给直播接收端进行可视化处理,以完成对虚拟形象的控制。如此,可以使得获取的部分三维视角数据能够在较大程度上反映主播的实际动作,从而使得虚拟形象在展示不同的角度时,也具有较高的逼真程度,从而提升用户的体验度。
其中，本申请实施例对于电子设备100根据主播视频帧对三维图像数据进行调整的方式不进行限制。例如，在一种可能的实现方式中，电子设备100可以通过以下方式对三维图像数据进行调整：电子设备100可以获取主播视频帧中的目标特征点的坐标信息，并基于该坐标信息计算得到虚拟形象的其它特征点的坐标信息；然后根据坐标信息对预先针对虚拟形象构建的三维图像数据进行调整。
也就是说,电子设备100可以获取主播视频帧中的各目标特征点(虚拟形象的正面的各目标特征点,如眼睛、鼻子、嘴巴、耳朵等对应的特征点)的坐标信息(三维坐标,具有深度信息);然后,电子设备100可以基于获取的坐标信息计算得到虚拟形象的其它特征点(虚拟形象的三维模型中目标特征点以外的特征点,如背面才能看到的特征点)的坐标信息;最后,电子设备100可以基于其它特征点的坐标信息,对预先构建的三维图像数据中与其它特征点对应的部分数据进行调整,以得到新的三维图像数据。
其中,基于目标特征点的坐标信息计算其它特征点的坐标信息的算法,可以是反(逆)向运动算法。
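作为示意，下面的 Python 草图以"沿深度方向加固定偏移"这一简化几何假设，展示由正面目标特征点估算背面其它特征点坐标的整体流程；实际方案可如上所述采用反（逆）向运动学等算法，此处的偏移方式、坐标系约定与特征点命名均为假设：

```python
# 示意性草图：由主播视频帧中目标特征点（正面，含深度的三维坐标）
# 估算三维模型中其它特征点（如背面）的坐标。真实方案可采用反（逆）向
# 运动学算法；此处仅以固定厚度偏移作简化几何假设，示意调整流程。

def estimate_back_points(front_points, thickness: float = 0.2):
    """front_points: {名称: (x, y, z)}；假设 z 轴为深度方向，
    背面特征点位于对应正面特征点之后 thickness 处。"""
    return {name + "_back": (x, y, z - thickness)
            for name, (x, y, z) in front_points.items()}
```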
需要说明的是,在按照本申请实施例提供的上述方案对主播视频帧中各特征点进行调整后,直播接收端播放的视频帧即为:基于对三维图像数据中部分三维视角数据(正面部分)调整后的数据。另外,针对上述的目标特征点,已经完成了数据的调整;因此,在本申请实施例提供的上述方案中,只需对其它特征点对应的部分数据进行调整即可。
另外，在对虚拟形象进行控制时，还可以根据实际应用需求对显示虚拟形象时的数据量进行控制。例如，在对实时性要求较高时，可以显示较少的数据量；在对控制精度要求较高时，可以显示较多的数据量。
在一种可能的实现方式中，为了在保证具有较高的控制精度以确保用户的体验的基础上，还能降低数据的处理量，以使虚拟形象的直播实时性更好，电子设备100可以通过以下步骤确定显示虚拟形象时显示的数据量：获取虚拟形象基于各展示角度在直播接收端的显示次数；根据各展示角度对应的显示次数确定在基于该展示角度对虚拟形象进行显示时的数据量。
也就是说,电子设备100可以获取虚拟形象在所有的直播时间中或者在较近的一段直播时间中,各个展示角度对应的显示次数,例如,假定在最近的一个月内,展示角度为90°(左侧面)对应的显示次数为3000次,展示角度为180°(背面)对应的显示次数为7000次,展示角度为270°(右侧面)对应的显示次数为2000次。
然后，电子设备100可以基于获取的各显示次数，确定对应的展示角度的数据量。例如，显示次数越大，在基于对应的展示角度进行显示时，显示的数据量就可以越大。如此，在上述示例中，由于展示角度为180°（背面）时，显示次数（7000次）最大，电子设备100可以控制在基于该展示角度进行显示时的数据量也最大；由于展示角度为270°（右侧面）时，显示次数（2000次）最小，电子设备100可以控制在基于该展示角度进行显示时的数据量也最小。
其中,考虑到在对虚拟形象进行控制时,一般是基于对预先确定的特征点进行控制。因此,上述的数据量可以是指特征点的数量。也就是说,可以根据各展示角度对应的显示次数确定在基于该展示角度对虚拟形象进行显示时特征点的数量(如图7所示)。
例如,在上述示例中,展示角度为180°(背面)时,显示次数为7000次,对应地,可以控制的特征点的数量可以为300个;展示角度为90°(左侧面)时,显示次数为3000次,对应地,可以控制的特征点的数量可以为200个;展示角度为270°(右侧面)时,显示次数为2000次,对应地,可以控制的特征点的数量可以为150个。
示例性地,在一种可能的实现方式中,电子设备100可以预先建立特征点数量和显示次数之间的对应关系,从而在获取到显示次数之后,可以直接根据该对应关系得到特征点数量。如图8所示,对应关系可以为:显示次数越大,对应的特征点数量也就越大。
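上述"显示次数越大、特征点数量越多"的对应关系，可以用如下 Python 草图示意（阈值与特征点数量取自正文示例，低于最小阈值时的取值与分段方式为示例假设）：

```python
# 示意性草图：按各展示角度的历史显示次数确定该角度下用于显示/控制
# 的特征点数量。阈值 2000/3000/7000 与数量 150/200/300 取自正文示例，
# 低于 2000 次时取 100 个特征点为本示例的假设。
import bisect

_THRESHOLDS = [2000, 3000, 7000]          # 显示次数阈值，升序
_NUM_POINTS = [100, 150, 200, 300]        # 各分段对应的特征点数量

def feature_point_count(display_count: int) -> int:
    """显示次数达到某一阈值时，取该阈值对应的特征点数量。"""
    i = bisect.bisect_right(_THRESHOLDS, display_count)
    return _NUM_POINTS[i]
```

例如，显示次数为 7000 次的背面视角分配 300 个特征点，而 2000 次的右侧面视角仅分配 150 个。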
结合图9,本申请实施例还提供一种可应用于上述电子设备100的虚拟形象控制装置106。其中,虚拟形象控制装置106可以包括控制指令生成模块106a、控制指令判断模块106b和虚拟形象控制模块106c。
控制指令生成模块106a,可以被配置成对直播发起端发送的主播视频帧进行分析,生成动作控制指令;其中,主播视频帧由直播发起端对主播进行拍摄得到,动作控制指令被配置成对直播接收端直播画面中的虚拟形象进行控制;在一实施例中,该控制指令生成模块106a可以执行图2所示的步骤201,关于控制指令生成模块106a的相关内容可以参照本申请实施例前述对步骤201的描述。
控制指令判断模块106b,可以被配置成判断是否获得主播对应的虚拟机位控制指令;在一实施例中,控制指令判断模块106b可以执行图2所示的步骤203,关于控制指令判断模块106b的相关内容可以参照本申请实施例前述对步骤203的描述。
虚拟形象控制模块106c,可以被配置成在获得虚拟机位控制指令时,根据虚拟机位控制指令和动作控制指令对虚拟形象进行控制;在一实施例中,虚拟形象控制模块106c可以执行图2所示的步骤205,关于虚拟形象控制模块106c的相关内容可以参照本申请实施例前述对步骤205的描述。
其中,在控制指令判断模块106b判断出未获得虚拟机位控制指令时,虚拟形象控制模块106c,还可以被配置成根据动作控制指令对虚拟形象进行控制。
在本申请实施例中,对应于上述的虚拟形象控制方法,还提供了一种计算机可读存储介质,该计算机可读存储介质中存储有计算机程序,该计算机程序运行时执行上述虚拟形象控制方法的各个步骤。
其中,前述计算机程序运行时执行的各步骤,在此不再一一赘述,可参考前文对虚拟形象控制方法的解释说明。
综上,本申请提供的虚拟形象控制方法、装置、电子设备及存储介质,在基于直播发起端发送的主播视频帧对虚拟形象进行控制的基础上,若还获得主播对应的虚拟机位控制指令,还可以结合该虚拟机位控制指令一起对虚拟形象进行控制,以展示不同机位下的虚拟形象,从而营造出舞台表演的效果,进而提高虚拟形象展示的趣味性,提升虚拟形象直播过程中的用户体验。
以上所述仅为本申请的部分实施例而已,并不用于限制本申请,对于本领域的技术人员来说,本申请可以有各种更改和变化。凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。
工业实用性
在基于直播发起端发送的主播视频帧对虚拟形象进行控制的基础上，若还获得主播对应的虚拟机位控制指令，则可以结合该虚拟机位控制指令一起对虚拟形象进行控制，以展示不同机位下的虚拟形象，从而营造出舞台表演的效果，进而提高虚拟形象展示的趣味性，提升虚拟形象直播过程中的用户体验。

Claims (21)

  1. 一种虚拟形象控制方法,其特征在于,所述方法包括:
    对直播发起端发送的主播视频帧进行分析,生成动作控制指令;其中,所述主播视频帧由所述直播发起端对主播进行拍摄得到,所述动作控制指令被配置成对直播接收端直播画面中的虚拟形象进行控制;
    判断是否获得所述主播对应的虚拟机位控制指令;
    若获得所述虚拟机位控制指令,则根据所述虚拟机位控制指令和所述动作控制指令对所述虚拟形象进行控制。
  2. 根据权利要求1所述的虚拟形象控制方法,其特征在于,所述判断是否获得所述主播对应的虚拟机位控制指令的步骤,包括:
    判断是否接收到所述直播接收端发送的与所述主播对应的虚拟机位控制指令。
  3. 根据权利要求2所述的虚拟形象控制方法,其特征在于,所述判断是否接收到所述直播接收端发送的与所述主播对应的虚拟机位控制指令的步骤,包括:
    在接收到所述直播接收端发送的虚拟机位操作指令时,判断所述虚拟机位操作指令是否符合第一预设条件,其中,所述第一预设条件基于所述直播接收端对应的用户历史数据确定;
    若所述虚拟机位操作指令符合所述第一预设条件,则判定获得所述虚拟机位操作指令。
  4. 根据权利要求1所述的虚拟形象控制方法,其特征在于,所述判断是否获得所述主播对应的虚拟机位控制指令的步骤,包括:
    判断是否获得基于所述主播对应的信息生成的虚拟机位控制指令。
  5. 根据权利要求4所述的虚拟形象控制方法,其特征在于,所述判断是否获得基于所述主播对应的信息生成的所述虚拟机位控制指令的步骤,包括:
    判断是否获得基于所述主播对应的操作信息生成的虚拟机位控制指令。
  6. 根据权利要求5所述的虚拟形象控制方法,其特征在于,所述判断是否获得基于所述主播对应的操作信息生成的虚拟机位控制指令的步骤,包括:
    在接收到基于所述主播对应的操作信息生成的语音信息时,判断该语音信息中是否具有第一预设信息,并在具有该第一预设信息时,判定获取基于所述主播对应的操作信息生成的虚拟机位控制指令。
  7. 根据权利要求6所述的虚拟形象控制方法,其特征在于,所述第一预设信息包括关键词信息和/或旋律特征信息。
  8. 根据权利要求4所述的虚拟形象控制方法,其特征在于,所述判断是否获得基于所述主播对应的信息生成的虚拟机位控制指令的步骤,包括:
    基于对所述主播视频帧进行分析得到的结果,判断是否获得基于所述主播对应的信息生成的虚拟机位控制指令。
  9. 根据权利要求8所述的虚拟形象控制方法,其特征在于,所述基于对所述主播视频帧进行分析得到的结果,判断是否获得基于所述主播对应的信息生成的虚拟机位控制指令的步骤,包括:
    基于对所述主播视频帧进行信息提取得到的图像信息,判断该图像信息中是否具有第二预设信息,并在具有该第二预设信息时,判定获得基于所述主播对应的信息生成的虚拟机位控制指令。
  10. 根据权利要求9所述的虚拟形象控制方法,其特征在于,所述第二预设信息包括动作信息、深度信息、标识物件信息和/或标识颜色信息。
  11. 根据权利要求1所述的虚拟形象控制方法,其特征在于,所述判断是否获得所述主播对应的虚拟机位控制指令的步骤,包括:
    在接收到所述直播接收端发送的虚拟机位操作指令时，判断所述虚拟机位操作指令是否符合第二预设条件；其中，所述第二预设条件基于所述主播对应的用户历史数据确定；
    若所述虚拟机位操作指令符合所述第二预设条件，则判定获得所述虚拟机位操作指令。
  12. 根据权利要求1-11中任一项所述的虚拟形象控制方法,其特征在于,所述根据所述虚拟机位控制指令和所述动作控制指令对所述虚拟形象进行控制的步骤,包括:
    根据所述动作控制指令对所述虚拟形象在所述直播画面中的展示姿态进行控制;
    根据所述虚拟机位控制指令对所述虚拟形象在所述直播画面中的展示大小和/或展示角度进行控制。
  13. 根据权利要求12所述的虚拟形象控制方法,其特征在于,所述虚拟机位操作指令中包括角度参数;
    所述根据所述虚拟机位控制指令对所述虚拟形象在所述直播画面中的展示大小和/或展示角度进行控制的步骤,包括:
    控制所述直播画面停止显示所述主播视频帧,并获取预先针对所述虚拟形象构建的三维图像数据中所述角度参数对应的部分三维视角数据。
  14. 根据权利要求12所述的虚拟形象控制方法，其特征在于，所述虚拟机位操作指令中包括角度参数；
    所述根据所述虚拟机位控制指令对所述虚拟形象在所述直播画面中的展示大小和/或展示角度进行控制的步骤,包括:
    控制所述直播画面停止显示所述主播视频帧,根据该主播视频帧对预先针对所述虚拟形象构建的三维图像数据进行调整,并获取在调整后的三维图像数据中所述角度参数对应的部分三维视角数据。
  15. 根据权利要求14所述的虚拟形象控制方法,其特征在于,所述根据该主播视频帧对预先针对所述虚拟形象构建的三维图像数据进行调整的步骤,包括:
    获取所述主播视频帧中的目标特征点的坐标信息,并基于该坐标信息计算得到虚拟形象的其它特征点的坐标信息;
    根据所述坐标信息对预先针对所述虚拟形象构建的三维图像数据进行调整。
  16. 根据权利要求12所述的虚拟形象控制方法,其特征在于,所述虚拟机位操作指令中包括缩放参数;
    所述根据所述虚拟机位控制指令对所述虚拟形象在所述直播画面中的展示大小和/或展示角度进行控制的步骤,包括:
    根据所述缩放参数和所述虚拟形象的初始大小确定在所述直播画面中虚拟形象的展示大小。
  17. 根据权利要求12所述的虚拟形象控制方法,其特征在于,还包括:
    获取所述虚拟形象基于各展示角度在所述直播接收端的显示次数;
    根据各展示角度对应的显示次数确定在基于该展示角度对所述虚拟形象进行显示时的数据量。
  18. 根据权利要求1所述的虚拟形象控制方法,其特征在于,所述对直播发起端发送的主播视频帧进行分析,生成动作控制指令的步骤,包括:
    对直播发起端发送的每一主播视频帧进行图像分析,并根据每一主播视频帧的图像分析结果生成动作控制指令;或
    每隔预设周期提取直播发起端发送的主播视频帧中的当前视频帧,对该当前视频帧进行图像分析,并根据对该当前视频帧的图像分析结果生成动作控制指令。
  19. 一种虚拟形象控制装置,其特征在于,所述装置包括:
    控制指令生成模块,被配置成对直播发起端发送的主播视频帧进行分析,生成动作控制指令;其中,所述主播视频帧由所述直播发起端对主播进行拍摄得到,所述动作控制指令被配置成对直播接收端直播画面中的虚拟形象进行控制;
    控制指令判断模块,被配置成判断是否获得所述主播对应的虚拟机位控制指令;
    虚拟形象控制模块,被配置成在获得所述虚拟机位控制指令时,根据所述虚拟机位控制指令和所述动作控制指令对所述虚拟形象进行控制。
  20. 一种电子设备,其特征在于,包括存储器、处理器和存储于该存储器并能够在该处理器上运行的计算机程序,该计算机程序在该处理器上运行时实现权利要求1-18任意一项所述的虚拟形象控制方法。
  21. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,该程序被执行时实现权利要求1-18任意一项所述的虚拟形象控制方法。
PCT/CN2020/087139 2019-04-30 2020-04-27 一种虚拟形象控制方法、装置、电子设备及存储介质 WO2020221186A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
SG11202111640RA SG11202111640RA (en) 2019-04-30 2020-04-27 Virtual image control method, apparatus, electronic device and storage medium
US17/605,476 US20220214797A1 (en) 2019-04-30 2020-04-27 Virtual image control method, apparatus, electronic device and storage medium

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201910358384.7A CN110087121B (zh) 2019-04-30 2019-04-30 虚拟形象显示方法、装置、电子设备和存储介质
CN201910358491.X 2019-04-30
CN201910358491.XA CN110119700B (zh) 2019-04-30 2019-04-30 虚拟形象控制方法、虚拟形象控制装置和电子设备
CN201910358384.7 2019-04-30

Publications (1)

Publication Number Publication Date
WO2020221186A1 true WO2020221186A1 (zh) 2020-11-05

Family

ID=73029459

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/087139 WO2020221186A1 (zh) 2019-04-30 2020-04-27 一种虚拟形象控制方法、装置、电子设备及存储介质

Country Status (3)

Country Link
US (1) US20220214797A1 (zh)
SG (1) SG11202111640RA (zh)
WO (1) WO2020221186A1 (zh)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113596506A (zh) * 2021-08-09 2021-11-02 深圳市佳创视讯技术股份有限公司 直播缓存的性能优化方法、系统、电子装置及存储介质
CN113784160A (zh) * 2021-09-09 2021-12-10 北京字跳网络技术有限公司 视频数据生成方法、装置、电子设备及可读存储介质
CN113973190A (zh) * 2021-10-28 2022-01-25 联想(北京)有限公司 视频虚拟背景图像处理方法、装置及计算机设备
WO2022213727A1 (zh) * 2021-04-07 2022-10-13 北京字跳网络技术有限公司 直播交互方法、装置、电子设备和存储介质
CN115243064A (zh) * 2022-07-18 2022-10-25 北京字跳网络技术有限公司 一种直播控制方法、装置、设备和存储介质
CN117608410A (zh) * 2024-01-17 2024-02-27 山东五纬数字科技有限公司 一种3d虚拟数字人的交互系统及方法
CN117608410B (zh) * 2024-01-17 2024-05-31 山东五纬数字科技有限公司 一种3d虚拟数字人的交互系统及方法

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1783947A (zh) * 2005-09-30 2006-06-07 王小明 节目自动化制作系统及方法
EP1912175A1 (en) * 2006-10-09 2008-04-16 Muzlach AG System and method for generating a video signal
CN104866101A (zh) * 2015-05-27 2015-08-26 世优(北京)科技有限公司 虚拟对象的实时互动控制方法及装置
CN106789991A (zh) * 2016-12-09 2017-05-31 福建星网视易信息系统有限公司 一种基于虚拟场景的多人互动方法及系统
CN106937154A (zh) * 2017-03-17 2017-07-07 北京蜜枝科技有限公司 处理虚拟形象的方法及装置
CN107438183A (zh) * 2017-07-26 2017-12-05 北京暴风魔镜科技有限公司 一种虚拟人物直播方法、装置及系统
CN107750014A (zh) * 2017-09-25 2018-03-02 迈吉客科技(北京)有限公司 一种连麦直播方法和系统

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101017362B1 (ko) * 2004-01-08 2011-02-28 삼성전자주식회사 다이나믹 영상 재생을 위한 자동 줌 장치 및 방법
US20100201693A1 (en) * 2009-02-11 2010-08-12 Disney Enterprises, Inc. System and method for audience participation event with digital avatars
US8315791B2 (en) * 2010-06-18 2012-11-20 Nokia Coporation Method and apparatus for providing smart zooming of a geographic representation
US8954853B2 (en) * 2012-09-06 2015-02-10 Robotic Research, Llc Method and system for visualization enhancement for situational awareness
US10291725B2 (en) * 2012-11-21 2019-05-14 H4 Engineering, Inc. Automatic cameraman, automatic recording system and automatic recording network
US10931920B2 (en) * 2013-03-14 2021-02-23 Pelco, Inc. Auto-learning smart tours for video surveillance
WO2016145129A1 (en) * 2015-03-09 2016-09-15 Ventana 3D, Llc Avatar control system
US20170034237A1 (en) * 2015-07-28 2017-02-02 Giga Entertainment Media Inc. Interactive Content Streaming Over Live Media Content
CN106412681B (zh) * 2015-07-31 2019-12-24 腾讯科技(深圳)有限公司 弹幕视频直播方法及装置
US9781349B2 (en) * 2016-01-05 2017-10-03 360fly, Inc. Dynamic field of view adjustment for panoramic video content
US10313417B2 (en) * 2016-04-18 2019-06-04 Qualcomm Incorporated Methods and systems for auto-zoom based adaptive video streaming
US20180152736A1 (en) * 2016-11-30 2018-05-31 Harold Glen Alexander Live video recording, streaming, viewing, and storing mobile application, and systems and methods of use thereof
US10967255B2 (en) * 2017-05-26 2021-04-06 Brandon Rosado Virtual reality system for facilitating participation in events



Also Published As

Publication number Publication date
SG11202111640RA (en) 2021-11-29
US20220214797A1 (en) 2022-07-07


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20798789

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20798789

Country of ref document: EP

Kind code of ref document: A1