CN112689151B - Live broadcast method and device, computer equipment and storage medium


Info

Publication number
CN112689151B
CN112689151B (application CN202011416720.8A)
Authority
CN
China
Prior art keywords
live video
live
virtual object
video
specified
Prior art date
Legal status
Active
Application number
CN202011416720.8A
Other languages
Chinese (zh)
Other versions
CN112689151A (en)
Inventor
刘严
陈权
邓生全
Current Assignee
Shenzhen Iwin Visual Technology Co ltd
Original Assignee
Shenzhen Iwin Visual Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Iwin Visual Technology Co ltd
Priority to CN202011416720.8A
Publication of CN112689151A
Application granted
Publication of CN112689151B
Legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Abstract

The application discloses a live broadcast method, a live broadcast device, computer equipment and a storage medium, and belongs to the technical field of live broadcast. The method is applied to a personal computer and comprises the following steps: acquiring a live video shot by a mobile terminal as a first live video through a short-distance communication technology; carrying out target detection and identification on the first live video; if the designated object is identified in the first live video, acquiring a virtual object corresponding to the designated object; overlaying the virtual object to the first live video to obtain a second live video; and sending the second live video to a live server. In this application, the live video is shot by the mobile terminal, which makes mobile shooting and mobile live broadcasting more convenient, while the personal computer's large memory capacity and high computing speed are used to process the live video shot by the mobile terminal, namely to superimpose on the live video a virtual object corresponding to the designated object in that video, thereby enhancing the expressive force of the live broadcast.

Description

Live broadcast method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of live broadcast technologies, and in particular, to a live broadcast method and apparatus, a computer device, and a storage medium.
Background
With the development of science and technology, mobile terminals (such as mobile phones, tablet computers and the like) have more and more functions. For example, the user can use the live broadcast function of the mobile terminal to shoot live broadcast video for other users to watch, thereby enriching the entertainment life of people.
In the related art, the content a user live broadcasts with a mobile terminal is generally limited to the images shot by the mobile terminal, because the limited computing capability of the mobile terminal prevents further processing; the expressive force of such broadcasts is therefore poor.
Disclosure of Invention
The embodiment of the application provides a live broadcast method, a live broadcast device, computer equipment and a storage medium, which make it possible to live broadcast by combining the stronger mobility of a mobile terminal with the stronger processing capability of a personal computer, so that the live content can be richer and the live expressive force can be improved. The technical scheme is as follows:
in a first aspect, a live broadcast method is provided, which is applied to a personal computer, and the method includes:
acquiring a live video shot by a mobile terminal as a first live video through a short-distance communication technology;
carrying out target detection and identification on the first live video;
if a specified object is identified in the first live video, acquiring a virtual object corresponding to the specified object;
overlaying the virtual object to the first live video to obtain a second live video;
and sending the second live video to a live server.
In this application, the live video is shot by the mobile terminal, which makes mobile shooting and mobile live broadcasting more convenient. The personal computer can acquire the live video shot by the mobile terminal through a short-distance communication technology, and the acquired live video is processed by utilizing the advantages of large memory capacity and high operation speed of the personal computer, namely, the virtual object corresponding to the specified object in the live video is superposed in the live video, so that the content of the live video is richer, and the live expressive force is enhanced.
Optionally, the short-range communication technology comprises at least one of bluetooth technology, zigbee technology, wireless fidelity technology, and serial bus technology.
Optionally, the designated object is a designated scene or a designated physical object, and the virtual object is an augmented reality model.
Optionally, the performing target detection and identification on the first live video includes:
inputting each frame of video image in the first live video into a target recognition model, and outputting the position of each detection frame in each frame of video image and the category of a specified object in each detection frame by the target recognition model;
and determining the position of each detection frame as the position of the specified object in each detection frame.
Optionally, if a specified object is identified in the first live video, acquiring a virtual object corresponding to the specified object includes:
if the designated object is identified in the first live video, acquiring a virtual object corresponding to the designated object according to at least one of the category of the designated object and the position of the designated object in the first live video.
Optionally, the overlaying the virtual object to the first live video to obtain a second live video includes:
and according to the position of the specified object in the first live video, overlapping the virtual object to the first live video to obtain a second live video.
Optionally, the overlaying the virtual object to the first live video according to the position of the specified object in the first live video to obtain a second live video includes:
displaying the virtual object and the first live video in an overlapping mode according to the position of the specified object in the first live video;
if an adjusting instruction for the displayed virtual object is detected, adjusting at least one of the size and the position of the virtual object when the virtual object is overlaid with the first live video according to the adjusting instruction;
and if the confirmation instruction is detected, taking the first live video overlaid with the virtual object as a second live video.
In a second aspect, a live broadcasting device is provided, which is applied to a personal computer, and the device includes:
the first acquisition module is used for acquiring a live video shot by the mobile terminal as a first live video through a short-distance communication technology;
the detection and identification module is used for carrying out target detection and identification on the first live video;
the second acquisition module is used for acquiring a virtual object corresponding to a specified object if the specified object is identified in the first live video;
the superposition module is used for superposing the virtual object to the first live video to obtain a second live video;
and the sending module is used for sending the second live broadcast video to a live broadcast server.
Optionally, the short-range communication technology comprises at least one of bluetooth technology, zigbee technology, wireless fidelity technology, serial bus technology.
Optionally, the designated object is a designated scene or a designated physical object, and the virtual object is an augmented reality model.
Optionally, the detection identification module is configured to: inputting each frame of video image in the first live video into a target recognition model, and outputting the position of each detection frame in each frame of video image and the category of a specified object in each detection frame by the target recognition model; and determining the position of each detection frame as the position of the specified object in each detection frame.
Optionally, the apparatus further comprises: a neural network training module to: obtaining a plurality of training samples, wherein each training sample in the plurality of training samples comprises a sample image and a sample label, the sample image comprises a specified object, and the sample label is a category of the specified object contained in the sample image; and training a neural network model by using the plurality of training samples to obtain the target recognition model.
Optionally, the second obtaining module is configured to: if the designated object is identified in the first live video, acquiring a virtual object corresponding to the designated object according to at least one of the category of the designated object and the position of the designated object in the first live video.
Optionally, the superimposing module is configured to: and according to the position of the specified object in the first live video, overlapping the virtual object to the first live video to obtain a second live video.
Optionally, the superimposing module is configured to: displaying the virtual object and the first live video in an overlapping manner according to the position of the specified object in the first live video; if an adjusting instruction for the displayed virtual object is detected, adjusting at least one of the size and the position of the virtual object when the virtual object is overlaid with the first live video according to the adjusting instruction; and if the confirmation instruction is detected, taking the first live video overlaid with the virtual object as a second live video.
In a third aspect, a computer device is provided, the computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program, when executed by the processor, implementing the live broadcast method of the first aspect described above.
In a fourth aspect, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor implements the live broadcasting method of the first aspect described above.
In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the live method of the first aspect as described above.
It is to be understood that, for the beneficial effects of the second aspect, the third aspect, the fourth aspect and the fifth aspect, reference may be made to the description of the first aspect, and details are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic diagram of a live broadcast system provided in an embodiment of the present application;
fig. 2 is a flowchart of a live broadcasting method provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of a live broadcasting device according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present application.
The reference numerals in the figures denote the following:
12. a mobile terminal;
14. a personal computer;
16. a live broadcast server;
301. a first acquisition module;
302. a detection identification module;
303. a second acquisition module;
304. a superposition module;
305. a sending module;
40. a computer device;
41. a memory;
42. a computer program;
43. a processor.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be understood that reference to "a plurality" in this application refers to two or more. In the description of this application, "/" indicates "or"; for example, A/B may indicate A or B. "And/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, for the convenience of clearly describing the technical solutions of the present application, the words "first", "second", and the like are used to distinguish between identical or similar items having substantially the same functions and actions. Those skilled in the art will appreciate that the words "first", "second", and the like do not denote any order or importance; they merely distinguish one item from another.
Before explaining the embodiments of the present application in detail, an application scenario of the embodiments of the present application will be described.
In the related art, when a mobile terminal (such as a mobile phone or a tablet computer) is used for live broadcasting, a live video is usually shot by the mobile terminal and uploaded to a live broadcast server for other users to watch. In this process, the computing capability of the mobile terminal is limited and the mobile terminal cannot further process the live video, so the live content is limited to the images shot by the mobile terminal and the expressive force is poor.
Therefore, the embodiment of the application provides a live broadcast method that combines the strong mobility of a mobile terminal with the strong processing capability of a Personal Computer (PC) for live broadcasting, so that the live content can be richer and the live expressive force can be improved.
The system architecture according to the embodiments of the present application is described below.
Fig. 1 is a schematic diagram of a live broadcast system provided in an embodiment of the present application. Referring to fig. 1, the live system includes a mobile terminal 12, a personal computer 14, and a live server 16.
The mobile terminal 12 may be a terminal with strong mobility, for example, the mobile terminal 12 may be a mobile terminal such as a mobile phone or a tablet computer. A camera may be provided in the mobile terminal 12 for capturing live video.
The personal computer 14 may be a device with relatively strong processing power; for example, the personal computer 14 may be a desktop computer, an all-in-one computer, or a laptop computer. In the embodiment shown in fig. 1, the personal computer 14 is a desktop computer that includes a host, a display, and a keyboard.
The personal computer 14 has greater processing power than the mobile terminal 12; the mobile terminal 12 has greater mobility than the personal computer 14. The mobile terminal 12 and the personal computer 14 can be connected by short-range communication technology, so that the personal computer 14 can obtain live video shot by the mobile terminal 12 through the communication connection.
The short-range communication technology here includes at least one of Bluetooth technology, ZigBee technology, Wi-Fi (Wireless Fidelity) technology, and USB (Universal Serial Bus) technology. Bluetooth, ZigBee and wireless fidelity are short-distance wireless communication technologies; the serial bus is a short-distance wired communication technology. The mobile terminal 12 and the personal computer 14 are communicatively coupled via a short-range communication technology, enabling information transfer between the mobile terminal 12 and the personal computer 14. The information here may include not only video image information but also control information.
The personal computer 14 and the live broadcast server 16 may be communicatively coupled via a wired or wireless network, and the personal computer 14 may transmit live video to the live broadcast server 16 over this connection. After receiving the live video, the live broadcast server 16 may send it to other terminals for the users of those terminals to watch.
The live broadcast method provided in the embodiments of the present application is explained in detail below.
Fig. 2 is a flowchart of a live broadcasting method according to an embodiment of the present application. Referring to fig. 2, the method comprises the steps of:
and S100, the personal computer and the mobile terminal establish communication connection through a short-distance communication technology.
Illustratively, the personal computer and the mobile terminal can both turn on the Bluetooth function, so that the personal computer and the mobile terminal establish a communication connection through the Bluetooth technology. Or, the personal computer and the mobile terminal both start the ZigBee function, so that the personal computer and the mobile terminal establish a communication connection through the ZigBee technology. Or, the personal computer and the mobile terminal both start the wireless fidelity function, so that the personal computer and the mobile terminal establish a communication connection through the wireless fidelity technology. Alternatively, a communication connection is established between the personal computer and the mobile terminal through a serial bus. Of course, the personal computer and the mobile terminal may also establish a communication connection through other short-range communication technologies, which is not limited in this embodiment of the application.
S200, shooting live video by the mobile terminal.
Before shooting live video, the mobile terminal can start the camera shooting function of the mobile terminal. In one possible manner, the mobile terminal may start the camera shooting function of the mobile terminal when detecting the camera shooting start instruction. Or, the mobile terminal may turn on the camera function of the mobile terminal upon receiving a camera turn-on message transmitted by the personal computer. Generally, a personal computer may transmit a camera-on message to a mobile terminal through a WDM (Windows Driver Model).
The camera shooting starting instruction is used for indicating to start the camera shooting function of the mobile terminal. The camera shooting starting instruction can be triggered by a user on the mobile terminal, and the user can trigger the camera shooting starting instruction through operations such as click operation, sliding operation, voice operation, gesture operation and motion sensing operation.
The camera shooting starting message is used for indicating to start the camera shooting function of the mobile terminal. The camera shooting start message may be sent by the personal computer to the mobile terminal when the camera shooting start instruction is detected. The camera shooting starting instruction can be triggered by a user on a personal computer, and the user can trigger the camera shooting starting instruction through operations such as click operation, sliding operation, voice operation, gesture operation and somatosensory operation.
S300, the personal computer acquires the live video shot by the mobile terminal through a short-distance communication technology to serve as a first live video.
In the embodiment of the present application, for convenience of description and distinction, a live video that is taken by a mobile terminal and transmitted to a personal computer is referred to as a first live video. The "first" in the first live video is for distinguishing from the "second" in the second live video described below.
Specifically, after the mobile terminal shoots the live video, the shot live video can be directly transmitted to the personal computer through the short-distance communication technology. In this case, the personal computer also acquires the live video shot by the mobile terminal through the short-distance communication technology, and the acquired live video may be referred to as a first live video.
Optionally, if the operating system of the mobile terminal is an Android system, the mobile terminal may send the live video to the personal computer through the Android Debug Bridge (adb). If the operating system of the mobile terminal is an iOS system, the live video can be sent to the personal computer through the AirPlay wireless technology.
It is worth noting that in the embodiment of the application, the personal computer acquires the live video shot by the mobile terminal through the short-distance communication technology, which is equivalent to the personal computer using the camera of the mobile terminal as one of its own virtual cameras for shooting the live video. The personal computer can then use the camera of the mobile terminal in the same way it would use an ordinary camera of its own; that is, the personal computer can instruct the mobile terminal to shoot the live video.
S400, the personal computer carries out target detection and identification on the first live video.
Target detection includes detecting a position of a specified object in a first live video. The target recognition includes recognizing a type of a specified object in the first live video.
Specifically, the operation of step S400 may be: the personal computer inputs each frame of video image in the first live video into a target recognition model, and the target recognition model outputs the position of each detection frame in each frame of video image and the category of the specified object in each detection frame; the position of each detection frame is determined as the position of the specified object within this detection frame.
The target recognition model may be a pre-trained model that can determine a detection frame in which a specific object appearing in the video image is located, and can determine a category of the detected specific object. That is, the target recognition model is used for performing target detection and recognition on the video image.
The detection frame is used for indicating an area in the video image where the specified object exists, and the specified object is completely located in the detection frame. The position of a certain detection frame can thus be determined as the position of the specified object within this detection frame. In general, the detection frame may be rectangular.
The designated object may be some object designated in advance; for example, the designated object may be a designated scene or a designated physical object. The category of the designated scene here may be sky or grassland, or day or night. The category of the designated physical object here may be a person, a utility pole, an automobile, or the like.
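As an illustration, the per-frame detection and recognition described above can be sketched as follows. This is a minimal sketch, assuming an OpenCV video source and a recognition model exposing a hypothetical predict(image) method returning (box, category) pairs; neither name comes from the patent:

```python
import cv2  # OpenCV, assumed available for reading video frames


def detect_and_recognize(video_path, recognition_model):
    """Run the target recognition model on every frame of the first live video.

    `recognition_model.predict(frame)` is an assumed interface returning a
    list of (box, category) pairs, where `box` is (x, y, w, h).
    """
    per_frame_results = []
    capture = cv2.VideoCapture(video_path)
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        detections = recognition_model.predict(frame)
        # Each detection box fully encloses one specified object, so the box
        # position is taken as the position of that object in the frame.
        per_frame_results.append(detections)
    capture.release()
    return per_frame_results
```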
Further, before the personal computer inputs each frame of video image in the first live video into the target recognition model, the personal computer may also train to obtain the target recognition model. Specifically, the personal computer may obtain a plurality of training samples, and train the neural network model using the plurality of training samples to obtain the target recognition model.
The plurality of training samples may be preset. Each training sample in the plurality of training samples comprises a sample image and a sample label, wherein the sample image comprises a specified object, and the sample label is a category of the specified object contained in the sample image. That is, the input data in each of the plurality of training samples is a sample image containing a specific object, and the sample is labeled as a class of the specific object.
The neural network model may include a plurality of network layers including an input layer, a plurality of hidden layers, and an output layer. The input layer is responsible for receiving input data; the output layer is responsible for outputting the processed data; the plurality of hidden layers are positioned between the input layer and the output layer and are responsible for processing data, and the plurality of hidden layers are invisible to the outside. For example, the neural network model may be a deep neural network or the like, and may be a convolutional neural network or the like in the deep neural network.
When the personal computer uses a plurality of training samples to train the neural network model, for each training sample in the plurality of training samples, the input data in the training sample can be input into the neural network model to obtain output data; determining a loss value between the output data and a sample marker in the training sample by a loss function; and adjusting parameters in the neural network model according to the loss value. After parameters in the neural network model are adjusted by using each training sample in the plurality of training samples, the neural network model with the adjusted parameters is the target recognition model.
The operation of the personal computer to adjust the parameters in the neural network model according to the loss value may refer to the related art, which is not described in detail in this embodiment.
For example, the personal computer may adjust any parameter in the neural network model by the formula ŵ = w - α·dw, where ŵ is the adjusted parameter, w is the parameter before adjustment, α is the learning rate, and dw is the partial derivative of the loss function with respect to w, which can be obtained from the loss value. α may be preset; for example, α may be 0.001 or 0.000001, which is not limited in this embodiment of the application.
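A minimal sketch of such a training loop, assuming a tf.keras classification model (TensorFlow is among the machine learning frameworks named in this embodiment below); the framework's stochastic gradient descent optimizer applies exactly the update ŵ = w - α·dw described above:

```python
import tensorflow as tf


def train_recognition_model(model, training_samples, learning_rate=0.001):
    """Minimal supervised training loop over (sample_image, sample_label) pairs.

    `model` is assumed to map an image batch to per-category probabilities.
    """
    optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
    for sample_image, sample_label in training_samples:
        with tf.GradientTape() as tape:
            output = model(sample_image[tf.newaxis, ...], training=True)
            # Loss value between the output data and the sample label.
            loss = loss_fn([sample_label], output)
        # Adjust every parameter against the loss value: w <- w - alpha * dw.
        gradients = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return model
```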
It is noted that the personal computer may also update the target recognition model online while using it. Specifically, after a certain frame of video image in the first live video is input into the target recognition model, if the model outputs the position of a detection frame in that video image and the category of the specified object in the detection frame, the video image contains the specified object. That video image may then be used as the sample image of a new training sample, with the category of the specified object it contains as the sample label, and the training sample may be used to train the target recognition model, thereby implementing the online update of the target recognition model.
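The online update can reuse the same training step; a sketch, building on the hypothetical train_recognition_model helper above:

```python
def online_update(model, frame, detections):
    """Online update: a frame in which specified objects were just recognized
    becomes a new training sample, labelled with the predicted categories.

    Reuses the hypothetical `train_recognition_model` sketch above.
    """
    new_samples = [(frame, category) for _box, category in detections]
    if new_samples:
        train_recognition_model(model, new_samples)
    return model
```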
In the embodiment of the application, the target recognition model is a deep-learning neural network model: the more training samples such a network is trained on, the more accurate its recognition results. The personal computer can therefore continuously update the target recognition model online using video images of live videos shot in the same scene, so the live broadcast experience in that scene keeps improving, which is beneficial to long-term live broadcasting.
In an embodiment of the application, a personal computer may train the neural network model based on a machine learning framework and use the trained target recognition model. The machine learning framework may be, for example, TensorFlow, Caffe, Apache SINGA, or the like, which is not limited in this embodiment.
S500, if the personal computer identifies the specified object in the first live video, the personal computer acquires a virtual object corresponding to the specified object.
The virtual object may be an augmented reality model, such as a picture, a video, a three-dimensional model, and the like, and of course, the virtual object may also be other types of virtual objects, which is not limited in this embodiment of the present application. Different designated objects may correspond to different virtual objects. A plurality of virtual objects, each of which is a virtual object that can be displayed in superimposition with a video image, may be stored in advance in the personal computer.
Specifically, the operation of step S500 may be: and if the personal computer identifies the specified object in the first live video, acquiring a virtual object corresponding to the specified object according to at least one of the type of the specified object and the position of the specified object in the first live video.
As an example, a personal computer may store a correspondence relationship between a designated object category and a virtual object in advance, and the personal computer may acquire a corresponding virtual object from the correspondence relationship according to the category of the designated object identified in the first live video.
For example, when the designated object is a designated scene, such as the sky, the virtual object corresponding to the designated object may be an augmented reality model in the form of an airplane or a hot air balloon. When the designated object is a designated physical object, such as a person, the virtual object corresponding to the designated object may be an augmented reality model in the form of wings.
As another example, a personal computer may store a correspondence relationship between the position of the designated object and the virtual object in advance, and the personal computer may acquire the corresponding virtual object from the correspondence relationship according to the position of the designated object identified in the first live video.
For example, when the designated object is in the middle of the first live video, the virtual object may be an augmented reality model in the form of a crown. If the specified object is at an edge position of the first live video, the virtual object may be an augmented reality model in the shape of a petal.
As still another example, a correspondence relationship between a category of the designated object, a position of the designated object, and the virtual object may be stored in advance in the personal computer, and the personal computer may acquire the corresponding virtual object from the correspondence relationship according to the category of the designated object identified in the first live video and the position of the designated object in the first live video.
For example, when the designated object is a designated physical object, such as a car: if the car is in the middle of the first live video, the virtual object corresponding to the designated object may be an augmented reality model in the form of white wings; if the car is at an edge position of the first live video, the virtual object corresponding to the designated object may be an augmented reality model in the form of gold wings.
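The three correspondence variants above amount to a table lookup keyed on category, position, or both. A sketch follows, with a hypothetical mapping and a coarse middle/edge position rule; the table entries and thresholds are illustrative, not from the patent:

```python
# Hypothetical correspondence table; the real mapping is stored in advance
# on the personal computer and is not specified by the patent.
VIRTUAL_OBJECTS = {
    ("person", "middle"): "crown_model",
    ("person", "edge"): "petal_model",
    ("car", "middle"): "white_wings_model",
    ("car", "edge"): "gold_wings_model",
}


def region_of(box, frame_w, frame_h):
    """Coarsely classify a detection box as being in the middle or at an edge."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    in_middle = (frame_w / 4 < cx < 3 * frame_w / 4 and
                 frame_h / 4 < cy < 3 * frame_h / 4)
    return "middle" if in_middle else "edge"


def lookup_virtual_object(category, box, frame_w, frame_h):
    # Key on (category, position); a category-only or position-only table
    # works the same way with a simpler key.
    return VIRTUAL_OBJECTS.get((category, region_of(box, frame_w, frame_h)))
```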
S600, the personal computer overlays the virtual object to the first live video to obtain a second live video.
Overlaying the virtual object onto the first live video means covering the virtual object over the first live video. When the virtual object is opaque, the part of the first live video covered by the virtual object does not appear in the second live video. When the virtual object has a certain transparency, the part of the first live video covered by the virtual object appears in the second live video together with the virtual object, and the relative color intensity of the covered part and of the virtual object is determined by the transparency of the virtual object.
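Both the opaque and the semi-transparent case reduce to standard alpha compositing; a sketch with NumPy, assuming an RGBA virtual object that fits entirely inside the frame:

```python
import numpy as np


def overlay_virtual_object(frame, virtual_rgba, x, y):
    """Alpha-blend an RGBA virtual object onto a video frame at (x, y).

    alpha == 1 reproduces the opaque case (the covered part of the first
    live video disappears); 0 < alpha < 1 lets both show through, weighted
    by the virtual object's transparency.
    """
    h, w = virtual_rgba.shape[:2]
    region = frame[y:y + h, x:x + w].astype(np.float32)
    colors = virtual_rgba[..., :3].astype(np.float32)
    alpha = virtual_rgba[..., 3:4].astype(np.float32) / 255.0
    blended = alpha * colors + (1.0 - alpha) * region
    frame[y:y + h, x:x + w] = blended.astype(np.uint8)
    return frame
```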
And overlaying the virtual object to the first live video, so that the augmented reality processing of the first live video can be completed. In other words, the first live video is a live video captured by the mobile terminal, which is acquired by the personal computer but is not processed. The second live video is obtained after the personal computer performs augmented reality processing on the first live video.
Specifically, the operation of step S600 may be: and the personal computer overlays the virtual object to the first live video according to the position of the specified object in the first live video to obtain a second live video.
When the personal computer overlays the virtual object on the first live video, the position of the virtual object needs to be obtained according to the position of the specified object in the first live video.
In a possible mode, the personal computer can automatically superimpose the virtual object on the first live video according to the position of the specified object in the first live video to obtain the second live video.
In this manner, the superimposition position of the virtual object in the first live video may be automatically determined based on the position of the specified object in the first live video. For example, the superimposition position of the virtual object may be the position of the specified object itself, or a position beside the specified object in the first live video.
In another possible mode, the personal computer can display the virtual object and the first live video in an overlapping mode according to the position of the specified object in the first live video; if an adjusting instruction aiming at the displayed virtual object is detected, adjusting at least one of the size and the position of the virtual object when the virtual object is overlapped with the first live video according to the adjusting instruction; and if the confirmation instruction is detected, taking the first live video overlaid with the virtual object as a second live video.
In the embodiment of the present application, the personal computer may include not only a host computer for transmitting, receiving and processing information, but also a display connected to the host computer for displaying images, and a keyboard/mouse connected to the host computer for inputting instructions.
After the personal computer detects the designated object in the first live video and acquires the virtual object corresponding to the designated object, the personal computer may overlay the virtual object and the first live video according to the position of the designated object in the first live video, and display the overlaid picture through a display connected to the host. For example, the host may transmit the superimposed picture to a display for display through an HDMI (High Definition Multimedia Interface).
Specifically, the personal computer may detect, through an input device such as a keyboard or a mouse, an adjustment instruction for adjusting the size, position, or the like of the displayed virtual object. The adjustment instruction may be triggered by a user on the personal computer through an operation such as a click operation, a slide operation, a voice operation, a gesture operation, or a somatosensory operation.
For example, suppose the designated object in the first live video is a person and the corresponding virtual object is an augmented reality model in the form of wings. When the virtual object is displayed superimposed on the first live video, the personal computer can adjust the size and/or position of the virtual object (the wing-shaped augmented reality model) after receiving an adjustment instruction through the input device, for example enlarging the wings, shrinking the wings, and/or adjusting the position of the wings relative to the person.
After the personal computer adjusts the displayed virtual object, if a confirmation instruction is detected, it is determined that the adjustment of the displayed virtual object has been completed, that is, a live video satisfying the user requirement has been obtained, so that the currently displayed first live video on which the virtual object is superimposed can be used as the second live video. The confirmation instruction may be triggered by a user on a personal computer, and the user may trigger the confirmation instruction by an operation such as a click operation, a slide operation, a voice operation, a gesture operation, or a motion sensing operation.
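A sketch of this adjust-then-confirm flow, assuming a hypothetical get_event callback that turns the keyboard/mouse input into adjustment and confirmation events (redisplay between adjustments is omitted; reuses the overlay_virtual_object sketch above):

```python
import cv2


def interactive_confirm(frame, virtual_rgba, x, y, get_event):
    """Let the operator resize/move the virtual object before streaming.

    `get_event` is a hypothetical callback returning either
    ('adjust', dx, dy, scale_factor) or ('confirm',) events.
    """
    scale = 1.0
    while True:
        event = get_event()
        if event[0] == "confirm":
            break
        _, dx, dy, scale_factor = event
        x, y, scale = x + dx, y + dy, scale * scale_factor
    # Apply the final size and position; the result is a frame of the
    # second live video.
    resized = cv2.resize(virtual_rgba, None, fx=scale, fy=scale)
    return overlay_virtual_object(frame, resized, int(x), int(y))
```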
S700, the personal computer sends the second live broadcast video to a live broadcast server.
After obtaining the second live video, the personal computer sends it to the live broadcast server. After receiving the second live video, the live broadcast server can send it to other terminals for the users of those terminals to watch.
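Live broadcast servers commonly ingest streams over RTMP; a sketch of this sending step pipes processed frames to an ffmpeg process. This is an assumption for illustration: ffmpeg is assumed installed, the URL is a placeholder, and the patent does not name the transport protocol between the personal computer and the live server.

```python
import subprocess


def open_live_push(width, height, fps, url):
    """Start an ffmpeg process that pushes raw frames to the live server."""
    command = [
        "ffmpeg",
        "-f", "rawvideo", "-pix_fmt", "bgr24",
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",                      # raw second-live-video frames on stdin
        "-c:v", "libx264", "-f", "flv", url,
    ]
    return subprocess.Popen(command, stdin=subprocess.PIPE)


# Usage sketch: push each processed frame as it is produced.
# pusher = open_live_push(1280, 720, 30, "rtmp://example.com/live/stream")
# pusher.stdin.write(second_video_frame.tobytes())
```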
In the embodiment of the application, the live video is shot by the mobile terminal, which makes mobile shooting and mobile live broadcasting more convenient and takes advantage of the frequent updates and rapid development of mobile-terminal lens technology and shooting algorithms, thereby improving live video shooting quality. The personal computer can acquire the live video shot by the mobile terminal through a short-distance communication technology, and the acquired live video is processed by utilizing the advantages of large memory capacity and high operation speed of the personal computer, namely, a virtual object corresponding to a specified object in the live video is superposed in the live video so as to enhance the live expressive force.
Moreover, the live broadcast method provided by the embodiment of the application can be implemented with a mobile terminal and a personal computer; software upgrades and function extensions are very convenient, and more application scenarios are compatible. For example, a video conference system can be accessed in a conventional live broadcast scenario.
In addition, in the embodiment of the application, the personal computer can detect and identify the designated object through the target recognition model and automatically call the corresponding virtual object for superposition, which reduces the degree to which the live broadcast process depends on staff and lightens their workload.
Fig. 3 is a schematic structural diagram of a live broadcast apparatus according to an embodiment of the present application. The apparatus is applied to a personal computer which may be the personal computer 14 described above in the embodiment of fig. 1. Referring to fig. 3, the apparatus may include: a first acquisition module 301, a detection identification module 302, a second acquisition module 303, a superposition module 304 and a sending module 305.
A first obtaining module 301, configured to obtain a live video captured by the mobile terminal 12 as a first live video through a short-distance communication technology;
a detection and identification module 302, configured to perform target detection and identification on a first live video;
a second obtaining module 303, configured to, if the specified object is identified in the first live video, obtain a virtual object corresponding to the specified object;
the overlaying module 304 is configured to overlay the virtual object to the first live video to obtain a second live video;
a sending module 305, configured to send the second live video to the live server.
Optionally, the short-range communication technology comprises at least one of bluetooth technology, zigbee technology, wireless fidelity technology, serial bus technology.
Optionally, the designated object is a designated scene or a designated physical object, and the virtual object is an augmented reality model.
Optionally, the detection and identification module 302 is configured to:
inputting each frame of video image in the first live video into a target recognition model, and outputting, by the target recognition model, the position of each detection frame in each frame of video image and the category of the specified object in each detection frame; the position of each detection frame is determined as the position of the specified object within each detection frame.
Optionally, the apparatus further comprises: the neural network training module is used for:
the method comprises the steps of obtaining a plurality of training samples, wherein each training sample in the plurality of training samples comprises a sample image and a sample mark, the sample image comprises a specified object, and the sample mark is the category of the specified object contained in the sample image; and training the neural network model by using a plurality of training samples to obtain a target recognition model.
Optionally, the second obtaining module 303 is configured to:
and if the specified object is identified in the first live video, acquiring a virtual object corresponding to the specified object according to at least one of the type of the specified object and the position of the specified object in the first live video.
Optionally, the superposition module 304 is configured to:
and according to the position of the specified object in the first live video, overlapping the virtual object to the first live video to obtain a second live video.
Optionally, the superposition module 304 is configured to:
according to the position of the designated object in the first live video, the virtual object and the first live video are displayed in an overlapping mode; if an adjusting instruction aiming at the displayed virtual object is detected, adjusting at least one of the size and the position of the virtual object when the virtual object is overlapped with the first live video according to the adjusting instruction; and if the confirmation instruction is detected, taking the first live video overlaid with the virtual object as a second live video.
In the embodiment of the application, the live video is shot by the mobile terminal, which makes mobile shooting and mobile live broadcasting more convenient and takes advantage of the frequent updates and rapid development of mobile-terminal lens technology and shooting algorithms, thereby improving live video shooting quality. The personal computer can acquire the live video shot by the mobile terminal through a short-distance communication technology, and the acquired live video is processed by utilizing the advantages of large memory capacity and high operation speed of the personal computer, namely, a virtual object corresponding to a specified object in the live video is superposed in the live video so as to enhance the expressive force of the live broadcast.
It should be noted that: in the live broadcasting device provided in the foregoing embodiment, only the division of the functional modules is exemplified when live broadcasting is performed, and in practical applications, the function distribution may be completed by different functional modules as needed, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
Each functional unit and module in the above embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used to limit the protection scope of the embodiments of the present application.
The live broadcasting device and the live broadcasting method provided by the above embodiments belong to the same concept, and the specific working processes of the units and modules and the technical effects brought by the units and modules in the above embodiments can be referred to the method embodiment section, and are not described herein again.
Fig. 4 is a schematic structural diagram of a computer device 40 according to an embodiment of the present application. As shown in fig. 4, the computer device 40 includes: a processor 43, a memory 41 and a computer program 42 stored in the memory 41 and executable on the processor 43, the steps in the live method in the above embodiments being implemented when the processor 43 executes the computer program 42.
Computer device 40 may be a general-purpose computer device or a special-purpose computer device. In a specific implementation, the computer device 40 may be a desktop computer or a notebook computer, and the embodiment of the present application does not limit the type of the computer device 40. Those skilled in the art will appreciate that the figure is merely an example of the computer device 40 and does not constitute a limitation on it; the computer device may include more or fewer components than shown, combine certain components, or have different components, such as input/output devices and network access devices.
The Processor 43 may be a Central Processing Unit (CPU), and the Processor 43 may also be other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or any conventional processor.
The memory 41 may be an internal storage unit of the computer device 40 in some embodiments, such as a hard disk or a memory of the computer device 40. The memory 41 may also be an external storage device of the computer device 40 in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device 40. Further, the memory 41 may also include both an internal storage unit and an external storage device of the computer device 40. The memory 41 is used for storing an operating system, an application program, a boot loader (BootLoader), data, and other programs, such as program codes of a computer program. The memory 41 may also be used to temporarily store data that has been output or is to be output.
An embodiment of the present application further provides a computer device, where the computer device includes: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, the processor implementing the steps of any of the various method embodiments described above when executing the computer program.
The embodiments of the present application also provide a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments.
The embodiments of the present application provide a computer program product, which when run on a computer causes the computer to perform the steps of the above-described method embodiments.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the processes in the above method embodiments may be implemented by a computer program, which may be stored in a computer readable storage medium and used by a processor to implement the steps of the above method embodiments. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include at least: any entity or apparatus capable of carrying computer program code to a photographing apparatus/terminal device, a recording medium, computer Memory, ROM (Read-Only Memory), RAM (Random Access Memory), CD-ROM (Compact Disc Read-Only Memory), magnetic tape, floppy disk, optical data storage device, etc. The computer-readable storage medium referred to herein may be a non-volatile storage medium, in other words, a non-transitory storage medium.
It should be understood that all or part of the steps for implementing the above embodiments may be implemented by software, hardware, firmware or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The computer instructions may be stored in the computer-readable storage medium described above.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/computer device and method may be implemented in other ways. For example, the apparatus/computer device embodiments described above are merely illustrative: the division into modules or units is merely a logical function division, and there may be other divisions in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the embodiments of the present application, and they should be construed as being included in the present application.

Claims (9)

1. A live broadcast method, applied to a personal computer, the method comprising:
acquiring a live video shot by a mobile terminal through a short-distance communication technology as a first live video;
carrying out target detection and identification on the first live video;
if the designated object is identified in the first live video, acquiring a virtual object corresponding to the designated object; the specified object comprises: a specified physical object;
overlaying the virtual object to the first live video to obtain a second live video;
sending the second live video to a live server;
if the designated object is identified in the first live video, acquiring a virtual object corresponding to the designated object, including:
if a specified object is identified in the first live video, acquiring a virtual object corresponding to the specified object according to the position of the specified object in the first live video, wherein the position of the specified object in the first live video comprises a middle position or an edge position of the first live video.
2. The method of claim 1, wherein the short-range communication technology comprises at least one of bluetooth technology, zigbee technology, wireless fidelity technology, serial bus technology.
3. The method of claim 1, wherein the designated object is a designated physical object and the virtual object is an augmented reality model.
4. The method of claim 1, wherein the target detecting and identifying the first live video comprises:
inputting each frame of video image in the first live video into a target recognition model, and outputting the position of each detection frame in each frame of video image and the category of a specified object in each detection frame by the target recognition model;
and determining the position of each detection frame as the position of the specified object in each detection frame.
5. The method of any of claims 1-4, wherein said overlaying the virtual object to the first live video to obtain a second live video comprises:
and according to the position of the specified object in the first live video, overlapping the virtual object to the first live video to obtain a second live video.
6. The method of claim 5, wherein overlaying the virtual object to the first live video according to the position of the specified object in the first live video to obtain a second live video comprises:
displaying the virtual object and the first live video in an overlapping manner according to the position of the specified object in the first live video;
if an adjusting instruction aiming at the displayed virtual object is detected, adjusting at least one of the size and the position of the virtual object when the virtual object is overlapped with the first live video according to the adjusting instruction;
and if the confirmation instruction is detected, taking the first live video overlaid with the virtual object as a second live video.
7. A live broadcasting apparatus, applied to a personal computer, the apparatus comprising:
the first acquisition module is used for acquiring, through a short-distance communication technology, a live video shot by a mobile terminal as a first live video;
the detection and identification module is used for performing target detection and identification on the first live video;
the second acquisition module is used for acquiring a virtual object corresponding to a specified object if the specified object is identified in the first live video, wherein the specified object comprises a specified physical object;
the superposition module is used for overlaying the virtual object onto the first live video to obtain a second live video;
the sending module is used for sending the second live video to a live server;
wherein the second acquisition module is specifically used for: if the specified object is identified in the first live video, acquiring the virtual object corresponding to the specified object according to the position of the specified object in the first live video, wherein the position of the specified object in the first live video comprises a middle position or an edge position of the first live video.
8. A computer device, characterized in that the computer device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the method according to any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 6.
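
The three short Python sketches below are illustrative only and are not the patented implementation. This first one walks the end-to-end flow of claims 1, 4 and 5 under stated assumptions: the mobile terminal's video is assumed to reach the personal computer as capture device 0 (for example over a USB link), an OpenCV Haar-cascade face detector stands in for the unspecified target recognition model, and a hypothetical image file overlay.png stands in for the augmented reality model.

import cv2

# Stand-in for the target recognition model of claim 4 (assumption: the real
# model's architecture and object categories are not specified in the patent).
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
overlay = cv2.imread("overlay.png")      # hypothetical virtual-object image

cap = cv2.VideoCapture(0)                # first live video from the mobile terminal
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Each detection box gives the position of a specified object (claim 4);
    # the virtual object is overlaid at that position (claim 5).
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.1, 5):
        frame[y:y + h, x:x + w] = cv2.resize(overlay, (w, h))
    cv2.imshow("second live video", frame)
    if cv2.waitKey(1) == 27:             # Esc stops the sketch
        break
cap.release()
cv2.destroyAllWindows()

Displaying the composited frame is a placeholder for the server push, which the third sketch covers.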
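
The adjust-and-confirm interaction of claim 6 could look like the following; the key bindings, step sizes, and file names (first_live_frame.png, overlay.png, second_live_frame.png) are assumptions for illustration, not details from the patent.

import cv2

frame = cv2.imread("first_live_frame.png")   # hypothetical frame of the first live video
overlay = cv2.imread("overlay.png")          # hypothetical virtual-object image
x, y, scale = 100, 100, 1.0                  # starting position, e.g. from detection

while True:
    oh, ow = overlay.shape[:2]
    ow = min(max(1, int(ow * scale)), frame.shape[1])
    oh = min(max(1, int(oh * scale)), frame.shape[0])
    # Clamp so the pasted patch always stays inside the frame.
    x = max(0, min(x, frame.shape[1] - ow))
    y = max(0, min(y, frame.shape[0] - oh))
    composite = frame.copy()
    composite[y:y + oh, x:x + ow] = cv2.resize(overlay, (ow, oh))
    cv2.imshow("adjust virtual object", composite)
    key = cv2.waitKey(0)
    if key == ord('w'):   y -= 5             # adjustment instructions: move...
    elif key == ord('s'): y += 5
    elif key == ord('a'): x -= 5
    elif key == ord('d'): x += 5
    elif key == ord('+'): scale *= 1.1       # ...or resize the virtual object
    elif key == ord('-'): scale /= 1.1
    elif key == 13:                          # Enter acts as the confirmation instruction
        cv2.imwrite("second_live_frame.png", composite)
        break
cv2.destroyAllWindows()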
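
Finally, "sending the second live video to a live server" is commonly realized by piping encoded frames to an RTMP ingest endpoint; the patent does not name a transport. The sketch below assumes a locally installed ffmpeg binary and a hypothetical RTMP_URL, and send_frame would be called once per composited frame from the overlay loop above.

import subprocess

WIDTH, HEIGHT, FPS = 1280, 720, 30                 # illustrative frame geometry
RTMP_URL = "rtmp://live.example.com/app/streamkey" # hypothetical ingest endpoint

# Raw BGR frames are piped into ffmpeg, which encodes them to H.264 and
# pushes an FLV stream to the live server over RTMP.
ffmpeg = subprocess.Popen([
    "ffmpeg", "-y",
    "-f", "rawvideo", "-pix_fmt", "bgr24",
    "-s", f"{WIDTH}x{HEIGHT}", "-r", str(FPS),
    "-i", "-",                                     # frames arrive on stdin
    "-c:v", "libx264", "-preset", "veryfast",
    "-f", "flv", RTMP_URL,
], stdin=subprocess.PIPE)

def send_frame(frame_bgr):
    """Push one composited frame of the second live video to the server."""
    ffmpeg.stdin.write(frame_bgr.tobytes())
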
CN202011416720.8A 2020-12-07 2020-12-07 Live broadcast method and device, computer equipment and storage medium Active CN112689151B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011416720.8A CN112689151B (en) 2020-12-07 2020-12-07 Live broadcast method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112689151A CN112689151A (en) 2021-04-20
CN112689151B (en) 2023-04-18

Family

ID=75447452

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011416720.8A Active CN112689151B (en) 2020-12-07 2020-12-07 Live broadcast method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112689151B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116456121B (en) * 2023-03-02 2023-10-31 广东互视达电子科技有限公司 Multifunctional live broadcast machine
CN117255212B (en) * 2023-11-20 2024-01-26 北京泰伯科技有限公司 Remote emergency live broadcast control method and related equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019227905A1 (en) * 2018-05-29 2019-12-05 亮风台(上海)信息科技有限公司 Method and equipment for performing remote assistance on the basis of augmented reality
CN110850983A (en) * 2019-11-13 2020-02-28 腾讯科技(深圳)有限公司 Virtual object control method and device in video live broadcast and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109963163A (en) * 2017-12-26 2019-07-02 阿里巴巴集团控股有限公司 Internet video live broadcasting method, device and electronic equipment
US10924796B2 (en) * 2018-04-10 2021-02-16 Logitech Europe S.A. System and methods for interactive filters in live streaming media
CN108924641A (en) * 2018-07-16 2018-11-30 北京达佳互联信息技术有限公司 Live broadcasting method, device and computer equipment and storage medium
CN109271553A (en) * 2018-08-31 2019-01-25 乐蜜有限公司 A kind of virtual image video broadcasting method, device, electronic equipment and storage medium
CN110234015A (en) * 2019-05-15 2019-09-13 广州视源电子科技股份有限公司 Live-broadcast control method, device, storage medium, terminal
CN110784733B (en) * 2019-11-07 2021-06-25 广州虎牙科技有限公司 Live broadcast data processing method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Live streaming methods, devices, computer equipment, and storage media

Effective date of registration: 20231201

Granted publication date: 20230418

Pledgee: Shenzhen high tech investment and financing Company limited by guarantee

Pledgor: SHENZHEN IWIN VISUAL TECHNOLOGY Co.,Ltd.

Registration number: Y2023980068888