CN106791921B - Processing method and device for live video and storage medium - Google Patents

Info

Publication number
CN106791921B
Authority
CN
China
Prior art keywords
preset
video
live
broadcast
live broadcast
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611132307.2A
Other languages
Chinese (zh)
Other versions
CN106791921A (en
Inventor
张勋
汤晓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201611132307.2A priority Critical patent/CN106791921B/en
Publication of CN106791921A publication Critical patent/CN106791921A/en
Application granted granted Critical
Publication of CN106791921B publication Critical patent/CN106791921B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60: Network streaming of media packets
    • H04L65/75: Media network packet handling
    • H04L65/762: Media network packet handling at the source
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80: Responding to QoS
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/241: Operating system [OS] processes, e.g. server setup
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25: Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262: Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26208: Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists; the scheduling operation being performed under constraints

Abstract

The disclosure relates to a processing method and device for live video. The method comprises: acquiring current voice information of the anchor during a live video broadcast; recognizing the current voice information; and, when a preset keyword exists in the current voice information, executing the operation corresponding to the preset keyword. Through speech recognition, this technical scheme makes live video more intelligent and convenient: the operation the anchor intends is executed automatically, so the anchor does not have to perform it manually. This reduces the anchor's manual operation burden and also avoids the troubles caused when the anchor forgets to perform an operation manually.

Description

Processing method and device for live video and storage medium
Technical Field
The disclosure relates to the technical field of live video broadcasting, and in particular to a processing method and device for live video.
Background
Currently, during a live video broadcast, the anchor often needs to perform many operations manually, such as following other anchors, sending a red envelope to a viewer, or manually closing the live video. This increases the anchor's operation burden, and forgetting to perform an operation manually may cause serious trouble. For example, if the anchor forgets to close the live video, privacy may be leaked.
Disclosure of Invention
The embodiment of the disclosure provides a video live broadcast processing method and device. The technical scheme is as follows:
according to a first aspect of the embodiments of the present disclosure, a method for processing a live video is provided, which includes:
during a live video broadcast, acquiring current voice information of the anchor;
recognizing the current voice information;
and when the current voice information has a preset keyword, executing operation corresponding to the preset keyword.
In one embodiment, the preset keywords include: a preset video live broadcast ending word;
when a preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset live video end word exists in the current voice information, displaying a countdown for closing the live video;
and when the countdown is finished, closing the live video.
In one embodiment, prior to displaying the countdown to closing the live video, the method further comprises:
acquiring a time difference between the time when the preset keyword exists in the current voice information every time and the time when the anchor manually closes the video live broadcast in a historical time period;
and determining the countdown according to the time difference.
In one embodiment, the preset keywords include: a preset video live broadcast ending word; when a preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset live video broadcast end word exists in the current voice information, displaying prompt information for ending the live video broadcast, and judging whether feature information of the anchor broadcast exists in a current video picture of the live video broadcast;
and when the characteristic information of the anchor does not exist, closing the live video.
In one embodiment, the identifying the current speech information includes:
when the preset keyword comprises a preset live video end word, judging whether the current live broadcast information of the live video is matched with corresponding preset live broadcast information, wherein the corresponding preset live broadcast information comprises: at least one of preset live broadcast time, preset live broadcast duration and preset live broadcast video pictures;
and when the current live broadcast information is matched with the corresponding preset live broadcast information, entering a step of identifying the current voice information.
In one embodiment, when the preset keyword exists in the current voice message, the executing an operation corresponding to the preset keyword includes:
when the preset keywords comprise other preset video live broadcast keywords, executing operations corresponding to the other video live broadcast keywords, wherein the operations corresponding to the other video live broadcast keywords comprise:
at least one of sending a bullet comment (barrage), sending a red envelope, following another anchor, and starting a video link with a viewer.
According to a second aspect of the embodiments of the present disclosure, there is provided a processing apparatus for live video, including:
the voice acquisition module is used for acquiring the current voice information of the anchor during a live video broadcast;
the processing module is used for identifying the current voice information;
and the execution module is used for executing the operation corresponding to the preset keyword when the preset keyword exists in the current voice message.
In one embodiment, the preset keywords include: a preset video live broadcast ending word;
the execution module comprises:
the display sub-module is used for displaying countdown for closing the live video when the preset live video end word exists in the current voice information;
and the live broadcast processing submodule is used for closing the live broadcast of the video when the countdown is finished.
In one embodiment, the apparatus further comprises:
the time acquisition module is used for acquiring, before the countdown for closing the live video is displayed, a time difference between each time the preset keyword exists in the current voice information and the time when the anchor manually closes the live video in a historical time period;
and the determining module is used for determining the countdown according to the time difference.
In one embodiment, the preset keywords include: a preset video live broadcast ending word;
the execution module comprises:
the characteristic processing submodule is used for displaying prompt information for finishing the live video broadcast when the preset live video broadcast finishing word exists in the current voice information, and judging whether characteristic information of the anchor exists in a current video picture of the live video broadcast or not;
and the characteristic response submodule is used for closing the live video broadcast when the characteristic information of the anchor does not exist.
In one embodiment, the processing module comprises:
the judgment submodule is used for judging whether the current live broadcast information of the live video is matched with corresponding preset live broadcast information when the preset keywords comprise preset live broadcast end words, wherein the corresponding preset live broadcast information comprises: at least one of preset live broadcast time, preset live broadcast duration and preset live broadcast video pictures;
and the recognition submodule is used for starting to recognize the current voice information when the current live broadcast information is matched with the corresponding preset live broadcast information.
In one embodiment, the execution module comprises:
the execution sub-module is used for executing the operation corresponding to the other video live broadcast keywords when the preset keywords comprise other preset video live broadcast keywords, wherein the operation corresponding to the other video live broadcast keywords comprises the following steps:
at least one of sending a bullet comment (barrage), sending a red envelope, following another anchor, and starting a video link with a viewer.
According to a third aspect of the embodiments of the present disclosure, there is provided a processing apparatus for live video, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
during a live video broadcast, acquire current voice information of the anchor;
recognizing the current voice information;
and when the preset keyword exists in the current voice information, executing operation corresponding to the preset keyword.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
When a live video broadcast is in progress, the current voice information of the anchor can be acquired automatically and recognized. When a preset keyword exists in the current voice information, the operation corresponding to the preset keyword can be executed automatically. Speech recognition thus makes live video more intelligent and convenient: the operation the anchor intends is executed automatically, the anchor does not have to perform it manually, the anchor's manual operation burden is reduced, and the troubles caused by the anchor forgetting to perform an operation manually are avoided.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flow diagram illustrating a method of processing a live video according to an example embodiment.
Fig. 2 is a flow diagram illustrating another method of processing a live video according to an example embodiment.
Fig. 3 is a flow chart illustrating yet another method of processing a live video according to an example embodiment.
Fig. 4 is a flow chart illustrating a further method of processing a live video according to an example embodiment.
Fig. 5 is a flow chart illustrating a further method of processing a live video according to an example embodiment.
Fig. 6 is a flow chart illustrating a further method of processing a live video according to an example embodiment.
Fig. 7 is a block diagram illustrating a processing device for a live video according to an example embodiment.
Fig. 8 is a block diagram illustrating another processing device for a live video according to an example embodiment.
Fig. 9 is a block diagram illustrating yet another apparatus for processing a live video according to an example embodiment.
Fig. 10 is a block diagram illustrating yet another apparatus for processing a live video according to an example embodiment.
Fig. 11 is a block diagram illustrating yet another apparatus for processing a live video according to an example embodiment.
Fig. 12 is a block diagram illustrating yet another apparatus for processing a live video according to an example embodiment.
Fig. 13 is a block diagram illustrating a processing device suitable for live video according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Currently, during a live video broadcast, the anchor often needs to perform many operations manually, such as following other anchors, sending a red envelope to a viewer, or manually closing the live video. This increases the anchor's operation burden, and forgetting to perform an operation manually may cause serious trouble. For example, if the anchor forgets to close the live video, privacy may be leaked.
In order to solve the above technical problem, an embodiment of the present disclosure provides a live video processing method, which may be used in a live video processing program, system or device, and an execution subject corresponding to the method may be a terminal such as a mobile phone, a tablet, a Personal Computer (PC) or the like for live video, as shown in fig. 1, the method includes steps S101 to S103:
in step S101, during a live video broadcast, current voice information of the anchor is acquired;
when the current voice information of the anchor is acquired, the current voice information of the anchor can be acquired in real time, or acquired in other manners (for example, according to a preset time interval, or when certain conditions are met).
In step S102, current voice information is recognized;
after the current voice information of the anchor is acquired, its semantics can be analyzed and recognized to detect whether it contains preset keywords corresponding to related operations, such as a keyword for closing the live video, a keyword for sending a red envelope to viewer XX, or a keyword for following other anchors.
In step S103, when a preset keyword exists in the current voice information, an operation corresponding to the preset keyword is performed.
When the preset keywords exist in the current voice information, the operation corresponding to the preset keywords can be automatically executed, so that the video live broadcast is more intelligent and convenient through voice recognition, the corresponding operation expected by the anchor can be automatically executed, the anchor is prevented from manually performing the corresponding operation, the manual operation burden of the anchor is reduced, and a series of troubles caused by the fact that the anchor forgets to manually perform the corresponding operation can be avoided.
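Steps S101 to S103 can be sketched roughly as follows. This is a minimal illustration, assuming that a speech recognizer has already turned the anchor's voice into a text transcript; the function `make_dispatcher`, the keyword phrases, and the operation names are hypothetical and are not taken from the disclosure.

```python
from typing import Callable, Dict, List


def make_dispatcher(actions: Dict[str, Callable[[], str]]) -> Callable[[str], List[str]]:
    """Build a handler that runs the operation for each preset keyword found.

    `actions` maps a lowercase preset keyword (hypothetical phrases) to the
    operation to execute when that keyword appears in the transcript.
    """
    def handle_transcript(transcript: str) -> List[str]:
        text = transcript.lower()
        results = []
        for keyword, action in actions.items():
            if keyword in text:          # step S102: keyword detection
                results.append(action())  # step S103: execute the operation
        return results

    return handle_transcript


# Illustrative keyword table; real keywords would be configured by the anchor.
actions = {
    "end the live broadcast": lambda: "close_live_video",
    "send a red envelope": lambda: "send_red_envelope",
}
handle = make_dispatcher(actions)
```

Simple substring matching is used here only to keep the sketch short; the disclosure describes semantic analysis, which a real system would delegate to its speech recognition component.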
As shown in fig. 2, in one embodiment, the preset keywords include: a preset video live broadcast ending word;
the step S103 shown in fig. 1 described above may include step A1 and step A2:
in step A1, when a preset live video end word exists in the current voice information, a countdown for closing the live video is displayed; the preset live video end word may be any word, phrase or sentence related to ending the live video, such as "I'm ending the live broadcast, bye-bye", "good night, everyone" or "see you tomorrow".
In step A2, the live video is closed when the countdown ends; the countdown duration may or may not be fixed.
When a preset live video end word exists in the current voice information, displaying the countdown for closing the live video reminds the anchor and the audience that the live video is about to be closed, and the live video is closed automatically when the countdown ends. In this way the live video can be closed intelligently and in time through speech recognition, without the anchor closing it manually, which avoids the privacy leaks and inconvenience caused when the anchor forgets to close the live video.
Additionally, if the anchor selects "cancel end live" before the countdown ends, the countdown stops and is no longer displayed, and the closing of the live video is cancelled.
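The countdown behavior of steps A1 and A2, including cancellation, can be sketched as a small state machine. This is an illustrative sketch only: the class `CloseCountdown`, the one-second tick, and the returned display strings are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CloseCountdown:
    """Tick-based countdown that closes the stream unless cancelled."""
    remaining: int            # seconds left; the starting value is an assumption
    cancelled: bool = False
    closed: bool = False

    def cancel(self) -> None:
        # Corresponds to the anchor selecting "cancel end live".
        if not self.closed:
            self.cancelled = True

    def tick(self) -> str:
        """Advance one second and return what the interface should show."""
        if self.cancelled:
            return "hidden"       # countdown stops and is no longer displayed
        if self.closed:
            return "closed"
        self.remaining -= 1
        if self.remaining <= 0:
            self.closed = True    # step A2: close the live video
            return "closed"
        return f"closing in {self.remaining}s"
```

A tick-driven design keeps the sketch deterministic; a real client would drive `tick` from its UI timer.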
As shown in fig. 3, in one embodiment, the preset keywords include: a preset video live broadcast ending word; before performing step a1, the method further includes step S301 and step S302:
in step S301, a time difference between a time when a preset keyword is detected in current voice information each time and a time when a video live broadcast is manually turned off by an anchor is obtained in a historical time period;
in step S302, a countdown is determined based on the time difference.
Before the countdown is displayed, the time difference between each time a preset keyword was detected in the current voice information and the time the anchor manually closed the live video can be obtained for a historical time period. A countdown that matches the anchor's habit of closing the live video is then determined from these time differences, which makes the countdown more personalized, more targeted, and closer to the anchor's actual operating habits. For example, the anchor may have manually closed the live video several times during the historical time period; once the time difference for each occasion is obtained, their average can be calculated and used as the countdown.
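The averaging example above can be written out as follows. This is a minimal sketch: the function name `countdown_from_history`, the representation of the history as pairs of timestamps in seconds, and the fallback `default` value are assumptions for illustration.

```python
from statistics import mean
from typing import List, Tuple


def countdown_from_history(
    history: List[Tuple[float, float]], default: float = 10.0
) -> float:
    """Return the countdown length in seconds.

    `history` holds (keyword_time, manual_close_time) pairs, in seconds,
    collected over a historical time period. Pairs where the close happened
    before the keyword are ignored (an assumption, not from the disclosure).
    """
    diffs = [close - heard for heard, close in history if close >= heard]
    # Fall back to a fixed countdown when no usable history exists.
    return mean(diffs) if diffs else default
```

For instance, a history of 6 s and 8 s gaps yields a 7 s countdown; an empty history falls back to the fixed default.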
As shown in FIG. 4, in one embodiment, step S103 of FIG. 1 above may include step B1 and step B2:
in step B1, when a preset live video termination word exists in the current voice information, displaying a prompt message for terminating the live video, and determining whether there is anchor feature information in a current video frame of the live video;
the prompt information may be a text prompt, a voice prompt, or preset music for ending the live video, and is used to remind the anchor and the audience that the live video is about to end. If the anchor does not respond to (i.e., ignores) the prompt information, the anchor very likely intends to close the live video, so the method can continue to determine whether the feature information of the anchor exists in the current video frame.
The feature information may be an avatar of the anchor, or a feature of a certain body part of the anchor, such as a finger of the anchor, a mark (e.g., a mole) on the face of the anchor, and the like.
In step B2, when the feature information of the anchor does not exist, the live video is closed.
When a preset live video end word exists in the current voice information, this indicates that the anchor intends to close the live video and has entered the preparation stage for ending the broadcast. To avoid inconveniencing the anchor by closing the live video by mistake, it is also necessary to continue judging whether the feature information of the anchor exists in the current video picture of the live video. When the feature information is absent, the live video is closed, which avoids the inconvenience that would be brought to the anchor if the live video were left running.
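The check in steps B1 and B2 can be sketched as a scan over video frames after the end word has been heard. The detector `has_anchor` is a hypothetical callable standing in for whatever face or feature recognition the terminal provides; the disclosure does not specify one.

```python
from typing import Callable, Iterable, Optional, TypeVar

Frame = TypeVar("Frame")


def first_frame_without_anchor(
    frames: Iterable[Frame], has_anchor: Callable[[Frame], bool]
) -> Optional[int]:
    """Return the index of the first frame lacking the anchor's feature.

    Returns None if the anchor's feature information is present in every
    frame, in which case the live video should stay open.
    """
    for index, frame in enumerate(frames):
        if not has_anchor(frame):
            return index   # step B2: the live video can be closed here
    return None
```

A production system would probably require the feature to be absent for several consecutive frames before closing, to tolerate brief detection failures; that refinement is omitted here.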
In addition, the two embodiments of automatically closing the live video can be combined.
As shown in FIG. 5, in one embodiment, step S102 shown in FIG. 1 above may include step C1 and step C2:
in step C1, when the preset keyword includes a preset live video end word, determining whether the current live broadcast information of the live video matches the corresponding preset live broadcast information, wherein the corresponding preset live broadcast information includes at least one of a preset live broadcast time, a preset live broadcast duration, and a preset live broadcast video picture, and the current live broadcast information correspondingly includes at least one of the current live broadcast time, the current live broadcast duration, and the current live broadcast video picture; the preset live broadcast video picture may be a picture showing the ending gesture or posture that the anchor usually uses when preparing to end the live video, or the desktop that is usually displayed when the anchor prepares to end the live video;
in step C2, when the current live broadcast information matches with the corresponding preset live broadcast information, the method proceeds to the step of recognizing the current voice information.
Because the start time and duration of each live video broadcast are basically fixed, and the live pictures shown when the anchor prepares to end the broadcast are often similar or even identical, it is possible to judge, before recognizing the current voice information, whether the current live broadcast information matches the corresponding preset live broadcast information. A match indicates that the anchor is probably about to close the live video, so recognition of the current voice information can begin.
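The gating in steps C1 and C2 can be sketched as a simple predicate. The names `PresetLiveInfo` and `should_recognize` are hypothetical, picture matching is left out, and treating a match on either the time or the duration as sufficient is an assumption; the disclosure only states that the preset information includes at least one of these items.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class PresetLiveInfo:
    earliest_end: time      # preset live broadcast time (assumed: time of day)
    min_duration_s: float   # preset live broadcast duration, in seconds


def should_recognize(now: time, elapsed_s: float, preset: PresetLiveInfo) -> bool:
    """Start end-word recognition only when the broadcast looks close to ending.

    Broadcasts that run past midnight are not handled in this sketch.
    """
    return now >= preset.earliest_end or elapsed_s >= preset.min_duration_s
```

Gating recognition this way means end words spoken early in a broadcast are not acted on, which reduces the chance of closing the live video by mistake.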
As shown in fig. 6, in an embodiment, the step S103 in fig. 1 may further include a step D1:
in step D1, when the preset keyword includes a preset other video live keyword, performing an operation corresponding to the other video live keyword, where the operation corresponding to the other video live keyword includes:
at least one of sending a bullet comment (barrage), sending a red envelope, following another anchor, and starting a video link with a viewer.
When operations are executed automatically through speech recognition, if the preset keywords are judged to include other preset live video keywords, the operations corresponding to those keywords are executed automatically, such as sending a bullet comment, sending a red envelope, following another anchor, or starting a linked-microphone video with a viewer (that is, the anchor and a viewer establish a two-person live video). Speech recognition thus makes live video more intelligent and convenient: the operation the anchor intends is executed automatically, the anchor does not have to perform it manually, the anchor's manual operation burden is reduced, and the troubles caused by the anchor forgetting to perform an operation manually are avoided.
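Operations such as sending a red envelope or following another anchor usually need a target, so step D1 can be sketched as phrase patterns that capture it. The patterns, operation names, and `parse_command` function below are hypothetical and assume an English text transcript from the speech recognizer.

```python
import re
from typing import Optional, Tuple

# Hypothetical phrase patterns; the named group captures the operation's target.
PATTERNS = {
    "send_red_envelope": re.compile(r"send a red envelope to (?P<target>\w+)"),
    "follow_anchor": re.compile(r"follow (?P<target>\w+)"),
    "video_link": re.compile(r"video chat with (?P<target>\w+)"),
}


def parse_command(transcript: str) -> Optional[Tuple[str, str]]:
    """Return (operation, target) for the first matching pattern, else None."""
    text = transcript.lower()
    for operation, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            return operation, match.group("target")
    return None
```

The returned pair would then be handed to whatever client code actually performs the operation.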
Corresponding to the processing method for live video provided by the embodiment of the present disclosure, an embodiment of the present disclosure further provides a processing apparatus for live video, where as shown in fig. 7, the apparatus includes:
a voice obtaining module 701 configured to obtain current voice information of the anchor during a live video broadcast;
when the current voice information of the anchor is acquired, the current voice information of the anchor can be acquired in real time, or acquired in other manners (for example, according to a preset time interval, or when certain conditions are met).
A processing module 702 configured to recognize current speech information;
after the current voice information of the anchor is acquired, its semantics can be analyzed and recognized to detect whether it contains preset keywords corresponding to related operations, such as a keyword for closing the live video, a keyword for sending a red envelope to viewer XX, or a keyword for following other anchors.
The executing module 703 is configured to, when a preset keyword exists in the current voice message, execute an operation corresponding to the preset keyword.
When the preset keywords exist in the current voice information, the operation corresponding to the preset keywords can be automatically executed, so that the video live broadcast is more intelligent and convenient through voice recognition, the corresponding operation expected by the anchor can be automatically executed, the anchor is prevented from manually performing the corresponding operation, the manual operation burden of the anchor is reduced, and a series of troubles caused by the fact that the anchor forgets to manually perform the corresponding operation can be avoided.
As shown in fig. 8, in one embodiment, the preset keywords include: a preset video live broadcast ending word;
the execution module 703 in fig. 7 may include:
a display sub-module 7031 configured to display a countdown for closing the live video when a preset live video end word exists in the current voice information;
and a live broadcast processing sub-module 7032 configured to close the live video when the countdown is over.
When a preset live video end word exists in the current voice information, displaying the countdown for closing the live video reminds the anchor and the audience that the live video is about to be closed, and the live video is closed automatically when the countdown ends. In this way the live video can be closed intelligently and in time through speech recognition, without the anchor closing it manually, which avoids the privacy leaks and inconvenience caused when the anchor forgets to close the live video.
Additionally, if the anchor selects "cancel end live" before the countdown ends, the countdown stops and is no longer displayed, and the closing of the live video is cancelled.
As shown in fig. 9, in one embodiment, the apparatus may further include:
a time obtaining module 901 configured to obtain, before displaying a countdown for closing the live video, a time difference between a time when a preset keyword is detected to exist in current voice information each time and a time when the live video is manually closed by the anchor in a historical time period;
a determining module 902 configured to determine a countdown according to the time difference.
Before the countdown is displayed, the time difference between each time a preset keyword was detected in the current voice information and the time the anchor manually closed the live video can be obtained for a historical time period. A countdown that matches the anchor's habit of closing the live video is then determined from these time differences, which makes the countdown more personalized, more targeted, and closer to the anchor's actual operating habits. For example, the anchor may have manually closed the live video several times during the historical time period; once the time difference for each occasion is obtained, their average can be calculated and used as the countdown.
As shown in fig. 10, in one embodiment, the preset keywords include: a preset video live broadcast ending word;
the execution module 703 shown in fig. 7 may include:
the feature processing submodule 7033 is configured to, when a preset live video end word exists in the current voice information, display prompt information for ending the live video, and determine whether feature information of the anchor exists in a current video picture of the live video;
the prompt information can be a text prompt, a voice prompt, or preset music for ending the video live broadcast, and is configured to prompt the main broadcast and the audience to end the video live broadcast soon.
The feature information may be an avatar of the anchor, or a feature of a certain body part of the anchor, such as a finger of the anchor, a mark (e.g., a mole) on the face of the anchor, and the like.
And a feature response sub-module 7034 configured to close the live video when there is no feature information of the anchor.
When a preset live video end word exists in the current voice information, this indicates that the anchor intends to close the live video and has entered the preparation stage for ending the broadcast. To avoid inconveniencing the anchor by closing the live video by mistake, it is further determined whether feature information of the anchor exists in the current video picture of the live video, and the live video is closed when no such feature information exists.
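The two-stage check (end word first, then absence of the anchor's feature information) might look like the sketch below. The function `feature_gated_close`, the event names, and the `has_anchor_feature` callback are hypothetical; the disclosure does not specify how feature information is detected, so the detector is passed in as a parameter.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")


def feature_gated_close(
    end_word_detected: bool,
    frames: Iterable[Frame],
    has_anchor_feature: Callable[[Frame], bool],
) -> Tuple[List[str], Optional[int]]:
    """Return (events, index of the frame at which the live video is closed).

    The index is None when the live video is left open.
    """
    events: List[str] = []
    if not end_word_detected:
        # No end word: nothing is prompted and nothing is closed.
        return events, None

    # The end word was heard: prompt the anchor and the audience first.
    events.append("show_end_prompt")

    # Keep checking frames; close at the first one without the anchor.
    for index, frame in enumerate(frames):
        if not has_anchor_feature(frame):
            events.append("close_live_video")
            return events, index
    return events, None
```

In a quick trial, frames can be modelled as sets of detected labels, with `lambda frame: "face" in frame` standing in for a real detector.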
As shown in fig. 11, in one embodiment, the processing module 702 shown in fig. 7 may include:
the determining sub-module 7021 is configured to, when the preset keyword includes a preset live video end word, determine whether the current live broadcast information of the live video matches with corresponding preset live broadcast information, where the corresponding preset live broadcast information includes: at least one of preset live broadcast time, preset live broadcast duration and preset live broadcast video pictures;
and the identifying sub-module 7022 is configured to start identifying the current voice information when the current live information matches with the corresponding preset live information.
Because the start time and duration of each live video broadcast are largely fixed, and the live pictures shown when a broadcast is about to end are often similar or even identical, it can first be determined whether the current live broadcast information matches the corresponding preset live broadcast information before the current voice information is recognized. If it matches, the broadcast is probably at the stage where the live video is about to be closed, and recognition of the current voice information can therefore begin.
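A rough sketch of this gating step is given below. It covers only the time and duration criteria and omits the comparison of video pictures. The class `PresetLiveInfo`, its field names, the tolerance window, and the rule that any one configured criterion is enough for a match are all assumptions; the disclosure does not define how a match is computed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PresetLiveInfo:
    """Hypothetical preset live broadcast information."""
    expected_end_clock_s: Optional[float] = None  # assumed: seconds since midnight
    expected_duration_s: Optional[float] = None   # assumed: seconds on air
    tolerance_s: float = 300.0                    # hypothetical matching window


def matches_preset(clock_s: float, elapsed_s: float, preset: PresetLiveInfo) -> bool:
    """True if any configured criterion is within the tolerance window."""
    checks = []
    if preset.expected_end_clock_s is not None:
        checks.append(abs(clock_s - preset.expected_end_clock_s) <= preset.tolerance_s)
    if preset.expected_duration_s is not None:
        checks.append(abs(elapsed_s - preset.expected_duration_s) <= preset.tolerance_s)
    return any(checks)


def recognize_if_matched(
    clock_s: float,
    elapsed_s: float,
    preset: PresetLiveInfo,
    recognize: Callable[[], str],
) -> Optional[str]:
    """Run the (caller-supplied) recognizer only when the preset matches."""
    if matches_preset(clock_s, elapsed_s, preset):
        return recognize()
    return None
```

Gating in this way means the recognizer for the end word runs only near the expected end of a broadcast, which is the motivation the description gives for the check.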
As shown in fig. 12, in an embodiment, the execution module 703 shown in fig. 7 may further include an execution sub-module 7035:
an execution sub-module 7035, configured to, when the preset keyword includes a preset other video live keyword, execute an operation corresponding to the other video live keyword, where the operation corresponding to the other video live keyword includes:
at least one of barrage, red envelope, attention to other anchor, and video with the audience.
When operations are executed automatically through voice recognition, if the preset keyword is determined to include other preset live video keywords, the operations corresponding to those keywords are executed automatically, for example posting a barrage (bullet-screen comment), giving a red envelope, following another anchor, or starting a mic-linked video session with the audience. Voice recognition thus makes the live video more intelligent and convenient: the operations the anchor wishes to perform are executed automatically, which spares the anchor from performing them manually, reduces the anchor's manual workload, and avoids the problems that arise when the anchor forgets to perform them manually.
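One way to picture the mapping from keywords to operations is a simple dispatch table. In this sketch the function `build_dispatcher`, the example keyword phrases, and the use of substring matching on a text transcript are assumptions; the disclosure does not specify the keywords or the recognition method.

```python
from typing import Callable, Dict, List


def build_dispatcher(
    handlers: Dict[str, Callable[[], str]],
) -> Callable[[str], List[str]]:
    """Create a dispatcher that runs the handler of every keyword it hears.

    `handlers` maps a preset keyword to a zero-argument operation that
    returns a short description of what it did.
    """
    def dispatch(transcript: str) -> List[str]:
        executed = []
        for keyword, handler in handlers.items():
            # Assumed matching rule: the keyword appears in the transcript.
            if keyword in transcript:
                executed.append(handler())
        return executed

    return dispatch


# Hypothetical keyword table covering two of the operations named above.
dispatch = build_dispatcher({
    "red envelope": lambda: "red_envelope_sent",
    "follow": lambda: "anchor_followed",
})
```

With this table, `dispatch("let me send a red envelope to everyone")` runs only the red envelope handler, and a transcript containing neither keyword runs nothing.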
According to a third aspect of the embodiments of the present disclosure, there is provided a processing apparatus for live video, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
during a live video broadcast, obtaining current voice information of the anchor;
recognizing the current voice information;
and when the preset keyword exists in the current voice information, executing operation corresponding to the preset keyword.
The processor may be further configured to:
the preset keywords include: a preset video live broadcast ending word;
when the preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset live video end word exists in the current voice information, displaying a countdown for closing the live video;
and when the countdown is finished, closing the live video.
The processor may be further configured to:
before displaying the countdown to close the live video, the method further comprises:
acquiring a time difference between the time when the preset keyword exists in the current voice information every time and the time when the anchor manually closes the video live broadcast in a historical time period;
and determining the countdown according to the time difference.
The processor may be further configured to:
the preset keywords include: a preset video live broadcast ending word;
when the preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset live video broadcast end word exists in the current voice information, displaying prompt information for ending the live video broadcast, and judging whether feature information of the anchor broadcast exists in a current video picture of the live video broadcast;
and when the characteristic information of the anchor does not exist, closing the live video.
The processor may be further configured to:
the recognizing the current voice information comprises:
when the preset keyword comprises a preset live video end word, judging whether the current live broadcast information of the live video is matched with corresponding preset live broadcast information, wherein the corresponding preset live broadcast information comprises: at least one of preset live broadcast time, preset live broadcast duration and preset live broadcast video pictures;
and when the current live broadcast information is matched with the corresponding preset live broadcast information, entering a step of identifying the current voice information so as to judge whether preset keywords exist in the current voice information.
The processor may be further configured to:
when the preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset keywords comprise other preset video live broadcast keywords, executing operations corresponding to the other video live broadcast keywords, wherein the operations corresponding to the other video live broadcast keywords comprise:
at least one of barrage, red envelope, attention to other anchor, and video with the audience.
Fig. 13 is a block diagram illustrating a processing apparatus 1300 for live video according to an exemplary embodiment, which is suitable for a terminal device. For example, apparatus 1300 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 13, the apparatus 1300 may include one or at least two of the following components: a processing component 1302, a memory 1304, a power component 1306, a multimedia component 1308, an audio component 1310, an input/output (I/O) interface 1312, a sensor component 1314, and a communications component 1316.
The processing component 1302 generally controls overall operation of the device 1300, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1302 may include one or at least two processors 1320 to execute instructions to perform all or part of the steps of the method described above. Further, the processing component 1302 can include one or at least two modules that facilitate interaction between the processing component 1302 and other components. For example, the processing component 1302 may include a multimedia module to facilitate interaction between the multimedia component 1308 and the processing component 1302.
The memory 1304 is configured to store various types of data to support operations at the apparatus 1300. Examples of such data include instructions for any stored object or method operating on the device 1300, contact user data, phonebook data, messages, pictures, videos, and so forth. The memory 1304 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power supply component 1306 provides power to the various components of device 1300. The power components 1306 may include a power management system, one or at least two power supplies, and other components associated with generating, managing, and distributing power supplies for the apparatus 1300.
The multimedia component 1308 includes a screen that provides an output interface between the device 1300 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or at least two touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1308 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the apparatus 1300 is in an operation mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 1310 is configured to output and/or input audio signals. For example, the audio component 1310 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 1300 is in an operating mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 1304 or transmitted via the communication component 1316. In some embodiments, the audio component 1310 also includes a speaker for outputting audio signals.
The I/O interface 1312 provides an interface between the processing component 1302 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 1314 includes one or at least two sensors for providing various aspects of state assessment for the device 1300. For example, the sensor assembly 1314 may detect the open/closed state of the device 1300 and the relative positioning of components, such as a display and keypad of the device 1300. The sensor assembly 1314 may also detect a change in the position of the device 1300 or a component of the device 1300, the presence or absence of user contact with the device 1300, orientation or acceleration/deceleration of the device 1300, and a change in the temperature of the device 1300. The sensor assembly 1314 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 1314 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1316 is configured to facilitate communications between the apparatus 1300 and other devices in a wired or wireless manner. The apparatus 1300 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1316 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 1316 also includes a Near Field Communications (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 1300 may be implemented by one or at least two Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 1304 comprising instructions, executable by the processor 1320 of the apparatus 1300 to perform the method described above is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium, wherein instructions of the storage medium, when executed by a processor of the apparatus 1300, enable the apparatus 1300 to perform a video live broadcast processing method, comprising:
during a live video broadcast, obtaining current voice information of the anchor;
recognizing the current voice information;
and when the preset keyword exists in the current voice information, executing operation corresponding to the preset keyword.
In one embodiment, the preset keywords include: a preset video live broadcast ending word;
when the preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset live video end word exists in the current voice information, displaying a countdown for closing the live video;
and when the countdown is finished, closing the live video.
In one embodiment, prior to displaying the countdown to closing the live video, the method further comprises:
acquiring a time difference between the time when the preset keyword exists in the current voice information every time and the time when the anchor manually closes the video live broadcast in a historical time period;
and determining the countdown according to the time difference.
In one embodiment, when the preset keyword exists in the current voice message, the executing an operation corresponding to the preset keyword includes:
when the preset live video broadcast end word exists in the current voice information, displaying prompt information for ending the live video broadcast, and judging whether feature information of the anchor broadcast exists in a current video picture of the live video broadcast;
and when the characteristic information of the anchor does not exist, closing the live video.
In one embodiment, the identifying the current speech information includes:
when the preset keyword comprises a preset live video end word, judging whether the current live broadcast information of the live video is matched with corresponding preset live broadcast information, wherein the corresponding preset live broadcast information comprises: at least one of preset live broadcast time, preset live broadcast duration and preset live broadcast video pictures;
and when the current live broadcast information is matched with the corresponding preset live broadcast information, entering a step of identifying the current voice information so as to judge whether preset keywords exist in the current voice information.
In one embodiment, when the preset keyword exists in the current voice message, the executing an operation corresponding to the preset keyword includes:
when the preset keywords comprise other preset video live broadcast keywords, executing operations corresponding to the other video live broadcast keywords, wherein the operations corresponding to the other video live broadcast keywords comprise:
at least one of barrage, red envelope, attention to other anchor, and video with the audience.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A processing method of live video is characterized by comprising the following steps:
during a live video broadcast, obtaining current voice information of the anchor;
recognizing the current voice information;
when a preset keyword exists in the current voice information, executing operation corresponding to the preset keyword;
the preset keywords include: a preset video live broadcast ending word;
when a preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset live video broadcast end word exists in the current voice information, displaying prompt information for ending the live video broadcast, and judging whether feature information of the anchor broadcast exists in a current video picture of the live video broadcast;
when the characteristic information of the anchor does not exist, closing the live video broadcast;
the recognizing the current voice information comprises:
when the preset keyword comprises a preset live video end word, judging whether the current live broadcast information of the live video is matched with corresponding preset live broadcast information, wherein the corresponding preset live broadcast information comprises: at least one of preset live broadcast time, preset live broadcast duration and preset live broadcast video pictures;
and when the current live broadcast information is matched with the corresponding preset live broadcast information, entering a step of identifying the current voice information.
2. The method of claim 1,
the preset keywords include: a preset video live broadcast ending word;
when a preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset live video end word exists in the current voice information, displaying a countdown for closing the live video;
and when the countdown is finished, closing the live video.
3. The method of claim 2,
before displaying the countdown to close the live video, the method further comprises:
acquiring a time difference between the time when the preset keyword exists in the current voice information every time and the time when the anchor manually closes the video live broadcast in a historical time period;
and determining the countdown according to the time difference.
4. The method according to any one of claims 2 to 3,
when a preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset keywords comprise other preset video live broadcast keywords, executing operations corresponding to the other video live broadcast keywords, wherein the operations corresponding to the other video live broadcast keywords comprise:
at least one of barrage, red envelope, attention to other anchor, and video with the audience.
5. A processing apparatus for live video, comprising:
the voice acquisition module is used for acquiring the current voice information of the anchor broadcast during video live broadcast;
the processing module is used for identifying the current voice information;
the execution module is used for executing the operation corresponding to the preset keyword when the preset keyword exists in the current voice message;
the preset keywords include: a preset video live broadcast ending word;
the execution module comprises:
the characteristic processing submodule is used for displaying prompt information for finishing the live video broadcast when the preset live video broadcast finishing word exists in the current voice information, and judging whether characteristic information of the anchor exists in a current video picture of the live video broadcast or not;
the characteristic response submodule is used for closing the live video broadcast when the characteristic information of the anchor does not exist;
the processing module comprises:
the judgment submodule is used for judging whether the current live broadcast information of the live video is matched with corresponding preset live broadcast information when the preset keywords comprise preset live broadcast end words, wherein the corresponding preset live broadcast information comprises: at least one of preset live broadcast time, preset live broadcast duration and preset live broadcast video pictures;
and the recognition submodule is used for starting to recognize the current voice information when the current live broadcast information is matched with the corresponding preset live broadcast information.
6. The apparatus of claim 5,
the preset keywords include: a preset video live broadcast ending word;
the execution module comprises:
the display sub-module is used for displaying countdown for closing the live video when the preset live video end word exists in the current voice information;
and the live broadcast processing submodule is used for closing the live broadcast of the video when the countdown is finished.
7. The apparatus of claim 6, further comprising:
the time acquisition module is used for acquiring a time difference between the time when the preset keyword exists in the current voice information every time and the time when the main broadcast manually closes the live video in a historical time period before displaying the countdown for closing the live video;
and the determining module is used for determining the countdown according to the time difference.
8. The apparatus according to any one of claims 6 to 7,
the execution module comprises:
the execution sub-module is used for executing the operation corresponding to the other video live broadcast keywords when the preset keywords comprise other preset video live broadcast keywords, wherein the operation corresponding to the other video live broadcast keywords comprises the following steps:
at least one of barrage, red envelope, attention to other anchor, and video with the audience.
9. A processing apparatus for live video, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
during a live video broadcast, obtaining current voice information of the anchor;
recognizing the current voice information;
when a preset keyword exists in the current voice information, executing operation corresponding to the preset keyword;
the preset keywords include: a preset video live broadcast ending word;
when a preset keyword exists in the current voice message, executing an operation corresponding to the preset keyword, wherein the operation comprises the following steps:
when the preset live video broadcast end word exists in the current voice information, displaying prompt information for ending the live video broadcast, and judging whether feature information of the anchor broadcast exists in a current video picture of the live video broadcast;
when the characteristic information of the anchor does not exist, closing the live video broadcast;
the recognizing the current voice information comprises:
when the preset keyword comprises a preset live video end word, judging whether the current live broadcast information of the live video is matched with corresponding preset live broadcast information, wherein the corresponding preset live broadcast information comprises: at least one of preset live broadcast time, preset live broadcast duration and preset live broadcast video pictures;
and when the current live broadcast information is matched with the corresponding preset live broadcast information, entering a step of identifying the current voice information.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.
CN201611132307.2A 2016-12-09 2016-12-09 Processing method and device for live video and storage medium Active CN106791921B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611132307.2A CN106791921B (en) 2016-12-09 2016-12-09 Processing method and device for live video and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611132307.2A CN106791921B (en) 2016-12-09 2016-12-09 Processing method and device for live video and storage medium

Publications (2)

Publication Number Publication Date
CN106791921A CN106791921A (en) 2017-05-31
CN106791921B true CN106791921B (en) 2020-03-03

Family

ID=58875811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611132307.2A Active CN106791921B (en) 2016-12-09 2016-12-09 Processing method and device for live video and storage medium

Country Status (1)

Country Link
CN (1) CN106791921B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108259983A (en) * 2017-12-29 2018-07-06 广州市百果园信息技术有限公司 A kind of method of video image processing, computer readable storage medium and terminal
CN108924662B (en) * 2018-07-25 2022-02-08 武汉斗鱼网络科技有限公司 Continuous microphone interaction method, device, equipment and storage medium
CN109195016B (en) * 2018-09-13 2020-12-15 苏州思必驰信息科技有限公司 Intelligent terminal equipment-oriented voice interaction method and terminal system for video barrage and intelligent terminal equipment
CN109168091B (en) * 2018-09-29 2021-07-30 武汉斗鱼网络科技有限公司 Method, device, equipment and storage medium for connecting wheat in live broadcast room
CN111385595B (en) * 2018-12-29 2022-05-31 阿里巴巴集团控股有限公司 Network live broadcast method, live broadcast replenishment processing method and device, live broadcast server and terminal equipment
CN110324648B (en) * 2019-07-17 2021-08-06 咪咕文化科技有限公司 Live broadcast display method and system
CN110446115B (en) * 2019-07-22 2021-10-15 腾讯科技(深圳)有限公司 Live broadcast interaction method and device, electronic equipment and storage medium
CN110784751B (en) * 2019-08-21 2024-03-15 腾讯科技(深圳)有限公司 Information display method and device
CN113038174B (en) * 2019-12-09 2021-12-21 上海幻电信息科技有限公司 Live video interaction method and device and computer equipment
CN111986700A (en) * 2020-08-28 2020-11-24 广州繁星互娱信息科技有限公司 Method, device, equipment and storage medium for triggering non-contact operation
CN112911324B (en) * 2021-01-29 2022-10-28 北京达佳互联信息技术有限公司 Content display method and device for live broadcast room, server and storage medium
CN115002500B (en) * 2022-06-15 2024-02-13 北京搜房科技发展有限公司 Live broadcast management method and device, storage medium and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004228707A (en) * 2003-01-20 2004-08-12 Ntt Data Corp Contents providing system
CN105141817A (en) * 2015-09-20 2015-12-09 成都宇珩智能家居科技有限公司 Computer camera with absence prompting effect for network anchor

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104010154B (en) * 2013-02-27 2019-03-08 联想(北京)有限公司 Information processing method and electronic equipment
CN104363519B (en) * 2014-11-21 2017-12-15 广州华多网络科技有限公司 It is a kind of based on online live method for information display, relevant apparatus and system
CN105100455A (en) * 2015-07-06 2015-11-25 珠海格力电器股份有限公司 Method and device for answering incoming phone call via voice control
CN105141818A (en) * 2015-09-20 2015-12-09 成都宇珩智能家居科技有限公司 Computer camera with sound control for network anchor
CN105204743A (en) * 2015-09-28 2015-12-30 百度在线网络技术(北京)有限公司 Interaction control method and device for speech and video communication
CN105573620A (en) * 2015-12-10 2016-05-11 广东欧珀移动通信有限公司 User terminal control method and user terminal

Also Published As

Publication number Publication date
CN106791921A (en) 2017-05-31

Similar Documents

Publication Publication Date Title
CN106791921B (en) Processing method and device for live video and storage medium
US10452890B2 (en) Fingerprint template input method, device and medium
CN105159640B (en) Display interface rotating method and device and mobile terminal
US9800666B2 (en) Method and client terminal for remote assistance
EP3136793A1 (en) Method and apparatus for awakening electronic device
CN106941624B (en) Processing method and device for network video trial viewing
US10230891B2 (en) Method, device and medium of photography prompts
CN105898032B (en) method and device for adjusting prompt tone
EP3731088B1 (en) Method and device for displaying information and storage medium
EP3024211B1 (en) Method and device for announcing voice call
CN109324846B (en) Application display method and device and storage medium
KR101735755B1 (en) Method and apparatus for prompting device connection
WO2017008400A1 (en) Method and device for controlling intelligent device
WO2016155304A1 (en) Wireless access point control method and device
CN108806714B (en) Method and device for adjusting volume
US20180035154A1 (en) Method, Apparatus, and Storage Medium for Sharing Video
CN105898573B (en) Multimedia file playing method and device
CN111063354B (en) Man-machine interaction method and device
JP2017530657A (en) Call interface display method and apparatus
EP3147802A1 (en) Method and apparatus for processing information
EP3048508A1 (en) Methods, apparatuses and devices for transmitting data
US20170155604A1 (en) Method and device for processing information
WO2017197777A1 (en) Detection method and apparatus
CN106603381B (en) Method and device for processing chat information
CN112291631A (en) Information acquisition method, device, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant