KR20180020859A

KR20180020859A - Interaction method and apparatus applicable for the video broadcast

Info

Publication number: KR20180020859A
Application number: KR1020170018356A
Authority: KR
Inventors: 윈펑 하오
Original assignee: 바이두 온라인 네트웍 테크놀러지 (베이징) 캄파니 리미티드
Priority date: 2016-08-19
Filing date: 2017-02-09
Publication date: 2018-02-28
Also published as: JP2018029325A; KR101945920B1; CN106303658B; JP6629774B2; CN106303658A

Abstract

The present invention provides an interaction method and apparatus applicable to video broadcasting. In a specific embodiment, the method comprises: receiving a broadcast video transmitted from a broadcast host client, wherein the broadcast video is recorded and generated in real time by the broadcast host client and includes a video stream and an audio stream; performing speech recognition on the audio stream to obtain a keyword; determining an interaction command corresponding to the keyword; and transmitting the broadcast video and the interaction command to a user client so that the broadcast video and an interaction target corresponding to the interaction command are displayed on a broadcast interface of the user client. On the one hand, this simplifies the broadcast host's operations when interacting with users; on the other hand, the current broadcast content need not be interrupted, so the video broadcast remains smooth.

Description

Interaction method and apparatus applicable to video broadcasting

The present application relates to the field of computers, specifically to network technology, and in particular to an interaction method and apparatus applied to video broadcasting.

In a video broadcast, the broadcast host (broadcasting jockey) needs to interact with users. At present, this interaction must be carried out manually by the broadcast host. For example, to thank a user for a virtual gift, the broadcast host must interrupt the content currently being broadcast and enter text and/or images to interact with the user. On the one hand, the interaction between the broadcast host and the user is relatively cumbersome; on the other hand, whenever the broadcast host needs to interact with a user, the current broadcast content must be interrupted, which impairs the smoothness of the broadcast.

The present application aims to provide an interaction method and apparatus applied to video broadcasting that solve the technical problems described in the background section above.

In a first aspect, the present application provides an interaction method applied to video broadcasting, the method comprising: receiving a broadcast video sent from a broadcast host client, wherein the broadcast video is recorded and generated in real time by the broadcast host client and includes a video stream and an audio stream; performing speech recognition on the audio stream to obtain a keyword; determining an interaction command corresponding to the keyword; and sending the broadcast video and the interaction command to a user client so that the broadcast video and an interaction target corresponding to the interaction command are displayed on a broadcast interface of the user client.

In a second aspect, the present application provides an interaction method applied to video broadcasting, the method comprising: receiving a broadcast video and an interaction command sent from a server, wherein the broadcast video is recorded and generated in real time by a broadcast host client and includes a video stream and an audio stream, and the interaction command is determined from a keyword that the server obtains by performing speech recognition on the audio stream; determining an interaction target corresponding to the interaction command; and displaying the interaction target on the broadcast video.

In a third aspect, the present application provides an interaction apparatus applied to video broadcasting, the apparatus comprising: a broadcast video receiving unit configured to receive a broadcast video sent from a broadcast host client, wherein the broadcast video is recorded and generated in real time by the broadcast host client and includes a video stream and an audio stream; a recognition unit configured to perform speech recognition on the audio stream to obtain a keyword; a determination unit configured to determine an interaction command corresponding to the keyword; and a sending unit configured to send the broadcast video and the interaction command to a user client so that the broadcast video and an interaction target corresponding to the interaction command are displayed on a broadcast interface of the user client.

In a fourth aspect, the present application provides an interaction apparatus applied to video broadcasting, the apparatus comprising: a receiving unit configured to receive a broadcast video and an interaction command sent from a server, wherein the broadcast video is recorded and generated in real time by a broadcast host client and includes a video stream and an audio stream, and the interaction command is determined from a keyword that the server obtains by performing speech recognition on the audio stream; an interaction target determination unit configured to determine an interaction target corresponding to the interaction command; and a display unit configured to display the broadcast video and the interaction target on a broadcast interface.

The interaction method and apparatus applied to video broadcasting provided by the present application receive a broadcast video sent from a broadcast host client, the broadcast video being recorded and generated in real time by the broadcast host client and including a video stream and an audio stream; perform speech recognition on the audio stream to obtain a keyword; determine an interaction command corresponding to the keyword; and send the broadcast video and the interaction command to a user client, so that the broadcast video and an interaction target corresponding to the interaction command are displayed on the broadcast interface of the user client. On the one hand, this simplifies the broadcast host's operations when interacting with users; on the other hand, the current broadcast content need not be interrupted, so the video broadcast remains smooth.

Other features, objects and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, given with reference to the drawings below.
FIG. 1 shows an exemplary system architecture to which embodiments of the interaction method or apparatus applied to video broadcasting of the present application may be applied.
FIG. 2 shows a flowchart of one embodiment of the interaction method applied to video broadcasting according to the present application.
FIG. 3 shows a flowchart of another embodiment of the interaction method applied to video broadcasting according to the present application.
FIG. 4 shows a schematic diagram of one interaction among the broadcast host client, the server and the user client of the present application.
FIG. 5 shows an exemplary architecture diagram suitable for the interaction method applied to video broadcasting of the present application.
FIG. 6 shows a structural schematic diagram of one embodiment of the interaction apparatus applied to video broadcasting of the present application.
FIG. 7 shows a structural schematic diagram of another embodiment of the interaction apparatus applied to video broadcasting of the present application.
FIG. 8 shows a structural schematic diagram of a computer system suitable for implementing the interaction apparatus applied to video broadcasting of an embodiment of the present application.

The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.

The embodiments of the present application and the features of the embodiments may be combined with one another provided they do not conflict. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

FIG. 1 shows an exemplary system architecture 100 to which embodiments of the interaction method or apparatus applied to video broadcasting of the present application may be applied.

As shown in FIG. 1, the system architecture 100 may include a broadcast host client 101, a server 102, and a user client 103.

The network 104 serves as the medium providing a transmission link between the broadcast host client 101 and the server 102. The network 104 may include various wired and wireless transmission links. The network 105 serves as the medium providing a transmission link between the server 102 and the user client 103. The network 105 may likewise include various wired and wireless transmission links.

The user of the broadcast host client 101 (who may also be called an Internet broadcast host) can use devices such as the camera and microphone of the terminal on which the broadcast host client 101 runs to capture, in real time, the images and sound corresponding to the broadcast content and to record the broadcast video in real time. The broadcast host client 101 can send the broadcast video recorded in real time to the server 102. The server 102 can receive the broadcast video sent from the broadcast host client 101 and send it to the user client 103. After receiving the broadcast video, the user client 103 can play it.

Referring to FIG. 2, FIG. 2 shows a flow 200 of one embodiment of the interaction method applied to video broadcasting according to the present application. The interaction method provided in this embodiment may be performed by the server 102 of FIG. 1; accordingly, the interaction apparatus applied to video broadcasting may be installed in the server 102. The method includes the following steps.

In step 201, the broadcast video sent from the broadcast host client is received.

In this embodiment, when the user of the broadcast host client (who may also be called an Internet broadcast host) records a broadcast video, the camera of the terminal on which the broadcast host client runs can be used to capture the images corresponding to the broadcast content, and the microphone of that terminal can be used to capture sound (for example, the host's voice). After capturing the images and sound, the broadcast host client can encode them to obtain a broadcast video that includes a video stream and an audio stream.

In step 202, speech recognition is performed on the audio stream to obtain a keyword.

In this embodiment, after the broadcast video sent from the broadcast host client has been received in step 201, the broadcast video can be decoded according to the encoding scheme of its video stream and audio stream in order to extract the audio stream from the broadcast video.

In this embodiment, after the audio stream has been extracted, speech recognition can be performed on it to obtain a keyword. A keyword may be a word associated with interaction with the user of the user client. For example, a keyword may be a word expressing thanks for a virtual gift given by the user of the user client. Take the case where the audio stream contains the voice of the user of the broadcast host client, who is thanking the user of the user client for a virtual gift: the audio stream then contains a speech signal corresponding to a keyword expressing thanks, for example “Thank you”, and that keyword can be obtained by performing speech recognition on the audio stream.

In some optional implementations of this embodiment, performing speech recognition on the audio stream to obtain a keyword comprises: performing speech recognition on the audio stream to obtain a sentence corresponding to the audio stream; segmenting the sentence into words to obtain a word set; and looking up, in the word set, the keywords that match preset keywords.

In this embodiment, words that the user of the broadcast host client and the user of the user client frequently use when interacting in a video broadcast, for example “Thank you”, “I love you” and “flower”, can be set in advance as preset keywords. Speech recognition can be performed on the audio stream of the received broadcast video to obtain the sentence corresponding to the audio stream. The sentence can then be segmented into words to obtain a word set, and the keywords matching the preset keywords can be looked up in that word set.
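
The three sub-steps described here (recognize a sentence, segment it into words, look up preset keywords) can be sketched as follows. This is only an illustrative sketch, not the implementation disclosed in the application: the recognized sentence is assumed to be available already as text, a regular expression stands in for a real word segmenter, single English words stand in for the preset keywords, and the names `PRESET_KEYWORDS`, `segment` and `find_keywords` are hypothetical.

```python
import re
from typing import List

# Hypothetical preset keywords (single-word stand-ins for the examples in the text).
PRESET_KEYWORDS = {"thanks", "love", "flower"}


def segment(sentence: str) -> List[str]:
    """Split a recognized sentence into lowercase words.

    A stand-in for a real word segmenter, which the source does not specify.
    """
    return re.findall(r"[a-z']+", sentence.lower())


def find_keywords(sentence: str) -> List[str]:
    """Return the words of the sentence that match a preset keyword, in spoken order."""
    return [word for word in segment(sentence) if word in PRESET_KEYWORDS]


# Assumed output of the speech recognizer for a short stretch of audio.
recognized = "Thanks for the flower, I love you all"
print(find_keywords(recognized))
```

A production system would replace `segment` with a segmenter suited to the broadcast language, since languages such as Chinese or Korean cannot be split reliably on spaces and punctuation alone.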

In step 203, the interaction command corresponding to the keyword is determined.

In this embodiment, after the keyword has been obtained in step 202 by performing speech recognition on the audio stream of the broadcast video, the interaction command corresponding to the keyword can be determined. For example, if the audio stream contains the voice of the user of the broadcast host client and that voice contains speech signals corresponding to words such as “I love you” and “flower”, recognition of the audio stream yields the keywords “I love you” and “flower”. The interaction command corresponding to the keyword “I love you” can trigger the display of an interaction target, for example a heart-shaped image, on the broadcast interface of the user client. The interaction command corresponding to the keyword “flower” can trigger the display of an interaction target, for example a flower image, on the broadcast interface of the user client.
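
The keyword-to-command determination in step 203 amounts to a table lookup. The sketch below illustrates that idea under stated assumptions: the command identifiers `SHOW_HEART` and `SHOW_FLOWER`, the table `KEYWORD_TO_COMMAND` and the function `determine_commands` are all hypothetical, since the application does not specify how commands are encoded.

```python
from typing import Dict, List

# Hypothetical command identifiers; the source only states that a keyword
# corresponds to an interaction command, not how that command is represented.
KEYWORD_TO_COMMAND: Dict[str, str] = {
    "I love you": "SHOW_HEART",   # triggers a heart-shaped image on the client
    "flower": "SHOW_FLOWER",      # triggers a flower image on the client
}


def determine_commands(keywords: List[str]) -> List[str]:
    """Map recognized keywords to interaction commands, ignoring unknown keywords."""
    return [KEYWORD_TO_COMMAND[k] for k in keywords if k in KEYWORD_TO_COMMAND]


print(determine_commands(["I love you", "flower", "hello"]))
```

Keeping the mapping in a table rather than in branching code means new keywords can be added without touching the lookup logic.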

In step 204, the broadcast video and the interaction command are sent to the user client.

In this embodiment, after the interaction command corresponding to the keyword has been determined in step 203, the interaction command and the broadcast video can be sent to the user client. After receiving them, the user client can display, on its broadcast interface, the broadcast video and the interaction target corresponding to the interaction command.

In some optional implementations of the present application, the interaction target corresponding to the interaction command includes animations, images and emoticons.

In this embodiment, once the user client has received the interaction command and the broadcast video sent after step 203, it can display the animation, image or emoticon corresponding to the interaction command on the broadcast video. The user of the broadcast host client can thus interact with the user of the user client by means of animations, images and emoticons.

In some optional implementations of the present application, the method further comprises: determining the time point at which the speech signal corresponding to the keyword appears in the broadcast video; generating timestamp information containing that time point; and sending the timestamp information to the user client.

In this embodiment, while speech recognition is performed on the audio stream to obtain the keyword, the time point at which the speech signal corresponding to the keyword appears in the broadcast video can also be determined. Timestamp information containing that time point can be generated and sent to the user client. When the user client receives the interaction command and the broadcast video, it can then use the timestamp information to determine when the speech signal corresponding to the keyword appears in the broadcast video, and overlay the interaction target corresponding to the interaction command on the video frame of the broadcast video that corresponds to that time point.
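
One way to picture the timestamp information is as a small record per keyword. The sketch below assumes that the speech recognizer reports a start time, in seconds from the beginning of the broadcast video, for each recognized word; the application does not say how the time point is obtained. The record fields `keyword` and `time_s`, the JSON encoding, and the names `PRESET_KEYWORDS` and `build_timestamp_info` are likewise hypothetical.

```python
import json
from typing import Dict, List, Tuple, Union

# Hypothetical preset keywords (single-word stand-ins).
PRESET_KEYWORDS = {"thanks", "flower"}


def build_timestamp_info(
    timed_words: List[Tuple[str, float]],
) -> List[Dict[str, Union[str, float]]]:
    """Create one timestamp record for each preset keyword in the timed word list.

    Each input item is (word, start time in seconds), an assumed recognizer output.
    """
    return [
        {"keyword": word, "time_s": start}
        for word, start in timed_words
        if word in PRESET_KEYWORDS
    ]


# Assumed recognizer output: each word with the time at which it was spoken.
timed = [("thanks", 12.5), ("for", 12.9), ("the", 13.0), ("flower", 13.25)]

# Serialized form that a server could send to the user client alongside the command.
payload = json.dumps(build_timestamp_info(timed))
print(payload)
```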

Referring to FIG. 3, FIG. 3 shows a flow 300 of another embodiment of the interaction method applied to video broadcasting according to the present application. The interaction method provided in this embodiment may be performed by the user client 103 of FIG. 1; accordingly, the interaction apparatus applied to video broadcasting may be installed in the user client 103. The method includes the following steps.

In step 301, the broadcast video and the interaction command sent from the server are received.

In this embodiment, the broadcast video is recorded and generated in real time by the broadcast host client, and the broadcast video includes a video stream and an audio stream.

In this embodiment, when a video broadcast is watched through the user client, the broadcast video and the interaction command sent from the server can be received. The interaction command may be determined on the basis of a keyword that the server obtains by performing speech recognition on the audio stream of the broadcast video.

For example, the server can receive and decode the broadcast video sent from the broadcast host client to extract the audio stream from it. After extracting the audio stream, the server can perform speech recognition on it to obtain a keyword. Take the case where the audio stream contains the voice of the user of the broadcast host client, who is thanking the user of the user client for a virtual gift: the audio stream then contains a speech signal corresponding to a keyword expressing thanks, such as “Thank you”, and the server obtains that keyword by performing speech recognition on the audio stream. The user client can then receive, from the server, the interaction command corresponding to that keyword.

In step 302, the interaction target corresponding to the interaction command is determined.

In this embodiment, after the broadcast video and the interaction command sent from the server have been received in step 301, the interaction target corresponding to the interaction command can be determined.

For example, when the voice of the user of the broadcast host client in the audio stream of the broadcast video contains keywords such as “Thank you” and “I love you”, each of “Thank you” and “I love you” corresponds to one interaction command, and each interaction command corresponds to one interaction target.

In this embodiment, the interaction target corresponding to the interaction command includes, but is not limited to, animations, images and emoticons.
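
On the user client, step 302 can be pictured as a second lookup, this time from a received interaction command to the interaction target to be shown. The sketch below is an assumption-laden illustration: the command identifiers, the asset names, and the names `InteractionTarget`, `COMMAND_TO_TARGET` and `determine_target` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class InteractionTarget:
    """What the client overlays on the broadcast video (hypothetical structure)."""
    kind: str   # "animation", "image" or "emoticon"
    asset: str  # hypothetical name of an asset bundled with the client


# One interaction target per interaction command, as described in the text.
COMMAND_TO_TARGET: Dict[str, InteractionTarget] = {
    "SHOW_THANKS": InteractionTarget("emoticon", "thanks"),
    "SHOW_HEART": InteractionTarget("image", "heart.png"),
}


def determine_target(command: str) -> Optional[InteractionTarget]:
    """Return the target for a command, or None if the client does not know the command."""
    return COMMAND_TO_TARGET.get(command)


print(determine_target("SHOW_HEART"))
```

Returning `None` for an unknown command lets an older client keep playing the broadcast video even if the server introduces commands it does not recognize.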

In step 303, the broadcast video and the interaction target are displayed on the broadcast interface.

In this embodiment, after the interaction target corresponding to the interaction command has been determined in step 302, the interaction target can be displayed on the broadcast video.

When the voice of the user of the broadcast host client in the audio stream of the broadcast video contains keywords such as “Thank you” and “I love you”, that is, when the user of the broadcast host client says “Thank you” or “I love you” during the video broadcast, the interaction commands corresponding to “Thank you” and “I love you” can be received. The interaction targets corresponding to those interaction commands, for example animations, images and emoticons, can be determined, and the interaction targets corresponding to “Thank you” and “I love you” can be displayed on the broadcast interface. In other words, the animations, images and emoticons corresponding to “Thank you” and “I love you” can be overlaid on the broadcast video.

In some optional implementations of this embodiment, the method further comprises: receiving timestamp information sent from the server, the timestamp information containing the time point at which the speech signal corresponding to the keyword appears in the broadcast video; and displaying the interaction target on the broadcast interface at that time point.

In this embodiment, timestamp information sent from the server is received, the timestamp information containing the time point at which the speech signal corresponding to the keyword appears in the broadcast video. Based on that time point, the interaction target can be overlaid on the video frame of the broadcast video that corresponds to it.
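
Finding the video frame that corresponds to a time point reduces to simple arithmetic if the frame rate is constant. The sketch below makes exactly that assumption, and further assumes that the time point is measured in seconds from the start of the broadcast video; neither detail is stated in the application, and the function name `frame_index_for_time` is hypothetical.

```python
def frame_index_for_time(time_s: float, fps: float) -> int:
    """Return the zero-based index of the frame on screen at time_s.

    Assumes a constant frame rate `fps` and a time measured from the start
    of the video; real streams would normally use presentation timestamps.
    """
    if time_s < 0:
        raise ValueError("time_s must be non-negative")
    if fps <= 0:
        raise ValueError("fps must be positive")
    return int(time_s * fps)


# A keyword spoken 12.5 s into a 30 frames-per-second video.
print(frame_index_for_time(12.5, 30.0))
```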

Referring to FIG. 4, FIG. 4 shows a schematic diagram of one interaction among the broadcast host client, the server and the user client of the present application.

The broadcast host client collects images and sound and records the broadcast video. It captures the images and sound corresponding to the broadcast content in real time and records the broadcast video in real time.

The broadcast host client sends the broadcast video to the server.

The server extracts the audio from the broadcast video, performs speech recognition on the audio stream of the broadcast video to obtain keywords, and determines the interaction command corresponding to each keyword. Each keyword corresponds to one interaction command, and each interaction command corresponds to one interaction target.

The server sends the interaction command and the broadcast video to the user client.

The user client displays the broadcast video and the interaction target. It plays the broadcast video on the broadcast interface and displays, on the broadcast video, the interaction target corresponding to the interaction command.

In this embodiment, while the user of the broadcast host client is carrying out an Internet broadcast, the host's speech is recognized to obtain an interaction command, and the user client displays the interaction target corresponding to that command while playing the broadcast video. The user of the broadcast host client can therefore interact with the user of the user client without interrupting the broadcast content. For example, when the user of the broadcast host client says “Thank you” or “I love you” during the video broadcast, the animations, images and emoticons corresponding to “Thank you” and “I love you” can be displayed on the broadcast interface of the user client.

Referring to FIG. 5, FIG. 5 shows an exemplary system architecture diagram suitable for the interaction method applicable to video broadcasting of the present application.

FIG. 5 shows a broadcast client system and a broadcast server system. The broadcast client system includes an audio/video collection module and an interaction display module. The audio/video collection module may be deployed on the broadcast host client and may be configured to collect audio/video information at the broadcast host client, that is, the images and speech corresponding to the broadcast content, and to send it to the audio/video receiving module of the broadcast server system. The interaction display module may be deployed on the user client and may be configured to receive the interaction command sent from the interaction processing module of the broadcast server system and, according to the interaction command, to display the corresponding interaction target on the user client. The broadcast server system may be deployed on a server and includes an audio/video receiving module, an audio/video processing module, a speech recognition module, a natural language processing module, an interaction command module, and an interaction processing module.
The audio/video receiving module may be configured to receive the audio/video information collected by the broadcast client and to send the received audio/video information to the audio/video processing module. The audio/video processing module may be configured to separate out the audio information from the audio/video information and to send the audio information to the speech recognition module. The speech recognition module may be configured to recognize text information from the audio information. The natural language processing module may be configured to perform word separation on the text information to obtain a keyword list. The interaction processing module may be configured to obtain, from the interaction command module, the interaction commands corresponding to the keywords in the keyword list and to send the obtained interaction commands to the interaction display module.
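The server-side data flow between these modules can be sketched as a chain of small functions. This is only an illustration under stated assumptions: the dictionary layout of the audio/video packet, the function names, and the whitespace-based word separation are hypothetical, and the speech recognizer is passed in as a parameter because the application does not name a specific engine.

```python
from typing import Callable, Dict, List


def extract_audio(av_packet: Dict[str, bytes]) -> bytes:
    """Audio/video processing module: separate the audio from the A/V information."""
    return av_packet["audio"]


def split_words(text: str) -> List[str]:
    """Natural language processing module: word separation.

    A whitespace split stands in for a real word segmenter here.
    """
    return text.lower().split()


def lookup_commands(words: List[str], command_table: Dict[str, str]) -> List[str]:
    """Interaction processing module: fetch the command registered for each keyword."""
    return [command_table[w] for w in words if w in command_table]


def process(
    av_packet: Dict[str, bytes],
    recognize: Callable[[bytes], str],
    command_table: Dict[str, str],
) -> List[str]:
    """Run one A/V packet through the server-side modules in order."""
    audio = extract_audio(av_packet)              # audio/video processing
    text = recognize(audio)                       # speech recognition
    words = split_words(text)                     # natural language processing
    return lookup_commands(words, command_table)  # interaction processing


# Usage with a fake recognizer that treats the audio bytes as UTF-8 text.
commands = process(
    {"video": b"", "audio": b"thanks everyone"},
    lambda audio: audio.decode("utf-8"),
    {"thanks": "CMD_THANKS"},
)
```

Keeping the recognizer as an injected callable lets the remaining stages be exercised without any audio decoding.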

Referring to FIG. 6, FIG. 6 shows a structural schematic diagram of an embodiment of the interaction apparatus applicable to video broadcasting according to the present application. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2.

As shown in FIG. 6, the interaction apparatus 600 applicable to video broadcasting of this embodiment includes a broadcast video receiving unit 601, a recognition unit 602, and a sending unit 603. The broadcast video receiving unit 601 is configured to receive the broadcast video sent from the broadcast host client, where the broadcast video is recorded and generated in real time by the broadcast host client and includes a video stream and an audio stream. The recognition unit 602 is configured to perform speech recognition on the audio stream to obtain a keyword. A determination unit is configured to determine the interaction command corresponding to the keyword. The sending unit 603 is configured to send the broadcast video and the interaction command to the user client, so that the broadcast video and the interaction target corresponding to the interaction command are displayed on the broadcast interface of the user client.

In some optional implementations of this embodiment, the recognition unit 602 includes an audio stream recognition subunit (not shown) configured to perform speech recognition on the audio stream to obtain a phrase corresponding to the audio stream, a word separation subunit (not shown) configured to perform word separation on the phrase to obtain a word set, and a query subunit (not shown) configured to look up, in the word set, keywords that match preset keywords.
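The last two subunits (word separation, then lookup against preset keywords) can be sketched as follows. The preset keyword set and the regular-expression tokenizer are assumptions made for illustration; a deployed system would use a word segmenter suited to the broadcast language.

```python
import re
from typing import Iterable, List, Set

# Hypothetical preset keywords registered on the server.
PRESET_KEYWORDS: Set[str] = {"thanks", "love"}


def separate_words(phrase: str) -> List[str]:
    """Word separation subunit: split a recognized phrase into lowercase words."""
    return re.findall(r"[a-z']+", phrase.lower())


def query_keywords(words: Iterable[str], preset: Set[str] = PRESET_KEYWORDS) -> List[str]:
    """Query subunit: keep words that match a preset keyword, once each, in spoken order."""
    seen: Set[str] = set()
    matched: List[str] = []
    for word in words:
        if word in preset and word not in seen:
            seen.add(word)
            matched.append(word)
    return matched


keywords = query_keywords(separate_words("Thanks, thanks! I love this gift."))
```

Deduplicating matches is a design choice of this sketch, not something the application specifies; without it, a repeated word would trigger the same interaction command several times for one phrase.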

In some optional implementations of this embodiment, the apparatus 600 further includes a time point determination unit (not shown) configured to determine the time point at which the speech signal corresponding to the keyword appears in the broadcast video, a generation unit (not shown) configured to generate timestamp information including the time point, and an information sending unit (not shown) configured to send the timestamp information to the user client.
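One way to picture the timestamp information is as a record pairing an interaction command with the offset at which its keyword is spoken. The sketch below assumes that the speech recognizer reports a start offset in milliseconds for every recognized word; that assumption, the `TimestampInfo` type, and the field names are hypothetical and not taken from the application.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TimestampInfo:
    command: str  # interaction command triggered by the keyword
    time_ms: int  # offset into the broadcast video at which the keyword is spoken


def build_timestamp_info(
    timed_words: List[Tuple[str, int]],
    command_table: Dict[str, str],
) -> List[TimestampInfo]:
    """Generate timestamp information for every recognized word that is a keyword.

    timed_words holds (word, start offset in ms) pairs, assumed to come
    from the speech recognizer.
    """
    return [
        TimestampInfo(command_table[word], start_ms)
        for word, start_ms in timed_words
        if word in command_table
    ]


info = build_timestamp_info(
    [("hello", 0), ("thanks", 1200)],
    {"thanks": "CMD_THANKS"},
)
```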

In some optional implementations of this embodiment, the interaction target includes a moving image, an image, and an emoticon.

Referring to FIG. 7, FIG. 7 shows a structural schematic diagram of another embodiment of the interaction apparatus applicable to video broadcasting according to the present application. This apparatus embodiment corresponds to the method embodiment shown in FIG. 3.

As shown in FIG. 7, the interaction apparatus 700 applicable to video broadcasting of this embodiment includes a receiving unit 701, an interaction target determination unit 702, and a display unit 703. The receiving unit 701 is configured to receive the broadcast video and the interaction command sent from the server, where the broadcast video is recorded and generated in real time by the broadcast host client, the broadcast video includes a video stream and an audio stream, and the interaction command is determined according to a keyword that the server obtains by performing speech recognition on the audio stream. The interaction target determination unit 702 is configured to determine the interaction target corresponding to the interaction command. The display unit 703 is configured to display the broadcast video and the interaction target on the broadcast interface.

In some optional implementations of this embodiment, the apparatus 700 further includes an information receiving unit (not shown) configured to receive timestamp information sent from the server, the timestamp information including the time point at which the speech signal corresponding to the keyword appears in the broadcast video, so that the interaction target is displayed on the broadcast interface at that time point.
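On the user client, displaying the interaction target at the indicated time point amounts to holding received commands until playback reaches their timestamp. The sketch below is one possible arrangement, assuming the player reports its current playback position in milliseconds; the `OverlayScheduler` class and its method names are hypothetical.

```python
import heapq
from typing import List, Tuple


class OverlayScheduler:
    """Holds interaction commands until playback reaches their time point."""

    def __init__(self) -> None:
        # Min-heap of (time_ms, command), ordered by time point.
        self._pending: List[Tuple[int, str]] = []

    def on_timestamp_info(self, time_ms: int, command: str) -> None:
        """Store timestamp information received from the server."""
        heapq.heappush(self._pending, (time_ms, command))

    def due(self, playback_ms: int) -> List[str]:
        """Return, in time order, the commands whose time point has been reached."""
        ready: List[str] = []
        while self._pending and self._pending[0][0] <= playback_ms:
            ready.append(heapq.heappop(self._pending)[1])
        return ready


scheduler = OverlayScheduler()
scheduler.on_timestamp_info(2000, "CMD_LOVE")
scheduler.on_timestamp_info(1200, "CMD_THANKS")
first = scheduler.due(1000)   # nothing is due yet
second = scheduler.due(1500)  # the 1200 ms command is now due
third = scheduler.due(5000)   # the remaining command is due
```

A heap keeps the earliest time point at the front, so timestamp information may arrive out of order without affecting display order.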

FIG. 8 shows a structural schematic diagram of a computer system suitable for implementing the interaction apparatus applicable to video broadcasting of the embodiments of the present application.

As shown in FIG. 8, the computer system 800 includes a central processing unit (CPU) 801, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 also stores the various programs and data required to operate the system 800. The CPU 801, the ROM 802, and the RAM 803 are connected to one another via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

The following components are connected to the I/O interface 805: an input unit 806 including a keyboard, a mouse, and the like; an output unit 807 including, for example, a cathode ray tube (CRT), a liquid crystal display (LCD), and a speaker; a storage unit 808 including a hard drive and the like; and a communication unit 809 including a network interface card such as a LAN card or a modem. The communication unit 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory device, is mounted on the drive 810 as needed, so that a computer program read from it can be installed into the storage unit 808 as needed.

In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product that includes a computer program tangibly embodied in a computer-readable medium, the computer program including computer code for executing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded from a network via the communication unit 809 and installed, and/or installed from the removable medium 811.

The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to the embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, and the module, program segment, or portion of code includes one or more executable instructions for implementing the specified logical function. It should be noted that, in some alternative implementations, the functions indicated in the blocks may occur in an order different from the order indicated in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and any combination of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

In another aspect, the present application further provides a non-volatile computer storage medium. The non-volatile computer storage medium may be the one included in the apparatus of the embodiments described above, or it may exist independently without being installed in a terminal device. The non-volatile computer storage medium stores one or more programs that, when executed by a device, cause the device to perform the steps of: receiving broadcast video sent from a broadcast host client, where the broadcast video is recorded and generated in real time by the broadcast host client and includes a video stream and an audio stream; performing speech recognition on the audio stream to obtain a keyword; determining the interaction command corresponding to the keyword; and sending the broadcast video and the interaction command to a user client, so that the broadcast video and the interaction target corresponding to the interaction command are displayed on the broadcast interface of the user client.

The foregoing describes only the preferred embodiments of the present application and the technical principles employed. Those skilled in the art will appreciate that the scope of the invention to which the present application relates is not limited to technical solutions formed by the specific combination of the above technical features. It also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the gist of the invention, for example, technical solutions formed by replacing the above features with technical features that are disclosed in the present application (but not limited thereto) and have similar functions.

Claims

1. An interaction method applicable to video broadcasting, comprising:
receiving broadcast video sent from a broadcast host client, wherein the broadcast video is recorded and generated in real time by the broadcast host client and includes a video stream and an audio stream;
performing speech recognition on the audio stream to obtain a keyword;
determining an interaction command corresponding to the keyword; and
sending the broadcast video and the interaction command to a user client, so that the broadcast video and the interaction target corresponding to the interaction command are displayed on a broadcast interface of the user client.

2. The method according to claim 1,
wherein performing speech recognition on the audio stream to obtain a keyword comprises:
performing speech recognition on the audio stream to obtain a phrase corresponding to the audio stream;
performing word separation on the phrase to obtain a word set; and
looking up, in the word set, a keyword that matches a preset keyword.

3. The method of claim 2, further comprising:
determining a time point at which the speech signal corresponding to the keyword appears in the broadcast video;
generating timestamp information including the time point; and
sending the timestamp information to the user client.

4. The method of claim 3,
wherein the interaction target includes a moving image, an image, and an emoticon.

5. An interaction method applicable to video broadcasting, comprising:
receiving broadcast video and an interaction command sent from a server, wherein the broadcast video is recorded and generated in real time by a broadcast host client, the broadcast video includes a video stream and an audio stream, and the interaction command is determined according to a keyword obtained by the server performing speech recognition on the audio stream;
determining an interaction target corresponding to the interaction command; and
displaying the broadcast video and the interaction target on a broadcast interface.

6. The method of claim 5,
further comprising receiving timestamp information sent from the server, the timestamp information including the time point at which the speech signal corresponding to the keyword appears in the broadcast video, so that the interaction target is displayed on the broadcast interface at that time point.

7. An interaction apparatus applicable to video broadcasting, comprising:
a broadcast video receiving unit configured to receive broadcast video sent from a broadcast host client, wherein the broadcast video is recorded and generated in real time by the broadcast host client and includes a video stream and an audio stream;
a recognition unit configured to perform speech recognition on the audio stream to obtain a keyword;
a determination unit configured to determine an interaction command corresponding to the keyword; and
a sending unit configured to send the broadcast video and the interaction command to a user client, so that the broadcast video and the interaction target corresponding to the interaction command are displayed on a broadcast interface of the user client.

8. The apparatus of claim 7,
wherein the recognition unit comprises:
an audio stream recognition subunit configured to perform speech recognition on the audio stream to obtain a phrase corresponding to the audio stream;
a word separation subunit configured to perform word separation on the phrase to obtain a word set; and
a query subunit configured to look up, in the word set, a keyword that matches a preset keyword.

9. The apparatus of claim 8, further comprising:
a time point determination unit configured to determine a time point at which the speech signal corresponding to the keyword appears in the broadcast video;
a generation unit configured to generate timestamp information including the time point; and
an information sending unit configured to send the timestamp information to the user client.

10. The apparatus of claim 9,
wherein the interaction target includes a moving image, an image, and an emoticon.

11. An interaction apparatus applicable to video broadcasting, comprising:
a receiving unit configured to receive broadcast video and an interaction command sent from a server, wherein the broadcast video is recorded and generated in real time by a broadcast host client, the broadcast video includes a video stream and an audio stream, and the interaction command is determined according to a keyword obtained by the server performing speech recognition on the audio stream;
an interaction target determination unit configured to determine an interaction target corresponding to the interaction command; and
a display unit configured to display the broadcast video and the interaction target on a broadcast interface.

12. The apparatus of claim 11,
further comprising an information receiving unit configured to receive timestamp information sent from the server, the timestamp information including the time point at which the speech signal corresponding to the keyword appears in the broadcast video, so that the interaction target is displayed on the broadcast interface at that time point.