CN113542797A - Interaction method and device in video playing and computer readable storage medium - Google Patents

Interaction method and device in video playing and computer readable storage medium

Info

Publication number
CN113542797A
CN113542797A (application CN202010986578.4A)
Authority
CN
China
Prior art keywords
video
interactive
interactive content
user
edited
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010986578.4A
Other languages
Chinese (zh)
Inventor
刘思明 (Liu Siming)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yayue Technology Co ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Co., Ltd.
Priority to CN202010986578.4A
Publication of CN113542797A
Legal status: Pending

Classifications

    • H - ELECTRICITY
      • H04 - ELECTRIC COMMUNICATION TECHNIQUE
        • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
            • H04N 21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
              • H04N 21/23 - Processing of content or additional data; Elementary server operations; Server middleware
                • H04N 21/234 - Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
                  • H04N 21/23424 - Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
                • H04N 21/239 - Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
                  • H04N 21/2393 - Interfacing the upstream path of the transmission network involving handling client requests
            • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
              • H04N 21/41 - Structure of client; Structure of client peripherals
                • H04N 21/422 - Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
                  • H04N 21/42203 - Input-only peripherals: sound input device, e.g. microphone
              • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
                • H04N 21/437 - Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
            • H04N 21/60 - Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
              • H04N 21/65 - Transmission of management data between client and server
                • H04N 21/658 - Transmission by the client directed to the server
                  • H04N 21/6583 - Acknowledgement

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an interaction method and device in video playing and a computer-readable storage medium. The interaction method includes: acquiring interactive content added by a first user to at least one video to be edited, together with the corresponding display time; playing a target video and determining whether the target video is a video with interactive content; if so, acquiring the interactive content corresponding to the target video and the display time of the interactive content; when the target video is played to the display time, displaying an interactive display page corresponding to the interactive content; and when a voice interaction message of a second user is acquired through a voice acquisition device, displaying a corresponding interaction feedback page based on the interactive content and the voice interaction message. Processing the voice interaction message involves speech recognition and natural language processing technologies. The scheme realizes interaction between the user and the video during video playing and improves the user experience.

Description

Interaction method and device in video playing and computer readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an interaction method and apparatus in video playing, and a computer-readable storage medium.
Background
At present, online teaching has become an important means of early-childhood education, and video content such as cartoons is an important medium for young children's learning. However, the videos provided by many online education platforms and video websites have fixed content; during viewing, young children are merely passive receivers of the content and there is no interaction, so the teaching effect is poor.
Disclosure of Invention
The purpose of this application is to solve at least one of the above technical defects. The technical solutions provided by the embodiments of this application are as follows:
in a first aspect, an embodiment of the present application provides an interaction method in video playing, including:
acquiring interactive content and corresponding display time added to at least one video to be edited by a first user, and correspondingly storing identification information, the interactive content and the display time of each video to be edited into a preset interactive content database;
playing the target video, and determining whether the target video is a video with interactive content based on a preset interactive content database;
if the target video is determined to be a video with interactive content, acquiring the interactive content corresponding to the target video and the display time of the interactive content;
when the target video is played to the display moment, displaying an interactive display page corresponding to the interactive content;
and when the voice interactive message of the second user is acquired through the voice acquisition equipment, displaying a corresponding interactive feedback page based on the interactive content and the voice interactive message.
In an optional embodiment of the present application, determining whether the target video is a video with interactive content based on a preset interactive content database includes:
acquiring identification information of a target video, and determining whether a preset interactive content database contains the identification information of the target video, wherein the preset interactive content database stores the identification information of at least one video with interactive content, corresponding interactive content and corresponding display time;
if the preset interactive content database contains identification information of the target video, determining that the target video is a video with interactive content, otherwise, determining that the target video is a video without the interactive content;
the method for acquiring the interactive content corresponding to the target video and the display time of the interactive content comprises the following steps:
and acquiring interactive content and display time corresponding to the target video from a preset interactive content database based on the identification information of the target video.
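As an illustrative sketch (not part of the patent text), the lookup described above can be modeled as a mapping from a video's identification information to its interactive content and display time; the dictionary schema, field names, and sample values below are assumptions:

```python
# Hypothetical model of the "preset interactive content database": a mapping
# from a video's identification information to its interactive content and
# the display time of that content. Schema and names are illustrative only.
INTERACTIVE_DB = {
    "video_001": {
        "question": "What should you do before eating?",
        "answer": "wash hands",
        "display_time_s": 83,  # e.g. 1 minute 23 seconds into playback
    },
}

def is_video_with_interactive_content(video_id, db=INTERACTIVE_DB):
    """A target video has interactive content iff its identification
    information appears in the database."""
    return video_id in db

def get_interactive_content(video_id, db=INTERACTIVE_DB):
    """Return the interactive content and display time for the target video,
    or None if the video has no interactive content."""
    entry = db.get(video_id)
    if entry is None:
        return None
    return (entry["question"], entry["answer"]), entry["display_time_s"]
```

A membership test on the identification information thus decides both branches of the claim: found means "video with interactive content", not found means "video without interactive content".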
In an optional embodiment of the present application, acquiring the interactive content and the corresponding display time that are added to at least one video to be edited by a first user, and correspondingly storing identification information, the interactive content, and the display time of each video to be edited into a preset interactive content database includes:
acquiring at least one video to be edited;
when acquiring the addition triggering operation of a first user for the interactive content of any video to be edited, acquiring the display time of the interactive content corresponding to the video to be edited, and displaying an interactive content input page;
and acquiring the interactive content input by the first user through the interactive content input page, and correspondingly storing the identification information, the interactive content, and the display time of the video to be edited into the preset interactive content database.
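The storing step above can be sketched as follows; the dict-backed store and field names are assumptions, not the patent's actual data model:

```python
# Illustrative sketch of "correspondingly storing identification information,
# interactive content and display time into the preset interactive content
# database". The dict schema is hypothetical.
def add_interactive_content(db, video_id, question, answer, display_time_s):
    """Record the interactive content added by the first user, keyed by the
    to-be-edited video's identification information."""
    db[video_id] = {
        "question": question,
        "answer": answer,
        "display_time_s": display_time_s,
    }
    return db
```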
In an optional embodiment of the present application, any video to be edited is obtained by:
displaying a to-be-edited video retrieval box when a to-be-edited video retrieval triggering operation of the first user is acquired;
when a keyword input by a first user is acquired through a video retrieval box to be edited, acquiring at least one video to be selected of which a video tag is matched with the keyword from a preset video library, and displaying the at least one video to be selected through a video list page to be selected, wherein the video tag of each video in the preset video library is acquired through a preset video tagging network model;
after the selection trigger operation of the first user for any video to be selected is acquired through the video list page to be selected, the video to be selected is determined as the video to be edited.
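The retrieval step above can be sketched as a tag match; a plain case-insensitive substring match stands in here for whatever matching the platform actually performs, and the tag lists would come from the preset video tagging network model:

```python
# Minimal sketch (assumed data shapes) of retrieving candidate videos whose
# tags match the first user's keyword. Each library entry is assumed to carry
# a "tags" list produced by the preset video tagging network model.
def search_candidate_videos(video_library, keyword):
    """Return videos from the preset library whose tags match the keyword."""
    kw = keyword.lower()
    return [v for v in video_library
            if any(kw in tag.lower() for tag in v["tags"])]
```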
In an optional embodiment of the present application, the interactive content includes an interactive question and an interactive answer, and the obtaining of the interactive content input by the first user through the interactive content input page includes:
the interactive question input by the first user is obtained through the interactive question input box of the interactive content input page, and the interactive answer input by the first user is obtained through the interactive answer input box of the interactive content input page.
In an optional embodiment of the present application, displaying an interactive display page corresponding to an interactive content includes:
and generating a corresponding interactive display page based on the interactive question and the interactive answer, and displaying the interactive display page.
In an optional embodiment of the present application, displaying a corresponding interactive feedback page based on the interactive content and the voice interactive message includes:
carrying out voice recognition on the voice interaction message to obtain a corresponding text;
and performing similarity matching on the text and the text corresponding to the interactive answer, if the similarity is not less than a preset threshold value, displaying a first interactive feedback page, and otherwise, displaying a second interactive feedback page.
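The matching step above can be sketched as follows; difflib's ratio and the 0.8 cutoff are assumptions standing in for the unspecified similarity measure and the "preset threshold":

```python
from difflib import SequenceMatcher

# Sketch of selecting a feedback page: compare the speech-recognized text with
# the stored interactive answer. A similarity at or above the (assumed) preset
# threshold selects the first (success) feedback page, otherwise the second.
def choose_feedback_page(recognized_text, answer_text, threshold=0.8):
    similarity = SequenceMatcher(None, recognized_text.lower(),
                                 answer_text.lower()).ratio()
    return "first" if similarity >= threshold else "second"
```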
In a second aspect, an embodiment of the present application provides an interactive device in video playing, including:
the video editing module is used for acquiring interactive contents and corresponding display moments added by a first user to at least one video to be edited and correspondingly storing identification information, the interactive contents and the display moments of the videos to be edited into a preset interactive content database;
the video determining module with interactive content is used for playing the target video and determining whether the target video is the video with the interactive content;
the interactive content acquisition module is used for acquiring the interactive content corresponding to the target video and the display time of the interactive content if the target video is determined to be the video with the interactive content;
the interactive display page display module is used for displaying an interactive display page corresponding to the interactive content when the target video is played to the display time;
and the interactive feedback page display module is used for displaying a corresponding interactive feedback page based on the interactive content and the voice interactive message when the voice interactive message of the second user is acquired through the voice acquisition equipment.
In an optional embodiment of the present application, the video determining module with interactive content is specifically configured to:
acquiring identification information of a target video, and determining whether a preset interactive content database contains the identification information of the target video, wherein the preset interactive content database stores the identification information of at least one video with interactive content, corresponding interactive content and corresponding display time;
if the preset interactive content database contains identification information of the target video, determining that the target video is a video with interactive content, otherwise, determining that the target video is a video without the interactive content;
the video determination module with interactive content is further configured to:
and acquiring interactive content and display time corresponding to the target video from a preset interactive content database based on the identification information of the target video.
In an optional embodiment of the present application, the video editing module is specifically configured to:
acquiring at least one video to be edited;
when acquiring the addition triggering operation of a first user for the interactive content of any video to be edited, acquiring the display time of the interactive content corresponding to the video to be edited, and displaying an interactive content input page;
and acquiring the interactive content input by the first user through the interactive content input page, and correspondingly storing the identification information, the interactive content and the display time of any video to be edited into a preset interactive content database.
In an optional embodiment of the present application, the apparatus further includes a to-be-edited video obtaining module, configured to:
displaying a video retrieval frame to be edited when a video retrieval triggering operation to be edited of a first user is acquired;
when a keyword input by a first user is acquired through a video retrieval box to be edited, acquiring at least one video to be selected of which a video tag is matched with the keyword from a preset video library, and displaying the at least one video to be selected through a video list page to be selected, wherein the video tag of each video in the preset video library is acquired through a preset video tagging network model;
after the selection trigger operation of the first user for any video to be selected is acquired through the video list page to be selected, the video to be selected is determined as the video to be edited.
In an optional embodiment of the present application, the interactive content includes interactive questions and interactive answers, and the video editing module is further configured to:
the interactive question input by the first user is obtained through the interactive question input box of the interactive content input page, and the interactive answer input by the first user is obtained through the interactive answer input box of the interactive content input page.
In an optional embodiment of the present application, the interactive display page display module is specifically configured to:
and generating a corresponding interactive display page based on the interactive question and the interactive answer, and displaying the interactive display page.
In an optional embodiment of the present application, the interactive feedback page display module is specifically configured to:
carrying out voice recognition on the voice interaction message to obtain a corresponding text;
and performing similarity matching on the text and the text corresponding to the interactive answer, if the similarity is not less than a preset threshold value, displaying a first interactive feedback page, and otherwise, displaying a second interactive feedback page.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory and a processor;
the memory has a computer program stored therein;
a processor configured to execute a computer program to implement the method provided in the embodiment of the first aspect or any optional embodiment of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the method provided in the embodiment of the first aspect or any optional embodiment of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device implements the method provided in the embodiment of the first aspect or any optional embodiment of the first aspect.
The beneficial effects brought by the technical solutions provided by this application are as follows:
the corresponding interactive display page is displayed when the video with the interactive content is played to the display time of the interactive content, the voice interactive message input by the user is received in the process of displaying the interactive display page, and then the corresponding interactive feedback page is displayed based on the voice interactive message, so that the user knows the interactive content and inputs the voice interactive message in the process of watching the video, the interaction between the user and the video in the video playing process is realized, and the user experience is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1a is a schematic flowchart illustrating an interaction method in video playing according to an embodiment of the present disclosure;
fig. 1b is a schematic diagram of an interactive scene in video playing according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a page through which the first user retrieves a video to be edited via the to-be-edited video retrieval box in an example of the embodiment of the present application;
FIG. 3 is a schematic diagram of a training process of the preset video annotation network model in an embodiment of the present application;
fig. 4 is a schematic diagram illustrating a process of labeling a video by using the preset video labeling network model in the embodiment of the present application;
FIG. 5 is a flowchart illustrating a video production process with interactive content according to an embodiment of the present application;
FIG. 6 is a diagram of an interactive input page in an example of an embodiment of the present application;
FIG. 7 is a diagram of an interactive content presentation page in an example of an embodiment of the present application;
fig. 8a is a schematic flowchart illustrating a process of interacting with a user during playing a video with interactive content according to an embodiment of the present application;
FIG. 8b is a diagram illustrating a first interactive feedback page in an example of an embodiment of the present application;
FIG. 8c is a diagram of a second interactive feedback page in an example of an embodiment of the present application;
fig. 9 is a block diagram illustrating an interactive apparatus for playing video according to an embodiment of the present disclosure;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or to elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present application, and are not to be construed as limiting it.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Fig. 1a is a schematic flowchart of an interaction method in video playing according to an embodiment of the present disclosure, where an execution subject of the method may be a terminal device, such as a mobile phone, a tablet computer, and a smart television, and the method may include:
step S101, acquiring interactive content and corresponding display time added to at least one video to be edited by a first user, and correspondingly storing identification information, the interactive content and the display time of each video to be edited into a preset interactive content database.
Specifically, the first user may be a producer of a video with interactive content, add the interactive content to the video to be edited, and determine a display time of the interactive content for interaction during subsequent playing.
And step S102, playing the target video and determining whether the target video is a video with interactive content.
The video with the interactive content can be understood as a video with corresponding interactive content, and the video with the interactive content can interact with a user watching the video through a related page in a playing process.
Specifically, when a user watches a target video by using a terminal device, the terminal device first needs to determine whether the target video is a video with interactive content, and executes a corresponding interactive operation in a subsequent playing process according to a determination result.
Step S103, if the target video is determined to be the video with the interactive content, the interactive content corresponding to the target video and the display time of the interactive content are obtained.
Specifically, if the terminal device determines that the target video is a video with interactive content, the target video has corresponding interactive content, and the related information of that content needs to be displayed at a specific moment while the target video is played. This specific moment is the display time of the interactive content, which may be a certain time point on the progress bar of the target video; for example, the display time may be the point 1 minute 23 seconds into playback. The interactive content is generally related to the content of the target video; it may be an interactive question and answer about that content and is typically presented in text or audio-video form. For example, if a video with interactive content shows a scene of eating, the corresponding interactive content may be an interactive question and answer about washing hands before eating.
And step S104, when the target video is played to the display moment, displaying an interactive display page corresponding to the interactive content.
Specifically, since the interactive content of the target video and the display time of the interactive content are obtained in the previous step, when the target video is played to the corresponding display time, an interactive display page is displayed, wherein the interactive display page displays the relevant information of the interactive content. For example, the interactive content is an interactive question and answer, and the interactive question and answer is displayed on an interactive display page in a text mode or displayed on the interactive display page in an audio-video mode.
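The trigger condition of step S104 can be sketched as a simple check against the playback position; the function and parameter names are illustrative:

```python
# Sketch of step S104's trigger: show the interactive display page exactly
# once, when playback reaches (or first passes) the stored display time.
def should_show_interaction(position_s, display_time_s, already_shown):
    """True when the interactive display page should be shown now."""
    return not already_shown and position_s >= display_time_s
```

Using `>=` rather than `==` matters in practice, since a player's position callback rarely lands on the exact timestamp.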
And step S105, when the voice interactive message of the second user is acquired through the voice acquisition equipment, displaying a corresponding interactive feedback page based on the interactive content and the voice interactive message.
The second user is a user watching the target video and is also an interactive participant in the playing process of the target video.
Specifically, after the interactive display page is displayed in the previous step, the second user learns the related information of the interactive content through that page and then sends an interactive message based on their understanding of it. The terminal device acquires the interactive message, generates a corresponding interactive feedback page by combining the voice interactive message with the interactive content, and displays the interactive feedback page to the second user. The terminal device may receive an interactive message input by the second user through the interactive display page, for example, text typed on a keyboard or a touch selection made on a touch screen. The terminal device may also receive the voice interactive message input by the user through a voice acquisition device (e.g., a microphone).
It can be understood that, for a video with interactive content, when a second user watches the display time of the corresponding interactive content, the terminal device displays the related information of the interactive content through the interactive display page, the second user inputs a corresponding interactive message to the terminal device after knowing the related information of the interactive content, and the terminal device displays a corresponding interactive feedback page according to the interactive content and the interactive message input by the second user, that is, the interaction with the user in the playing process of the video with interactive content is realized.
According to the scheme, a corresponding interactive display page is displayed when a video with interactive content is played to the display time of the interactive content; the voice interactive message input by the user is received while the interactive display page is displayed, and a corresponding interactive feedback page is then displayed based on the voice interactive message. The user thus learns the interactive content and inputs the voice interactive message while watching the video, which realizes interaction between the user and the video during video playing and improves the user experience.
Fig. 1b shows a specific application scenario of the embodiment of the present application, in which a user watches a video with interactive content through a terminal device 101. A corresponding client, for example a video client, a browser client, or an education client, may be installed on the terminal device. Specifically, the terminal device 101 may be a smart television connected to a background server 102 through a network. When the user opens a pre-installed client on the smart television and selects a target video to play through the client, the smart television downloads the corresponding video from the background server and plays it; when it determines that the video is a video with interactive content, it acquires the corresponding interactive content and display time from the background server, and displays the interactive display page to the user when the video is played to the corresponding display time. When the smart television collects the interactive voice information sent by the user through its microphone, it sends the interactive voice information to the background server; the background server performs voice recognition on the interactive voice information and feeds the recognition result back to the smart television, and the smart television then displays a corresponding interactive feedback page to the user based on the recognition result and the interactive content, thereby completing the interaction between the smart television and the user.
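The division of labor in this scenario can be sketched as follows; the fake recognition table is a stand-in for a real speech recognition service on the background server, and all names here are hypothetical:

```python
# Hypothetical sketch of the smart-TV / background-server exchange: the server
# performs speech recognition on the collected audio, and the TV selects a
# feedback page from the recognition result and the stored interactive answer.
def backend_recognize(audio_bytes):
    """Background-server side: return recognized text for the audio (a real
    system would invoke an ASR engine here; this table is a placeholder)."""
    FAKE_ASR = {b"audio-wash-hands": "wash hands"}
    return FAKE_ASR.get(audio_bytes, "")

def tv_handle_voice(audio_bytes, expected_answer):
    """Smart-TV side: forward the audio, then pick the feedback page based on
    the recognition result and the stored interactive answer."""
    text = backend_recognize(audio_bytes)
    return ("first_feedback_page" if text == expected_answer
            else "second_feedback_page")
```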
In an optional embodiment of the present application, determining whether the target video is a video with interactive content based on a preset interactive content database includes:
acquiring identification information of a target video, and determining whether a preset interactive content database contains the identification information of the target video, wherein the preset interactive content database stores the identification information of at least one video with interactive content, corresponding interactive content and corresponding display time;
and if the preset interactive content database contains the identification information of the target video, determining that the target video is the video with the interactive content, otherwise, determining that the target video is the video without the interactive content.
The preset interactive content database is constructed in advance, with the interactive content and display time corresponding to each video with interactive content stored beforehand. It should be noted that which videos, interactive contents and display times are stored in the preset interactive content database can be set according to actual requirements. The preset interactive content database can be stored in the terminal device or in the corresponding background server, and stores the interactive content and display time corresponding to a plurality of videos with interactive content. Specifically, the identification information, interactive content and display time of each video with interactive content are stored in correspondence, so that once the identification information is obtained, the corresponding interactive content and display time can be determined. The specific construction process of the preset database will be described in detail later and is not repeated here.
Specifically, to determine whether the target video is a video with interactive content, it is only necessary to determine whether the target video has corresponding interactive content, in other words, whether the interactive content database contains the identification information of the target video. If so, the target video has corresponding interactive content and is determined to be a video with interactive content; if not, the target video has no corresponding interactive content and is determined to be a video without interactive content.
Correspondingly, acquiring the interactive content corresponding to the target video and the display time of the interactive content includes:
and acquiring interactive content and display time corresponding to the target video from a preset interactive content database based on the identification information of the target video.
Specifically, for a target video that is a video with interactive content, matching is performed in the preset interactive content database according to the identification information (namely, the video ID) of the target video, and the interactive content and display time corresponding to the matched video ID are obtained, namely, the interactive content and display time corresponding to the target video.
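The lookup described above can be sketched as follows; this is a minimal illustration in which an in-memory dictionary stands in for the preset interactive content database, and all IDs, contents and times are hypothetical examples, not from the specification:

```python
# Illustrative stand-in for the preset interactive content database:
# video ID -> (interactive content, display time in seconds).
INTERACTIVE_CONTENT_DB = {
    "video_001": ("What should you do before eating?", 95.0),
    "video_002": ("Name the animal on screen.", 42.5),
}

def has_interactive_content(video_id):
    """A video 'has interactive content' iff its ID is in the database."""
    return video_id in INTERACTIVE_CONTENT_DB

def get_interactive_content(video_id):
    """Return (content, display time) for a matched video ID, else None."""
    return INTERACTIVE_CONTENT_DB.get(video_id)
```

Matching on the video ID alone is what makes the determination step cheap: the player never inspects the video stream itself, only its identifier.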
In an optional embodiment of the present application, acquiring the interactive content and the corresponding display time that are added to at least one video to be edited by a first user, and correspondingly storing the identification information, the interactive content, and the display time of each video to be edited into a preset interactive content database includes:
acquiring at least one video to be edited;
when acquiring the addition triggering operation of a first user for the interactive content of any video to be edited, acquiring the display time of the interactive content corresponding to the video to be edited, and displaying an interactive content input page;
and acquiring the interactive content input by the first user through the interactive content input page, and correspondingly storing the identification information, the interactive content and the display time of any video to be edited into a preset interactive content database.
Specifically, the preset interactive content database stores the interactive contents and corresponding display times of a plurality of videos with interactive content. In the process of constructing the database, the videos to be edited need to be acquired first, and can be selected by the first user according to requirements; then the interactive content and display time corresponding to each video to be edited need to be obtained, which can likewise be set by the first user according to requirements; finally, the terminal device stores the identification information, interactive content and display time of each video to be edited into the preset interactive content database in correspondence, one by one, whereby the preset interactive content database is constructed. After a video to be edited and its related information are edited, the video to be edited becomes a corresponding video with interactive content.
It is to be understood that the second user may be understood as an interactor of the video with the interactive content and the first user may be a producer of the video with the interactive content.
Specifically, for any video to be edited, when the first user issues an interactive content addition triggering operation for the video, that is, an instruction for editing the video to be edited, the terminal device displays an interactive content input page to the first user and acquires the display time of the interactive content. The terminal device can receive both the interactive content and the display time input by the first user through the corresponding input boxes of the interactive content input page. In addition, the terminal device may obtain the display time in other ways, for example: the terminal device plays the video to be edited, and after displaying the interactive content input page to the first user, obtains the time point at which the first user first clicks the playing progress bar of the video to be edited, and determines that time point as the corresponding display time.
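The storage step triggered by the first user's addition operation can be sketched as below; a plain dictionary again stands in for the preset interactive content database, and the function and field names are illustrative assumptions:

```python
# In-memory stand-in for the preset interactive content database.
interactive_content_db = {}

def add_interactive_content(video_id, display_time, question, answer):
    """Store identification information, interactive content and display
    time for one video to be edited, keyed by the video's ID."""
    interactive_content_db[video_id] = {
        "display_time": display_time,  # e.g. seconds into playback
        "question": question,
        "answer": answer,
    }

# Example: the first user adds a question at 120 s of an edited video.
add_interactive_content("edit_01", 120.0,
                        "What do we do before eating?", "Wash first")
```

Keying the record by the video ID is what later lets the player recover the content and display time from the identification information alone.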
In an optional embodiment of the present application, any video to be edited is obtained by:
displaying a video retrieval frame to be edited when a video retrieval triggering operation to be edited of a first user is acquired;
when a keyword input by a first user is acquired through a video retrieval box to be edited, acquiring at least one video to be selected of which a video tag is matched with the keyword from a preset video library, and displaying the at least one video to be selected through a video list page to be selected, wherein the video tag of each video in the preset video library is acquired through a preset video tagging network model;
after the selection trigger operation of the first user for any video to be selected is acquired through the video list page to be selected, the video to be selected is determined as the video to be edited.
Which videos are determined to be videos to be edited may be decided by the first user, i.e., the producer of the video with interactive content.
Specifically, since the interactive content is generally related to the content of the video to be edited, when deciding which videos to select as videos to be edited, the first user can select videos related to the required interactive content. To this end, a plurality of relevant videos may first be obtained, each video labeled with corresponding video tags according to its content (each video may have a plurality of video tags), and the tagged videos stored in a preset video library for the first user to select from. The first user then sets a keyword according to the required interactive content, retrieves the corresponding videos from the preset video library using the keyword, and selects the video to be edited from the retrieved videos.
Specifically, when the terminal device acquires a video-to-be-edited retrieval triggering operation of the first user, that is, when the first user issues a retrieval instruction for a video to be edited, a corresponding video retrieval box is displayed to the first user. After the keyword input by the first user is acquired, a corresponding retrieval tag is extracted based on the keyword, the retrieval tag is compared with the video tags of the videos in the preset video library, and the matched videos are taken as videos to be selected; generally, matching means that the retrieval tag is identical to a video tag. The videos to be selected are displayed to the first user through the video list page to be selected so that the first user can select the required video to be edited; the first user may select one or more of the videos to be selected as videos to be edited.
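The keyword-to-tag matching just described can be sketched as follows. This is a toy illustration: the library contents and the trivial keyword-to-tag mapping are assumptions, and "matched" is taken literally as tag equality, as the text notes:

```python
# Illustrative preset video library; each video carries labelled tags.
PRESET_VIDEO_LIBRARY = [
    {"id": "A", "title": "Eating video A", "tags": ["eat", "children"]},
    {"id": "B", "title": "Eating video B", "tags": ["eat", "manners"]},
    {"id": "C", "title": "Sports video C", "tags": ["sport"]},
]

def search_candidates(keyword):
    """Extract a retrieval tag from the keyword and return every video
    whose tag list contains that exact tag (the videos to be selected)."""
    retrieval_tag = keyword.strip().lower()  # trivial keyword -> tag step
    return [v for v in PRESET_VIDEO_LIBRARY if retrieval_tag in v["tags"]]
```

A real system would extract the retrieval tag with more than `lower()`, but the equality comparison against video tags is the step the embodiment specifies.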
For example, as shown in fig. 2, after the first user inputs the keyword "eat" in the video retrieval box 201, a plurality of videos to be selected 203 (eating video A, eating video B, eating video C) are displayed through the video list page 202 for the first user to select from.
It should be noted that the video tags of the videos in the preset video library can be obtained through labeling by the preset video labeling network model. As shown in fig. 3, the preset video labeling network model is trained on manually labeled sample videos and their labels, where the input of the model can be the relevant text introduction of a video. Then, as shown in fig. 4, the relevant text introduction of each video to be labeled in the preset video library is input into the preset video labeling network model to obtain the corresponding video tags. The preset video labeling network model may adopt a BERT (Bidirectional Encoder Representations from Transformers) model.
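The labeling pipeline can be illustrated with the toy stand-in below. To keep the sketch self-contained, a keyword heuristic plays the role of the fine-tuned BERT classifier; in the real system the function body would be a model inference call, and the vocabulary here is entirely hypothetical:

```python
def label_video(text_introduction):
    """Map a video's text introduction to a sorted list of video tags.
    Stand-in for the preset video labeling network model: the real model
    would be a BERT classifier over the same text input."""
    vocabulary = {"eat": "eat", "meal": "eat", "run": "sport", "ball": "sport"}
    tags = {tag for word, tag in vocabulary.items()
            if word in text_introduction.lower()}
    return sorted(tags) or ["other"]  # fall back to a default tag
```

The point of the sketch is the interface, text introduction in, tag list out, which is what lets the offline labeling step populate the preset video library before any retrieval happens.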
The process of labeling the videos in the preset video library uses a Machine Learning (ML) method. ML is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specially studies how a computer simulates or realizes human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence, the fundamental way to make computers intelligent, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
The production process of the video with the interactive content is further explained by an example, as shown in fig. 5, the production process may include the following steps:
(1) inputting keywords in the content of a video retrieval box to be edited;
(2) the terminal equipment converts the keyword into a corresponding retrieval tag;
(3) searching in a preset video library based on the search tag, and determining whether videos with the same video tag exist in the preset video library; if yes, carrying out the subsequent steps, and if not, directly ending the video production with the interactive content;
(4) acquiring videos (namely videos to be selected) matched with all the video tags, and displaying the videos to be selected through a video list page to be selected;
(5) selecting one video to be selected as a video to be edited, and then determining the display time of the interactive content corresponding to the video to be edited;
(6) inputting corresponding interactive contents through an interactive content input page;
(7) the terminal equipment correspondingly stores the identification information, the interactive content and the display time of the video to be edited into a preset interactive content database to finish the production of the video with the interactive content, in other words, the acquisition and storage of the interactive content and the display time of the video with the interactive content are finished;
(8) the video production with the interactive content is finished.
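The eight production steps above can be condensed into one sketch. Everything here is a hypothetical stand-in for the terminal-device interactions the text describes (the library, database and parameters are illustrative), but the control flow follows the steps one to eight in order:

```python
def produce_interactive_video(keyword, library, db,
                              pick_index=0, display_time=0.0, content=""):
    """Sketch of steps (1)-(8): keyword -> retrieval tag -> match against
    the preset video library -> pick a video to be edited -> store its
    ID, interactive content and display time into the database."""
    retrieval_tag = keyword.strip().lower()                 # steps (1)-(2)
    candidates = [v for v in library                        # step (3)
                  if retrieval_tag in v["tags"]]
    if not candidates:                                      # no match: end
        return None
    chosen = candidates[pick_index]                         # steps (4)-(5)
    db[chosen["id"]] = {"display_time": display_time,       # steps (6)-(7)
                        "content": content}
    return chosen["id"]                                     # step (8)
```

In the real flow steps (5) and (6) are interactive (the first user clicks a list entry and types into the input page); here they collapse into the `pick_index`, `display_time` and `content` parameters.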
In an optional embodiment of the present application, the interactive content includes an interactive question and an interactive answer, and the obtaining of the interactive content input by the user through the interactive content input page includes:
the interactive question input by the first user is obtained through the interactive question input box of the interactive content input page, and the interactive answer input by the first user is obtained through the interactive answer input box of the interactive content input page.
When the interactive content is an interactive question and answer, the interactive content input page can be provided with an interactive question input box and an interactive answer input box.
Specifically, the interactive question input by the first user can be received through the interactive question input box, and the interactive answer input by the first user can be received through the interactive answer input box, so that the interactive question and the interactive answer can be displayed separately on the interactive display page. For example, as shown in fig. 6, the interactive content input interface is provided with an interactive question input box 601 and an interactive answer input box 602, where the interactive answer input box 602 is divided into a correct answer input box 6021 and incorrect answer input boxes 6022. The first user inputs "Child, what should you do before eating?" in the interactive question input box, "A. Wash first" in the correct answer input box, and "B. Take a bath first", "C. XXXX" and "D. XXXX" in the three incorrect answer input boxes, respectively.
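The question/answer form of fig. 6 maps naturally onto a small record type; the field names below are illustrative, not part of the specification:

```python
from dataclasses import dataclass, field

@dataclass
class InteractiveQA:
    """One interactive content entry: a question, its correct answer,
    and the incorrect answers entered in the remaining input boxes."""
    question: str
    correct_answer: str
    wrong_answers: list = field(default_factory=list)

# The fig. 6 example, expressed as such a record.
qa = InteractiveQA(
    question="Child, what should you do before eating?",
    correct_answer="Wash first",
    wrong_answers=["Take a bath first", "XXXX", "XXXX"],
)
```

Keeping the correct answer separate from the distractors is what later allows the feedback step to compare the user's choice against a single stored answer.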
It should be noted that, in another optional embodiment of the present application, besides inputting text as the interactive content as in the above scheme, the first user may also record audio and video content as the interactive content. Specifically, after the terminal device displays the interactive content input page, the first user starts an audio/video capture device of the terminal device (such as a microphone and a camera) to record the related interactive content. Correspondingly, when the video with interactive content is played to the corresponding display time, the related information of the interactive content is presented through the interactive display page, that is, the audio/video content recorded by the first user is played.
In an optional embodiment of the present application, displaying an interactive display page corresponding to an interactive content includes:
and generating a corresponding interactive display page based on the interactive question and the interactive answer, and displaying the interactive display page.
Specifically, when the interactive content input by the first user includes both the interactive question and the interactive answer, the interactive content display page may display only the interactive question, or display the interactive question and the interactive answer at the same time. For example, as shown in fig. 7, the interactive content display page simultaneously displays the interactive question 701 and the interactive answers 702; it can be seen that the interactive answers include a plurality of options from which the second user selects the correct one, which increases the complexity of the interaction and improves the interactive experience.
In an optional embodiment of the present application, displaying a corresponding interactive feedback page based on the interactive content and the voice interactive message includes:
carrying out voice recognition on the voice interaction message to obtain a corresponding text;
and performing similarity matching on the text and the text corresponding to the interactive answer, if the similarity is not less than a preset threshold value, displaying a first interactive feedback page, and otherwise, displaying a second interactive feedback page.
In an optional embodiment of the present application, at least two option buttons are displayed on the interactive display page, accordingly, the interactive message is a trigger operation for any option button, and displaying an interactive feedback page corresponding to the interactive message includes:
acquiring answers corresponding to the option buttons for the trigger operation;
and if the answer is matched with the interactive answer, displaying a first interactive feedback page, otherwise, displaying a second interactive feedback page.
Specifically, after the interactive display page is displayed to the second user, the second user may input a corresponding interactive message through voice or touch operation.
If the corresponding interactive message is input through voice, namely the interactive message is a voice message, the terminal device acquires the voice message through a voice acquisition device (such as a microphone), performs voice recognition on it, and converts it into a corresponding text. The text is then matched for similarity against the text corresponding to the interactive answer: if the similarity is not smaller than a preset threshold, the first interactive feedback page is displayed; otherwise, the second interactive feedback page is displayed. In other words, different answers input by different users can obtain different feedback.
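The answer-checking step can be sketched as below, assuming the voice message has already been recognized into text. The standard-library `difflib` ratio stands in for whatever similarity measure the real system uses, and the 0.8 threshold is an illustrative choice:

```python
from difflib import SequenceMatcher

def check_answer(recognised_text, correct_answer, threshold=0.8):
    """Return True (show the first interactive feedback page) when the
    similarity between the recognised text and the stored interactive
    answer is not smaller than the preset threshold; False otherwise
    (show the second interactive feedback page)."""
    similarity = SequenceMatcher(None, recognised_text.lower(),
                                 correct_answer.lower()).ratio()
    return similarity >= threshold
```

Thresholded similarity rather than exact string equality is what makes the scheme tolerant of small recognition errors in the child's spoken answer.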
The above processing of the interactive voice message uses speech recognition technology and natural language processing technology. Specifically, the key technologies of Speech Technology include Automatic Speech Recognition (ASR), Text To Speech (TTS) and voiceprint recognition. Enabling computers to listen, see, speak and feel is the development direction of future human-computer interaction, in which voice is one of the most promising interaction modes. Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence; it studies various theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics, so research in this field involves natural language, i.e., the language people use every day, and is closely related to the study of linguistics.
If at least two option buttons are displayed on the interactive display page, the second user can input the interactive message through a touch operation. Specifically, each option button can correspond to one answer, and the second user selects different answers by triggering different buttons, that is, inputs different interactive messages. After the triggering operation of the user on any option button is acquired, the terminal device obtains the answer corresponding to that option button and compares it with the interactive answer; if the answer matches the interactive answer, the first interactive feedback page is displayed, otherwise the second interactive feedback page is displayed, namely, different answers input by different users can obtain different feedback.
The process of the second user engaging in the interaction in the video playing process with the interactive content is further described below by an example, as shown in fig. 8a, the interaction process may include the following steps:
(1) the terminal equipment plays the target video to a second user for watching;
(2) the terminal equipment determines whether the target video is a video with interactive content, if so, the subsequent steps are continuously executed, and if not, the interactive scheme is not executed;
(3) when the target video is played to the display time of the corresponding interactive content, displaying the interactive display page, and presenting the corresponding interactive question and answers to the second user;
(4) the second user sends out answer voice (namely voice interaction message), and the terminal equipment collects the voice interaction message;
(5) the terminal equipment performs voice recognition on the answer voice to obtain a corresponding text;
(6) matching the text corresponding to the answer voice against the text of the interactive answer for similarity; if the similarity is not smaller than a preset threshold, the answer is judged to be correct, and if the similarity is smaller than the preset threshold, the answer is judged to be wrong;
(7) if the answer is correct, displaying the first interactive feedback page, as shown in fig. 8b, showing "Correct answer, well done!";
(8) if the answer is wrong, displaying the second interactive feedback page, as shown in fig. 8c, showing "Wrong answer, the answer should be XXX";
(9) and finishing the interaction.
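The nine interaction steps above can be condensed into one sketch. The `recognise` argument is a hypothetical stand-in for the terminal's speech-recognition call, the database layout is illustrative, and the feedback strings follow the fig. 8b/8c examples:

```python
from difflib import SequenceMatcher

def run_interaction(video_id, db, recognise, threshold=0.8):
    """Sketch of steps (1)-(9): check whether the playing video has
    interactive content, capture and recognise the answer voice, match
    it against the stored answer, and pick the feedback page text."""
    entry = db.get(video_id)           # steps (1)-(2): with content?
    if entry is None:
        return "no interaction"        # interactive scheme not executed
    text = recognise()                 # steps (3)-(5): page shown, voice
                                       # collected and recognised to text
    ratio = SequenceMatcher(None, text.lower(),
                            entry["answer"].lower()).ratio()  # step (6)
    if ratio >= threshold:             # steps (7)-(9): feedback pages
        return "Correct answer, well done!"
    return "Wrong answer, the answer should be " + entry["answer"]
```

In the real system steps (3)-(5) span the interactive display page and the microphone; here they collapse into the single `recognise` callback so the branching logic stays visible.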
Fig. 9 is a block diagram illustrating a structure of an interactive apparatus for playing video according to an embodiment of the present application, where the apparatus 900 may include: the video editing module 901, the video with interactive content determining module 902, the interactive content obtaining module 903, the interactive display page display module 904, and the interactive feedback page display module 905, wherein:
the video editing module 901 is configured to acquire interactive content and corresponding display time added to at least one to-be-edited video by a first user, and correspondingly store identification information, the interactive content, and the display time of each to-be-edited video in a preset interactive content database;
the video with interactive content determining module 902 is configured to play the target video and determine whether the target video is a video with interactive content;
the interactive content acquiring module 903 is configured to acquire interactive content corresponding to a target video and a display time of the interactive content if it is determined that the target video is a video with the interactive content;
the interactive display page display module 904 is configured to display an interactive display page corresponding to the interactive content when the target video is played to a display time;
the interactive feedback page display module 905 is configured to display a corresponding interactive feedback page based on the interactive content and the voice interaction message when the voice interaction message of the second user is acquired through the voice acquisition device.
According to the scheme, the corresponding interactive display page is displayed when the video with the interactive content is played to the display time of the interactive content, the voice interactive message input by the user is received in the process of displaying the interactive display page, and then the corresponding interactive feedback page is displayed based on the voice interactive message, so that the user knows the interactive content and inputs the voice interactive message in the process of watching the video, the interaction between the user and the video in the video playing process is realized, and the user experience is improved.
In an optional embodiment of the present application, the video determining module with interactive content is specifically configured to:
acquiring identification information of a target video, and determining whether a preset interactive content database contains the identification information of the target video, wherein the preset interactive content database stores the identification information of at least one video with interactive content, corresponding interactive content and corresponding display time;
if the preset interactive content database contains identification information of the target video, determining that the target video is a video with interactive content, otherwise, determining that the target video is a video without the interactive content;
the video determination module with interactive content is further configured to:
and acquiring interactive content and display time corresponding to the target video from a preset interactive content database based on the identification information of the target video.
In an optional embodiment of the present application, the video editing module is specifically configured to:
acquiring at least one video to be edited;
when acquiring the addition triggering operation of a first user for the interactive content of any video to be edited, acquiring the display time of the interactive content corresponding to the video to be edited, and displaying an interactive content input page;
and acquiring the interactive content input by the first user through the interactive content input page, and correspondingly storing the identification information, the interactive content and the display time of any video to be edited into a preset interactive content database.
In an optional embodiment of the present application, the apparatus further includes a to-be-edited video obtaining module, configured to:
displaying a video retrieval frame to be edited when a video retrieval triggering operation to be edited of a first user is acquired;
when a keyword input by a first user is acquired through a video retrieval box to be edited, acquiring at least one video to be selected of which a video tag is matched with the keyword from a preset video library, and displaying the at least one video to be selected through a video list page to be selected, wherein the video tag of each video in the preset video library is acquired through a preset video tagging network model;
after the selection trigger operation of the first user for any video to be selected is acquired through the video list page to be selected, the video to be selected is determined as the video to be edited.
In an optional embodiment of the present application, the interactive content includes interactive questions and interactive answers, and the video editing module is further configured to:
the interactive question input by the first user is obtained through the interactive question input box of the interactive content input page, and the interactive answer input by the first user is obtained through the interactive answer input box of the interactive content input page.
In an optional embodiment of the present application, the interactive display page display module is specifically configured to:
and generating a corresponding interactive display page based on the interactive question and the interactive answer, and displaying the interactive display page.
In an optional embodiment of the present application, the interactive feedback page display module is specifically configured to:
carrying out voice recognition on the voice interaction message to obtain a corresponding text;
and performing similarity matching on the text and the text corresponding to the interactive answer, if the similarity is not less than a preset threshold value, displaying a first interactive feedback page, and otherwise, displaying a second interactive feedback page.
Based on the same principle, an embodiment of the present application further provides an electronic device, where the electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method provided in any optional embodiment of the present application is implemented, and specifically, the following situations are implemented:
acquiring interactive content and corresponding display time added to at least one video to be edited by a first user, and correspondingly storing identification information, the interactive content and the display time of each video to be edited into a preset interactive content database; playing a target video and determining whether the target video is a video with interactive content; if the target video is determined to be a video with interactive content, acquiring the interactive content corresponding to the target video and the display time of the interactive content; when the target video is played to the display time, displaying an interactive display page corresponding to the interactive content; and when the voice interaction message of the second user is acquired through the voice acquisition device, displaying a corresponding interactive feedback page based on the interactive content and the voice interaction message.
The embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the method shown in any embodiment of the present application.
It is understood that the medium may store a computer program corresponding to an interactive method in video playing.
Fig. 10 is a schematic structural diagram of an electronic device to which the embodiment of the present application is applied, and as shown in fig. 10, the electronic device 1000 shown in fig. 10 includes: a processor 1001 and a memory 1003. Where the processor 1001 is coupled to the memory 1003, such as via a bus 1002. Further, the electronic device 1000 may also include a transceiver 1004, and the electronic device 1000 may interact with other electronic devices through the transceiver 1004. It should be noted that the transceiver 1004 is not limited to one in practical application, and the structure of the electronic device 1000 is not limited to the embodiment of the present application.
The processor 1001, applied in the embodiment of the present application, can be used to implement the function of the interactive apparatus in video playing shown in fig. 9.
The processor 1001 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules and circuits described in connection with this disclosure. The processor 1001 may also be a combination of computing devices, for example a combination of one or more microprocessors, or of a DSP and a microprocessor, and the like.
Bus 1002 may include a path that transfers information between the above components. The bus 1002 may be a PCI bus or an EISA bus, etc. The bus 1002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 10, but this is not intended to represent only one bus or type of bus.
The memory 1003 may be, but is not limited to, a ROM or other type of static storage device that can store static information and instructions, a RAM or other type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disk storage, optical disk storage (including compact disk, laser disk, optical disk, digital versatile disk, blu-ray disk, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The memory 1003 is used for storing application program code for executing the solution of the present application, and execution is controlled by the processor 1001. The processor 1001 is configured to execute the application program code stored in the memory 1003 to implement the actions of the interactive apparatus in video playing provided by the embodiment shown in fig. 9.
Embodiments of the present application provide a computer program product or computer program, which comprises computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the following:
playing a target video and determining whether the target video is a video with interactive content; if the target video is determined to be a video with interactive content, acquiring the interactive content corresponding to the target video and the display time of the interactive content; when the target video is played to the display time, displaying an interactive display page corresponding to the interactive content; and when a voice interaction message of the second user is acquired through the voice acquisition device, displaying a corresponding interactive feedback page based on the interactive content and the voice interaction message.
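As an illustration of the flow just described, the following Python sketch models the playback-side logic. It is not the patented implementation: the database layout, the exact-match comparison, and all names (`interactive_db`, `on_playback_tick`, the page labels) are hypothetical.

```python
# Illustrative sketch of the playback-side interaction flow described above.
# All names are hypothetical; the patent does not prescribe a concrete API.

# Stand-in for the preset interactive content database keyed by video id.
interactive_db = {
    "video_42": {"display_time": 12.0,
                 "question": "What color is the balloon?",
                 "answer": "red"},
}

def on_playback_tick(video_id, position, heard_text=None):
    """Return the page to show at the given playback position (seconds)."""
    entry = interactive_db.get(video_id)
    if entry is None:
        return "plain_playback"            # video without interactive content
    if position < entry["display_time"]:
        return "plain_playback"            # not yet at the display time
    if heard_text is None:
        return "interactive_display_page"  # show the question, wait for voice
    # Compare the recognized speech with the stored answer (claim 7 refines
    # this into a similarity match; exact comparison keeps the sketch short).
    if heard_text.strip().lower() == entry["answer"]:
        return "first_interactive_feedback_page"
    return "second_interactive_feedback_page"
```

In a real player, `on_playback_tick` would be driven by the playback clock and the speech recognizer rather than called directly.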
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments, and whose execution order is not necessarily sequential; they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present application, and these improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims (10)

1. An interactive method in video playing, comprising:
acquiring interactive content and corresponding display time added to at least one video to be edited by a first user, and correspondingly storing identification information of each video to be edited, the interactive content and the display time into a preset interactive content database;
playing a target video, and determining whether the target video is a video with interactive content based on the preset interactive content database;
if the target video is determined to be a video with interactive content, acquiring the interactive content corresponding to the target video and the display time of the interactive content;
when the target video is played to the display time, displaying an interactive display page corresponding to the interactive content;
and when the voice interaction message of the second user is acquired through the voice acquisition equipment, displaying a corresponding interaction feedback page based on the interaction content and the voice interaction message.
2. The method of claim 1, wherein the determining whether the target video is a video with interactive content based on the preset interactive content database comprises:
acquiring identification information of the target video, and determining whether the preset interactive content database contains the identification information of the target video;
if the preset interactive content database contains the identification information of the target video, determining that the target video is a video with interactive content, and otherwise determining that the target video is a video without interactive content;
the acquiring of the interactive content corresponding to the target video and the display time of the interactive content includes:
and acquiring the interactive content and the display time corresponding to the target video from the preset interactive content database based on the identification information of the target video.
3. The method according to claim 1, wherein the acquiring the interactive content and the corresponding display time added by the first user to the at least one video to be edited, and correspondingly storing the identification information, the interactive content and the display time of each video to be edited into a preset interactive content database comprises:
acquiring the at least one video to be edited;
when an interactive-content addition triggering operation of the first user for any video to be edited is acquired, acquiring the display time of the interactive content corresponding to the video to be edited, and displaying an interactive content input page;
and acquiring the interactive content input by the first user through the interactive content input page, and correspondingly storing the identification information of any video to be edited, the interactive content and the display time into the preset interactive content database.
4. The method according to claim 3, wherein any video to be edited is obtained by:
when a to-be-edited video retrieval triggering operation of the first user is acquired, displaying a to-be-edited video retrieval box;
when a keyword input by the first user is acquired through the video retrieval box to be edited, acquiring at least one video to be selected of which the video tag is matched with the keyword from a preset video library, and displaying the at least one video to be selected through a video list page to be selected, wherein the video tag of each video in the preset video library is acquired through a preset video tagging network model;
and after the selection trigger operation of the first user for any video to be selected is acquired through the video list page to be selected, determining the video to be selected as the video to be edited.
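The retrieval flow of claim 4 can be sketched as follows. This is an illustrative Python sketch: the sample library, the substring-match rule, and the name `retrieve_candidates` are assumptions; the patent only requires that video tags, produced by a preset video tagging network model, be matched against the entered keyword.

```python
# Illustrative sketch of the to-be-edited video retrieval in claim 4.
# The tag data and the substring-match rule are assumptions; tags would in
# practice come from the preset video tagging network model.

preset_video_library = [
    {"id": "v1", "title": "Counting song", "tags": ["math", "song", "kids"]},
    {"id": "v2", "title": "Colors lesson", "tags": ["colors", "kids"]},
    {"id": "v3", "title": "Nature clip",   "tags": ["animals"]},
]

def retrieve_candidates(keyword):
    """Return ids of videos whose tags match the entered keyword."""
    keyword = keyword.strip().lower()
    return [v["id"] for v in preset_video_library
            if any(keyword in tag for tag in v["tags"])]
```

The returned candidates would then populate the to-be-selected video list page, from which the first user picks a video to edit.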
5. The method of claim 3, wherein the interactive content comprises an interactive question and an interactive answer, and the obtaining the interactive content input by the first user through the interactive content input page comprises:
and acquiring the interactive question input by the first user through the interactive question input frame of the interactive content input page, and acquiring the interactive answer input by the first user through the interactive answer input frame of the interactive content input page.
6. The method according to claim 5, wherein the displaying an interactive display page corresponding to the interactive content comprises:
and generating a corresponding interactive display page based on the interactive question and the interactive answer, and displaying the interactive display page.
7. The method of claim 5, wherein the displaying a corresponding interactive feedback page based on the interactive content and the voice interaction message comprises:
performing voice recognition on the voice interaction message to obtain a corresponding text;
and performing similarity matching on the text and the text corresponding to the interactive answer, if the similarity is not less than a preset threshold value, displaying a first interactive feedback page, and otherwise, displaying a second interactive feedback page.
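A minimal sketch of the matching step in claim 7, assuming `difflib.SequenceMatcher` as the similarity measure (the claim does not name one) and 0.8 as the preset threshold (the claim leaves it open):

```python
# Illustrative similarity match for claim 7: the text recognized from the
# voice interaction message is compared with the stored interactive answer,
# and a feedback page is chosen against a preset threshold.
from difflib import SequenceMatcher

def choose_feedback_page(recognized_text, answer_text, threshold=0.8):
    """Pick the feedback page; similarity not less than the threshold passes."""
    similarity = SequenceMatcher(None, recognized_text.lower(),
                                 answer_text.lower()).ratio()
    if similarity >= threshold:
        return "first_interactive_feedback_page"   # answer accepted
    return "second_interactive_feedback_page"      # answer rejected
```

Any string-similarity or semantic-matching measure could stand in for `SequenceMatcher`; the claim only fixes the threshold comparison.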
8. An interactive device in video playing, comprising:
the video editing module is used for acquiring interactive content and corresponding display times added by a first user to at least one video to be edited, and correspondingly storing identification information of each video to be edited, the interactive content and the display times into a preset interactive content database;
the video determining module with interactive content is used for playing a target video and determining whether the target video is a video with interactive content;
the interactive content acquisition module is used for acquiring the interactive content corresponding to the target video and the display time of the interactive content if the target video is determined to be the video with the interactive content;
the interactive display page display module is used for displaying an interactive display page corresponding to the interactive content when the target video is played to the display time;
and the interactive feedback page display module is used for displaying a corresponding interactive feedback page based on the interactive content and the voice interaction message when the voice interaction message of the second user is acquired through the voice acquisition device.
9. An electronic device comprising a memory and a processor;
the memory has stored therein a computer program;
the processor for executing the computer program to implement the method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method of any one of claims 1 to 7.
CN202010986578.4A 2020-09-18 2020-09-18 Interaction method and device in video playing and computer readable storage medium Pending CN113542797A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010986578.4A CN113542797A (en) 2020-09-18 2020-09-18 Interaction method and device in video playing and computer readable storage medium


Publications (1)

Publication Number Publication Date
CN113542797A true CN113542797A (en) 2021-10-22

Family

ID=78094279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010986578.4A Pending CN113542797A (en) 2020-09-18 2020-09-18 Interaction method and device in video playing and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113542797A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114938457A (en) * 2022-04-22 2022-08-23 咪咕文化科技有限公司 Task processing method, device and computer readable storage medium
CN115278334A (en) * 2022-07-12 2022-11-01 阿里巴巴(中国)有限公司 Video interaction method, device, equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106792077A (en) * 2016-11-04 2017-05-31 乐视控股(北京)有限公司 A kind of video broadcasting method, device and electronic equipment
CN108366278A (en) * 2018-02-01 2018-08-03 北京奇艺世纪科技有限公司 A kind of user in video playing interacts implementation method and device
WO2018155807A1 (en) * 2017-02-22 2018-08-30 삼성전자 주식회사 Electronic device, document display method therefor, and non-transitory computer-readable recording medium
CN108696765A (en) * 2017-04-07 2018-10-23 腾讯科技(深圳)有限公司 Auxiliary input method in video playing and device
CN108769745A (en) * 2018-06-29 2018-11-06 百度在线网络技术(北京)有限公司 Video broadcasting method and device
CN109104630A (en) * 2018-08-31 2018-12-28 北京优酷科技有限公司 Video interaction method and device
CN109167950A (en) * 2018-10-25 2019-01-08 腾讯科技(深圳)有限公司 Video recording method, video broadcasting method, device, equipment and storage medium
CN110430475A (en) * 2019-07-26 2019-11-08 腾讯科技(深圳)有限公司 A kind of interactive approach and relevant apparatus
CN111294638A (en) * 2016-10-31 2020-06-16 腾讯科技(深圳)有限公司 Method, device, terminal and storage medium for realizing video interaction



Similar Documents

Publication Publication Date Title
CN112073741A (en) Live broadcast information processing method and device, electronic equipment and storage medium
JP2018036621A (en) Information input method and device
CN112399258B (en) Live playback video generation playing method and device, storage medium and electronic equipment
CN110602516A (en) Information interaction method and device based on live video and electronic equipment
CN110364146A (en) Audio recognition method, device, speech recognition apparatus and storage medium
CN110929098A (en) Video data processing method and device, electronic equipment and storage medium
CN111414506B (en) Emotion processing method and device based on artificial intelligence, electronic equipment and storage medium
CN111046148A (en) Intelligent interaction system and intelligent customer service robot
CN114339450A (en) Video comment generation method, system, device and storage medium
CN113392265A (en) Multimedia processing method, device and equipment
CN113542797A (en) Interaction method and device in video playing and computer readable storage medium
CN114401431A (en) Virtual human explanation video generation method and related device
CN113378583A (en) Dialogue reply method and device, dialogue model training method and device, and storage medium
Kucherenko et al. Evaluating gesture generation in a large-scale open challenge: The GENEA Challenge 2022
CN112800177B (en) FAQ knowledge base automatic generation method and device based on complex data types
CN114064943A (en) Conference management method, conference management device, storage medium and electronic equipment
CN116737883A (en) Man-machine interaction method, device, equipment and storage medium
CN117349515A (en) Search processing method, electronic device and storage medium
CN117352132A (en) Psychological coaching method, device, equipment and storage medium
CN117453880A (en) Multi-mode data processing method and device, electronic equipment and storage medium
CN109857874A (en) A kind of recommended method and device of user's answer of knowledge based map
CN115171673A (en) Role portrait based communication auxiliary method and device and storage medium
CN112820265B (en) Speech synthesis model training method and related device
CN115167733A (en) Method and device for displaying live broadcast resources, electronic equipment and storage medium
CN112233648A (en) Data processing method, device, equipment and storage medium combining RPA and AI

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40054030

Country of ref document: HK

SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221207

Address after: 1402, Floor 14, Block A, Haina Baichuan Headquarters Building, No. 6, Baoxing Road, Haibin Community, Xin'an Street, Bao'an District, Shenzhen, Guangdong 518000

Applicant after: Shenzhen Yayue Technology Co.,Ltd.

Address before: 518000 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35 Floors

Applicant before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.