CN109947993B - Plot skipping method and device based on voice recognition and computer equipment - Google Patents


Info

Publication number
CN109947993B
CN109947993B (application CN201910192857.0A)
Authority
CN
China
Prior art keywords
playing time
client
episode
time node
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910192857.0A
Other languages
Chinese (zh)
Other versions
CN109947993A (en)
Inventor
李明德
潘星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apollo Zhilian Beijing Technology Co Ltd
Original Assignee
Apollo Zhilian Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Apollo Zhilian Beijing Technology Co Ltd
Priority to CN201910192857.0A
Publication of CN109947993A (application publication)
Application granted
Publication of CN109947993B (granted patent)

Abstract

The application provides an episode skipping method and apparatus based on voice recognition, and a computer device. The method comprises the following steps: obtaining a query request sent by a client, wherein the query request comprises text query information generated according to voice information input by a user; extracting keywords from the text query information; searching a pre-stored episode list for a playing time node matching the keywords; and sending the playing time node to the client so that the client jumps to the corresponding playing time according to the playing time node. The efficiency of episode skipping is thereby improved, skipping accuracy is increased while resource consumption is saved, and the method is convenient for users and improves the user experience.

Description

Plot skipping method and device based on voice recognition and computer equipment
Technical Field
The present application relates to the field of computer technology, and in particular to an episode skipping method and apparatus based on speech recognition, and a computer device.
Background
With the continuous development of internet technology, users can watch multimedia resources such as movies and TV shows on smart devices anytime and anywhere. However, a user who wants to jump to a highlight or to a particular segment while watching a video must do so manually: the position cannot be located accurately in one attempt, so the user has to drag the progress bar back and forth repeatedly to find it, and the repeated seeking consumes extra resources and time for loading and buffering.
Disclosure of Invention
The present application is directed to solving, at least to some extent, one of the technical problems in the related art.
The application provides an episode skipping method and apparatus based on voice recognition, and a computer device, aiming to solve the technical problems in the prior art that episode skipping must be performed manually and that, because accurate positioning cannot be achieved in one attempt, skipping efficiency and accuracy are low.
An embodiment of a first aspect of the present application provides an episode skipping method based on speech recognition, comprising the following steps:
acquiring a query request sent by a client, wherein the query request comprises text query information generated according to voice information input by a user;
extracting keywords from the text query information, and searching a pre-stored episode list for a playing time node matching the keywords;
and sending the playing time node to the client so that the client jumps to the corresponding playing time according to the playing time node.
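The three steps above can be sketched end to end. The function name, request shape, and episode-list structure below are illustrative assumptions, since the patent does not specify data formats:

```python
# Illustrative sketch of the server-side flow (names and data shapes are
# assumptions, not taken from the patent).
def handle_query(query_request: dict, episode_list: dict) -> list:
    """Extract keywords from the client's text query and return the
    playing time nodes (in seconds) matched in the episode list."""
    text = query_request["text_query"]                 # text from the user's voice input
    keywords = [w for w in episode_list if w in text]  # naive keyword matching
    nodes = []
    for kw in keywords:
        nodes.extend(episode_list[kw])                 # matched playing time nodes
    return nodes

episode_list = {"success": [754], "duel": [1320, 2410]}
handle_query({"text_query": "jump to the success segment"}, episode_list)
# returns [754]
```

A real implementation would replace the substring match with the keyword-extraction step described in the method, and return the nodes to the client over the network.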
As a first possible implementation manner of this embodiment of the present application, before obtaining the query request sent by the client, the method further includes:
acquiring a plurality of media resources, analyzing each media resource to acquire a keyword in each media resource and a corresponding playing time node;
and labeling the keywords in each media resource and the corresponding playing time nodes to generate the pre-stored episode list.
As a second possible implementation manner of this embodiment of the present application, when a plurality of playing time nodes matching the keyword are found in the pre-stored episode list, before sending a playing time node to the client, the method further includes:
acquiring a selection request sent by a client, wherein the selection request comprises text selection information generated according to voice information input by a user;
and extracting a selected word in the text selection information, and searching a target playing time node matched with the selected word in a plurality of playing time nodes.
According to the episode skipping method based on voice recognition, a query request sent by the client is obtained, wherein the query request comprises text query information generated according to voice information input by a user; keywords are extracted from the text query information; a playing time node matching the keywords is searched for in a pre-stored episode list; and the playing time node is sent to the client so that the client can jump to the corresponding playing time. The efficiency of episode skipping is thereby improved, skipping accuracy is increased while resource consumption is saved, and the method is convenient for users and improves the user experience.
An embodiment of a second aspect of the present application provides an episode skipping method based on speech recognition, comprising:
acquiring voice information input by a user, and converting the voice information into text query information;
sending the text query information to a cloud server, so that the cloud server extracts the keywords from the text query information, searches a pre-stored episode list for the playing time node matching the keywords, and sends the playing time node to the client;
and acquiring the playing time node sent by the cloud server, and jumping to the corresponding playing time according to the playing time node.
As a first possible implementation manner of this embodiment of the present application, after jumping to the corresponding playing time according to the playing time node, the method further includes:
automatically playing the next episode and prompting the user whether to jump back to the historical playing time node to resume playing.
According to the episode skipping method based on voice recognition, voice information input by a user is obtained and converted into text query information; the text query information is sent to the cloud server, so that the cloud server extracts the keywords from it, searches a pre-stored episode list for the matching playing time node, and sends the playing time node to the client; the client then obtains the playing time node sent by the cloud server and jumps to the corresponding playing time. The efficiency of episode skipping is thereby improved, skipping accuracy is increased while resource consumption is saved, and the method is convenient for users and improves the user experience.
An embodiment of a third aspect of the present application provides a cloud server, including:
a first obtaining module, used for obtaining a query request sent by a client, wherein the query request comprises text query information generated according to voice information input by a user;
an extraction and matching module, used for extracting the keywords from the text query information and searching a pre-stored episode list for a playing time node matching the keywords;
and a first sending module, used for sending the playing time node to the client so that the client jumps to the corresponding playing time according to the playing time node.
As a first possible implementation manner of this embodiment of the application, the cloud server further includes: a second obtaining module, used for obtaining a plurality of media resources and analyzing each media resource to obtain the keywords and corresponding playing time nodes in each media resource; and a generating module, used for labeling the keywords in each media resource and the corresponding playing time nodes to generate the pre-stored episode list.
As a second possible implementation manner of this embodiment of the application, when a plurality of playing time nodes matching the keyword are found in the pre-stored episode list, before sending a playing time node to the client, the cloud server further operates as follows:
the first obtaining module is further configured to obtain a selection request sent by a client, where the selection request includes text selection information generated according to voice information input by a user; the extraction matching module is further configured to extract a selection word in the text selection information, and search a target playing time node matched with the selection word in a plurality of playing time nodes.
According to the cloud server, a query request sent by the client is obtained, wherein the query request comprises text query information generated according to voice information input by a user; the keywords are extracted from the text query information; a playing time node matching the keywords is searched for in a pre-stored episode list; and the playing time node is sent to the client so that the client jumps to the corresponding playing time according to the playing time node. The efficiency of episode skipping is thereby improved, skipping accuracy is increased while resource consumption is saved, and the cloud server is convenient for users and improves the user experience.
An embodiment of a fourth aspect of the present application provides a client, including:
the third acquisition module is used for acquiring voice information input by a user and converting the voice information into text query information;
the second sending module is used for sending the text query information to a cloud server, so that the cloud server extracts the keywords from the text query information, searches a pre-stored episode list for the playing time node matching the keywords, and sends the playing time node to the client;
and the receiving and skipping module is used for receiving the playing time node sent by the cloud server and skipping to the corresponding playing time according to the playing time node.
As a first possible implementation manner of this embodiment of the application, the client further includes: a prompting module, used for automatically playing the next episode and prompting the user whether to jump back to the historical playing time node to resume playing.
The client obtains the voice information input by the user, converts it into text query information, and sends the text query information to the cloud server, so that the cloud server extracts the keywords, searches a pre-stored episode list for the matching playing time node, and sends the playing time node back; the client then jumps to the corresponding playing time according to the received playing time node. The efficiency of episode skipping is thereby improved, skipping accuracy is increased while resource consumption is saved, and the client is convenient for users and improves the user experience.
An embodiment of a fifth aspect of the present application provides a scenario skipping system based on speech recognition, including: the cloud server according to the third aspect and the client according to the fourth aspect.
An embodiment of a sixth aspect of the present application provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the episode skip method based on speech recognition as described in the foregoing embodiment.
An embodiment of the seventh aspect of the present application provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the episode skip method based on speech recognition as described in the foregoing embodiments.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic structural diagram of an episode skipping system based on speech recognition according to an embodiment of the present application;
fig. 2 is a schematic flowchart of an episode skipping method based on speech recognition according to an embodiment of the present application;
fig. 3 is a schematic flowchart of another episode skipping method based on speech recognition according to an embodiment of the present application;
fig. 4 is a schematic flowchart of yet another episode skipping method based on speech recognition according to an embodiment of the present application;
fig. 5 is a schematic flowchart of still another episode skipping method based on speech recognition according to an embodiment of the present application;
fig. 6 is a diagram illustrating an example of an episode skipping method based on speech recognition according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a cloud server according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of another cloud server according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a client according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of another client according to an embodiment of the present application;
fig. 11 is a block diagram of an exemplary computer device suitable for implementing embodiments of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
To address the prior-art problems that skipping must be performed manually, that accurate positioning cannot be achieved in one attempt so the progress bar must be dragged back and forth repeatedly, and that the repeated loading and buffering consumes extra resources and time, the embodiments of the present application provide an episode skipping method based on voice recognition.
Specifically, as shown in fig. 1, the episode skip system based on speech recognition includes: a client 10 and a cloud server 20.
Specifically, the client 10 receives the voice information input by the user, converts it into text query information, and sends it to the cloud server 20. The cloud server 20 extracts the keywords from the text query information, searches the pre-stored episode list for the playing time node matching the keywords, and sends the playing time node to the client 10. The client 10 then jumps to the corresponding playing time according to the received playing time node.
It should be noted that one or more cloud servers 20 may be provided. To improve processing efficiency, two cloud servers may be used, with the pre-stored episode list stored on a second cloud server 20: the first cloud server extracts the keywords from the text query information and sends them to the second cloud server 20, which searches the pre-stored episode list for the playing time node matching the keywords and sends it to the client 10, thereby improving episode skipping efficiency.
The following describes the episode skipping method, apparatus and computer device based on speech recognition according to the embodiments of the present application with reference to the drawings.
Fig. 2 is a schematic flowchart of an episode skipping method based on speech recognition according to an embodiment of the present application.
First, the episode skipping method based on speech recognition in the embodiments of the present application is described on the cloud server side. As shown in fig. 2, the method includes the following steps:
step 101, obtaining a query request sent by a client, wherein the query request includes text query information generated according to voice information input by a user.
And 102, extracting key words in the text query information, and searching playing time nodes matched with the key words in a prestored plot list.
In practical applications, when watching multimedia resources such as movies, the user can initiate an episode-skipping search by voice; after receiving the voice information, the client processes it to generate a query request and sends the query request to the cloud server.
Further, the cloud server parses the query request to obtain the text query information and extracts the keywords, either with a keyword extraction algorithm or by first performing word segmentation on the text query information and then analyzing the resulting segments to determine the keywords. For example, for "jump to hero XX to obtain a successful segment", word segmentation yields "jump", "go", "hero", "XX", "obtain", "success" and "segment", and analysis then yields the keywords "hero" and "success".
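The segmentation-then-filter idea described above can be sketched minimally as follows. The stop-word list is invented for illustration, and whitespace splitting stands in for a real segmentation algorithm, which the patent does not name:

```python
# Minimal sketch of word segmentation followed by keyword filtering.
# The stop-word list below is an assumption for illustration only.
STOP_WORDS = {"jump", "to", "go", "the", "a", "obtain", "segment"}

def extract_keywords(text_query: str) -> list:
    tokens = text_query.lower().split()   # whitespace stand-in for real segmentation
    return [t for t in tokens if t not in STOP_WORDS]

extract_keywords("jump to the success segment")
# returns ['success']
```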
Finally, after the keywords are determined, the pre-stored episode list is searched for a playing time node matching the keywords. The episode list is generated in advance and stored on this cloud server or on another cloud server; when needed, it is either queried directly on this cloud server or queried by sending a query request to the other cloud server.
As a possible implementation, how the pre-stored episode list is generated is described below in conjunction with fig. 3. Specifically, as shown in fig. 3, the process comprises:
step 201, obtaining a plurality of media resources, and analyzing each media resource to obtain a keyword and a corresponding play time node in each media resource.
Step 202, labeling the keywords in each media resource and the corresponding playing time nodes, and storing them to generate the pre-stored episode list.
Specifically, a plurality of multimedia resources, such as movie A, movie B, TV series C and entertainment program D, may be selected for analysis according to actual application requirements. That is, the episode segments in each multimedia resource are analyzed, the corresponding keywords are extracted for each episode segment together with the playing time node corresponding to that segment, and the keywords and corresponding playing time nodes of each media resource are then labeled and stored in the pre-stored episode list.
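One hypothetical shape for the labeled episode list produced by steps 201 and 202 (the patent does not define a storage format, so the tuple layout and field names are assumptions):

```python
from collections import defaultdict

def build_episode_list(analysed_resources):
    """analysed_resources: iterable of (resource_id, keyword, node_seconds)
    tuples produced by analysing each media resource's episode segments."""
    episode_list = defaultdict(list)
    for resource_id, keyword, node in analysed_resources:
        episode_list[keyword].append({"resource": resource_id, "node": node})
    return dict(episode_list)   # keyword -> labeled playing time nodes

build_episode_list([("movie_a", "duel", 1320), ("movie_a", "duel", 2410)])
# returns {'duel': [{'resource': 'movie_a', 'node': 1320},
#                   {'resource': 'movie_a', 'node': 2410}]}
```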
It should also be understood that one keyword may correspond to a plurality of playing time nodes. In that case, the plurality of playing time nodes need to be provided for the user to select from, and the target playing time node is determined as shown in fig. 4, which comprises:
step 301, acquiring a selection request sent by a client, wherein the selection request includes text selection information generated according to voice information input by a user.
Step 302, extracting the selection word in the text selection information, and searching a target playing time node matched with the selection word in a plurality of playing time nodes.
Specifically, brief information about the episode segments corresponding to the playing time nodes can be displayed on the client's display interface, or the segments can be presented through animations or other means, so that the user sends a selection request to the cloud server through the client as needed.
Further, the cloud server parses the selection request to obtain the text selection information, extracts the selection word from it, searches the plurality of playing time nodes for the target playing time node matching the selection word, and sends the target playing time node to the client. By interacting with the user a second time, the accuracy with which the episode skip meets the user's needs is improved, and the user experience is improved accordingly.
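The second interaction round above, in which a selection word picks one target node out of several candidates, might look like this. The ordinal-word mapping is an invented example; the patent does not specify what form selection words take:

```python
# Sketch: resolve the target playing time node from the user's follow-up
# selection. The ordinal-word mapping is an assumption for illustration.
ORDINALS = {"first": 0, "second": 1, "third": 2}

def select_target_node(selection_text: str, candidate_nodes: list):
    for word, index in ORDINALS.items():
        if word in selection_text.lower() and index < len(candidate_nodes):
            return candidate_nodes[index]
    return None   # no recognisable selection word

select_target_node("play the second one", [754, 1320, 2410])
# returns 1320
```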
And 103, sending the playing time node to the client so that the client jumps to the corresponding playing time according to the playing time node.
Specifically, the playing time node is sent to the client, and the client directly jumps to the playing time corresponding to the playing time node from the current playing time node according to the playing time node to play.
According to the episode skipping method based on voice recognition, a query request sent by the client is obtained, wherein the query request comprises text query information generated according to voice information input by a user; keywords are extracted from the text query information; a playing time node matching the keywords is searched for in a pre-stored episode list; and the playing time node is sent to the client so that the client can jump to the corresponding playing time according to the playing time node. The efficiency of episode skipping is thereby improved, skipping accuracy is increased while resource consumption is saved, and the method is convenient for users and improves the user experience.
Based on the foregoing embodiments, to describe the episode skipping method based on speech recognition in the embodiments of the present application more fully, the method is described below on the client side with reference to fig. 5.
Specifically, as shown in fig. 5, the episode skipping method based on speech recognition includes:
step 401, acquiring voice information input by a user, and converting the voice information into text query information.
Specifically, when a user watching a multimedia resource initiates an episode skip request by voice, a microphone of the client receives the user's voice information, and the client converts it into text query information using a speech-to-text algorithm or model.
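Since the patent names no concrete speech-to-text model, `recognize` below is a stand-in for whatever conversion algorithm or model the client uses; only the query-request shape is sketched, and that shape is an assumption:

```python
# Build the query request from microphone audio; `recognize` is a stand-in
# for the client's real speech-to-text algorithm or model.
def make_query_request(audio_bytes: bytes, recognize) -> dict:
    text_query = recognize(audio_bytes)   # voice information -> text query information
    return {"type": "query", "text_query": text_query}

fake_recognize = lambda audio: "jump to the success segment"
make_query_request(b"\x00\x01", fake_recognize)
# returns {'type': 'query', 'text_query': 'jump to the success segment'}
```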
Step 402, sending the text query information to a cloud server so that the cloud server extracts keywords in the text query information, searches for a playing time node matched with the keywords according to a prestored plot list, and sends the playing time node to the client.
Further, for how the cloud server processes the text query information to obtain the playing time node, reference may be made to the description of the above embodiments, which is not repeated here.
Step 403, obtaining a playing time node sent by the cloud server, and jumping to a corresponding playing time according to the playing time node.
Specifically, after acquiring a play time node sent by the cloud server, the client directly jumps to a play time corresponding to the play time node from the current play time node according to the play time node for playing.
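The direct jump described above can be sketched as a single state change on a toy player object. The dict-as-player-state and its field names are assumptions; a real client would call its media player's seek API:

```python
# Toy client-side jump: set the playback position straight to the playing
# time node, avoiding repeated drag-and-buffer cycles.
def jump_to_node(player: dict, play_time_node: int) -> dict:
    player["position"] = play_time_node   # jump directly from the current position
    player["buffering"] = False           # single seek, no repeated re-buffering
    return player

jump_to_node({"position": 95, "buffering": True}, 754)
# returns {'position': 754, 'buffering': False}
```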
It should also be understood that after the client finishes playing the episode segment corresponding to the playing time node, the next related episode can be played automatically. To further improve the user experience, the client can automatically play the next episode while prompting the user whether to jump back to the historical playing time node; that is, the user decides according to actual needs whether to return to the playing time point before the jump.
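The auto-play-and-prompt behaviour above might be modelled as follows. The state fields and the boolean answer to the prompt are illustrative assumptions:

```python
# Sketch: when the jumped-to segment ends, either return to the saved
# pre-jump position (if the user accepts the prompt) or auto-play the
# next episode from the start.
def on_segment_end(player: dict, user_wants_resume: bool) -> dict:
    if user_wants_resume:
        player["position"] = player["history_node"]  # return to the pre-jump time
    else:
        player["episode"] += 1                       # auto-play the next episode
        player["position"] = 0
    return player

on_segment_end({"episode": 3, "position": 900, "history_node": 95}, True)
# returns {'episode': 3, 'position': 95, 'history_node': 95}
```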
According to the episode skipping method based on voice recognition, voice information input by a user is obtained and converted into text query information; the text query information is sent to the cloud server, so that the cloud server extracts the keywords from it, searches a pre-stored episode list for the matching playing time node, and sends the playing time node to the client; the client then obtains the playing time node sent by the cloud server and jumps to the corresponding playing time. The efficiency of episode skipping is thereby improved, skipping accuracy is increased while resource consumption is saved, and the method is convenient for users and improves the user experience.
To make the above embodiments clearer to those skilled in the art, a specific scenario is illustrated below in conjunction with fig. 6.
Specifically, as shown in fig. 6, a user requests a certain movie by voice input. The client converts the voice information into text query information and sends it to cloud server 1; cloud server 1 then performs semantic analysis on the text query information, extracts the keywords, and sends them to cloud server 2; and cloud server 2 searches a preset movie database for the movie matching the keywords and sends it to the client for playing.
Then, while watching the movie, the user wants to jump to a highlight and searches for it by voice input. The client converts the voice information into text query information and sends it to cloud server 1; cloud server 1 performs semantic analysis on the text query information, extracts the keywords from the analysis result, and sends them to cloud server 3; cloud server 3 searches the pre-stored episode list for the list of highlights corresponding to the keywords and sends it to the client; the user selects the target highlight by voice, the playing time node corresponding to the target highlight is obtained, and the client jumps to the corresponding playing time according to that playing time node. The efficiency of episode skipping is thereby improved, skipping accuracy is increased while resource consumption is saved, and the method is convenient for users and improves the user experience.
In order to implement the above embodiment, the present application further provides a cloud server.
Fig. 7 is a schematic structural diagram of a cloud server provided in an embodiment of the present application.
As shown in fig. 7, the cloud server 20 includes: a first acquisition module 201, an extraction matching module 202 and a first sending module 203.
The first obtaining module 201 is configured to obtain a query request sent by a client, where the query request includes text query information generated according to voice information input by a user.
And the extraction matching module 202 is configured to extract a keyword in the text query information, and search a play time node matched with the keyword in a pre-stored episode list.
A first sending module 203, configured to send the playing time node to the client, so that the client jumps to a corresponding playing time according to the playing time node.
As a possible implementation manner, as shown in fig. 8, on the basis of fig. 7, the cloud server further includes: a second acquisition module 204 and a generation module 205.
The second obtaining module 204 is configured to obtain multiple media resources, and analyze each media resource to obtain a keyword and a corresponding play time node in each media resource.
A generating module 205, configured to perform labeling processing on the keywords in each media resource and the corresponding playing time node to generate the preset episode list.
As another possible implementation manner, when a plurality of playing time nodes matching the keyword are found in the pre-stored episode list, before sending a playing time node to the client, the cloud server further operates as follows:
the first obtaining module 201 is further configured to obtain a selection request sent by a client, where the selection request includes text selection information generated according to voice information input by a user.
The extraction matching module 202 is further configured to extract a selection word in the text selection information, and search a target playing time node matching the selection word from the multiple playing time nodes.
It should be noted that the foregoing explanation of the embodiment of the episode skip method based on speech recognition is also applicable to the cloud server of the embodiment, and is not repeated herein.
According to the cloud server of this embodiment, a query request sent by the client is obtained, wherein the query request comprises text query information generated according to voice information input by a user; keywords are extracted from the text query information; a playing time node matching the keywords is searched for in a pre-stored episode list; and the playing time node is sent to the client so that the client jumps to the corresponding playing time according to the playing time node. The efficiency of episode skipping is thereby improved, skipping accuracy is increased while resource consumption is saved, and the cloud server is convenient for users and improves the user experience.
In order to implement the above embodiments, the present application further provides a client.
Fig. 9 is a schematic structural diagram of a client according to an embodiment of the present application.
As shown in fig. 9, the client 10 includes: a third obtaining module 101, a second sending module 102 and a receiving and skipping module 103.
The third obtaining module 101 is configured to obtain voice information input by a user, and convert the voice information into text query information.
The second sending module 102 is configured to send the text query information to a cloud server, so that the cloud server extracts a keyword in the text query information, searches a play time node matched with the keyword according to a pre-stored episode list, and sends the play time node to the client.
And the receiving and skipping module 103 is configured to receive the playing time node sent by the cloud server, and skip to a corresponding playing time according to the playing time node.
As a possible implementation manner, as shown in fig. 10, the client further includes, on the basis of fig. 9, a prompting module 104.
The prompting module 104 is used for automatically playing the next episode and prompting the user whether to jump back to the historical playing time node to resume playing.
It should be noted that the foregoing explanation of the embodiment of the episode skip method based on speech recognition is also applicable to the client side of the embodiment, and is not repeated here.
The client obtains voice information input by a user, converts the voice information into text query information, and sends the text query information to a cloud server, so that the cloud server extracts keywords from the text query information, searches a prestored episode list for a playing time node matching the keywords, and sends the playing time node to the client. The client then obtains the playing time node sent by the cloud server and jumps to the corresponding playing time according to the playing time node. Therefore, the efficiency and accuracy of episode skipping are improved, resource consumption is reduced, the feature is convenient for the user to use, and the user experience is improved.
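The client-side flow above (the third obtaining module, second sending module, and receiving and skipping module of Fig. 9) can be summarized in a short sketch. All names here (`speech_to_text`, `query_cloud_server`, `Player`) are illustrative stand-ins, not APIs from the patent; a real client would call an actual speech-recognition engine and a network request instead of the stubs shown.

```python
def speech_to_text(voice_input: bytes) -> str:
    # Stand-in for the third obtaining module's speech recognition step.
    return voice_input.decode("utf-8")

def query_cloud_server(text_query: str) -> int:
    # Stand-in for the second sending module's request: the cloud server
    # would look up the playing time node matching the query's keywords.
    fake_index = {"skip to the finale": 2580}
    return fake_index[text_query]

class Player:
    """Minimal player model: only tracks the current playback position."""
    def __init__(self):
        self.position = 0

    def jump_to(self, node: int):
        self.position = node

def handle_voice_command(player: Player, voice_input: bytes):
    text = speech_to_text(voice_input)   # obtain voice info, convert to text
    node = query_cloud_server(text)      # send query, get playing time node
    player.jump_to(node)                 # receive node and skip playback
```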
In order to implement the foregoing embodiments, the present application further proposes a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the episode skip method based on speech recognition as described in the foregoing embodiments.
In order to implement the above embodiments, the present application further proposes a non-transitory computer-readable storage medium on which a computer program is stored, which when executed by a processor implements the episode skip method based on speech recognition as described in the above embodiments.
FIG. 11 illustrates a block diagram of an exemplary computer device suitable for implementing embodiments of the present application. The computer device 12 shown in FIG. 11 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 11, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. These architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus, to name a few.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 11, and commonly referred to as a "hard drive"). Although not shown in FIG. 11, a disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a compact disc read-only memory (CD-ROM), a digital versatile disc read-only memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the application.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the embodiments described herein.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, the network adapter 20 communicates with the other modules of the computer device 12 over the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing, such as implementing the voice recognition-based episode skip method mentioned in the foregoing embodiments, by running a program stored in the system memory 28.
In the description of the present specification, reference to the description of "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless explicitly specified otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process. Alternate implementations are included within the scope of the preferred embodiments of the present application, in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Further, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.

Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (11)

1. A scenario jump method based on voice recognition is characterized in that the method is applied to a cloud server and comprises the following steps:
acquiring a query request sent by a client, wherein the query request comprises text query information generated according to voice information input by a user;
extracting keywords from the text query information, and searching a prestored episode list for playing time nodes matching the keywords; wherein the prestored episode list is generated by obtaining a plurality of media resources, analyzing each media resource to obtain keywords corresponding to episode segments in the media resource and playing time nodes corresponding to the episode segments, and labeling the keywords in each media resource together with the corresponding playing time nodes;
sending the playing time node to the client so that the client jumps to the corresponding playing time according to the playing time node;
when a plurality of playing time nodes matching the keywords are found in the prestored episode list, before the playing time node is sent to the client, the method further comprises:
displaying, on a display interface of the client, brief information or an animated preview of the plurality of episode segments corresponding to the plurality of playing time nodes;
acquiring a selection request sent by a client, wherein the selection request comprises text selection information generated according to voice information input by a user;
and extracting a selected word in the text selection information, and searching a target playing time node matched with the selected word in a plurality of playing time nodes.
2. A plot skipping method based on voice recognition is characterized in that the method is applied to a client and comprises the following steps:
acquiring voice information input by a user, and converting the voice information into text query information;
sending the text query information to a cloud server, so that the cloud server extracts keywords from the text query information, searches a prestored episode list for playing time nodes matching the keywords, and sends the playing time node to the client; wherein the prestored episode list is generated by obtaining a plurality of media resources, analyzing each media resource to obtain keywords corresponding to episode segments in the media resource and playing time nodes corresponding to the episode segments, and labeling the keywords in each media resource together with the corresponding playing time nodes;
acquiring the playing time node sent by the cloud server, and jumping to the corresponding playing time according to the playing time node;
when a plurality of playing time nodes matching the keywords are found in the prestored episode list, before the playing time node is sent to the client, the method further comprises:
displaying, on a display interface of the client, brief information or an animated preview of the plurality of episode segments corresponding to the plurality of playing time nodes;
acquiring a selection request input by a user, wherein the selection request comprises text selection information generated according to voice information input by the user;
and extracting a selected word in the text selection information, and searching a target playing time node matched with the selected word in a plurality of playing time nodes.
3. The method of claim 2, wherein after jumping to the corresponding playing time according to the playing time node, the method further comprises:
automatically playing the next episode, and prompting the user whether to jump back to the historical playing time node to resume playback.
4. A cloud server, comprising:
the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring a query request sent by a client, and the query request comprises text query information generated according to voice information input by a user;
the extraction matching module is used for extracting keywords from the text query information and searching a prestored episode list for playing time nodes matching the keywords; wherein the prestored episode list is generated by obtaining a plurality of media resources, analyzing each media resource to obtain keywords corresponding to episode segments in the media resource and playing time nodes corresponding to the episode segments, and labeling the keywords in each media resource together with the corresponding playing time nodes;
the first sending module is used for sending the playing time node to the client so that the client jumps to the corresponding playing time according to the playing time node;
wherein, when a plurality of playing time nodes matching the keywords are found in the prestored episode list, before the playing time node is sent to the client:
the first obtaining module is further configured to display, on a display interface of the client, brief information or an animated preview of the plurality of episode segments corresponding to the plurality of playing time nodes, and to obtain a selection request sent by the client, wherein the selection request comprises text selection information generated according to voice information input by a user;
the extraction matching module is further configured to extract a selection word in the text selection information, and search a target playing time node matched with the selection word from the plurality of playing time nodes.
5. A client, comprising:
the third acquisition module is used for acquiring voice information input by a user and converting the voice information into text query information;
the second sending module is used for sending the text query information to a cloud server, so that the cloud server extracts keywords from the text query information, searches a prestored episode list for playing time nodes matching the keywords, and sends the playing time node to the client; wherein the prestored episode list is generated by obtaining a plurality of media resources, analyzing each media resource to obtain keywords corresponding to episode segments in the media resource and playing time nodes corresponding to the episode segments, and labeling the keywords in each media resource together with the corresponding playing time nodes;
the receiving and skipping module is used for receiving the playing time node sent by the cloud server and skipping to the corresponding playing time according to the playing time node;
wherein, when a plurality of playing time nodes matching the keywords are found in the prestored episode list, before the playing time node is sent to the client:
the third obtaining module is further configured to display, on a display interface of the client, brief information or an animated preview of the plurality of episode segments corresponding to the plurality of playing time nodes, and to obtain a selection request input by a user, wherein the selection request comprises text selection information generated according to voice information input by the user;
and the extraction matching module is also used for extracting the selected words in the text selection information and searching target playing time nodes matched with the selected words in the plurality of playing time nodes.
6. The client of claim 5, further comprising:
and the prompting module is used for automatically playing the next episode and prompting whether to jump to the resume history playing time node for playing to the user.
7. A scenario jump system based on voice recognition, characterized by comprising the cloud server of claim 4 and the client of claim 5.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the episode skipping method based on speech recognition as claimed in claim 1 when executing the program.
9. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the speech recognition-based episode skip method of claim 1.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the episode skipping method based on speech recognition as claimed in claim 2 or 3 when executing the program.
11. A non-transitory computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the episode skipping method based on speech recognition according to claim 2 or 3.
CN201910192857.0A 2019-03-14 2019-03-14 Plot skipping method and device based on voice recognition and computer equipment Active CN109947993B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910192857.0A CN109947993B (en) 2019-03-14 2019-03-14 Plot skipping method and device based on voice recognition and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910192857.0A CN109947993B (en) 2019-03-14 2019-03-14 Plot skipping method and device based on voice recognition and computer equipment

Publications (2)

Publication Number Publication Date
CN109947993A CN109947993A (en) 2019-06-28
CN109947993B true CN109947993B (en) 2022-10-21

Family

ID=67009882

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910192857.0A Active CN109947993B (en) 2019-03-14 2019-03-14 Plot skipping method and device based on voice recognition and computer equipment

Country Status (1)

Country Link
CN (1) CN109947993B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110740389B (en) * 2019-10-30 2023-05-02 腾讯科技(深圳)有限公司 Video positioning method, video positioning device, computer readable medium and electronic equipment
CN113163245B (en) * 2020-01-22 2023-06-02 阿里巴巴集团控股有限公司 Data processing method, device, electronic equipment and computer storage medium
CN111601163B (en) * 2020-04-26 2023-03-03 百度在线网络技术(北京)有限公司 Play control method and device, electronic equipment and storage medium
CN111513584B (en) * 2020-05-07 2021-04-23 珠海格力电器股份有限公司 Menu display method and device based on voice interaction and cooking equipment
CN113407775B (en) * 2020-10-20 2024-03-22 腾讯科技(深圳)有限公司 Video searching method and device and electronic equipment
CN112506405B (en) * 2020-12-03 2022-05-31 浪潮云信息技术股份公司 Artificial intelligent voice large screen command method based on Internet supervision field
CN112546616B (en) * 2020-12-15 2024-01-12 网易(杭州)网络有限公司 Game skill processing method, system and device and electronic equipment
CN114679614B (en) * 2020-12-25 2024-02-06 深圳Tcl新技术有限公司 Voice query method, intelligent television and computer readable storage medium
CN116260995A (en) * 2021-12-09 2023-06-13 上海幻电信息科技有限公司 Method for generating media directory file and video presentation method

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101763354A (en) * 2008-12-25 2010-06-30 盛大计算机(上海)有限公司 Movie screen searching method
MX349609B (en) * 2013-09-13 2017-08-04 Arris Entpr Llc Content based video content segmentation.
CN103686200A (en) * 2013-12-27 2014-03-26 乐视致新电子科技(天津)有限公司 Intelligent television video resource searching method and system
CN106504556A (en) * 2015-09-07 2017-03-15 深圳市京华信息技术有限公司 A kind of speech polling and the method and system of report real-time road
US10331312B2 (en) * 2015-09-08 2019-06-25 Apple Inc. Intelligent automated assistant in a media environment
CN105892827A (en) * 2015-11-24 2016-08-24 乐视网信息技术(北京)股份有限公司 Episode content display method and apparatus
CN106911967B (en) * 2017-02-27 2022-04-15 北京小米移动软件有限公司 Live broadcast playback method and device
CN107071542B (en) * 2017-04-18 2020-07-28 百度在线网络技术(北京)有限公司 Video clip playing method and device
CN107147949B (en) * 2017-05-05 2020-05-05 中广热点云科技有限公司 Live broadcast time shifting playing progress control method and system
CN107025312A (en) * 2017-05-19 2017-08-08 北京金山安全软件有限公司 Information providing method and device based on video content
CN108924604A (en) * 2018-08-22 2018-11-30 百度在线网络技术(北京)有限公司 Method and apparatus for playing video
CN109218835B (en) * 2018-09-30 2020-04-14 百度在线网络技术(北京)有限公司 Essence video generation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN109947993A (en) 2019-06-28

Similar Documents

Publication Publication Date Title
CN109947993B (en) Plot skipping method and device based on voice recognition and computer equipment
US11581021B2 (en) Method and apparatus for locating video playing node, device and storage medium
CN107071542B (en) Video clip playing method and device
US9799375B2 (en) Method and device for adjusting playback progress of video file
CN109348275B (en) Video processing method and device
US8737817B1 (en) Music soundtrack recommendation engine for videos
US20140172429A1 (en) Local recognition of content
KR20190139751A (en) Method and apparatus for processing video
US9049418B2 (en) Data processing apparatus, data processing method, and program
CN107943877B (en) Method and device for generating multimedia content to be played
US9430474B2 (en) Automated multimedia content recognition
US20150161204A1 (en) Interactive system, server and control method thereof
US11854529B2 (en) Method and apparatus for generating hint words for automated speech recognition
US20230280966A1 (en) Audio segment recommendation
CN109600625A (en) A kind of program searching method, device, equipment and medium
CN109889921B (en) Audio and video creating and playing method and device with interaction function
US11205430B2 (en) Method and apparatus for generating hint words for automated speech recognition
CN110971983B (en) Video question answering method, equipment and storage medium
US20210026884A1 (en) Filtering video content items
US20210407166A1 (en) Meme package generation method, electronic device, and medium
CN110035298B (en) Media quick playing method
US11700285B2 (en) Filtering video content items
CN112163078A (en) Intelligent response method, device, server and storage medium
US20220121827A1 (en) Stable real-time translations of audio streams
EP3800632A1 (en) Method and apparatus for generating hint words for automated speech recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211013

Address after: 100176 101, floor 1, building 1, yard 7, Ruihe West 2nd Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Applicant after: Apollo Zhilian (Beijing) Technology Co.,Ltd.

Address before: Unit D, Unit 3, 301, Productivity Building No. 5, High-tech Secondary Road, Nanshan District, Shenzhen City, Guangdong Province

Applicant before: BAIDU INTERNATIONAL TECHNOLOGY (SHENZHEN) Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant