CN110337041B - Video playing method and device, computer equipment and storage medium - Google Patents

Video playing method and device, computer equipment and storage medium

Info

Publication number
CN110337041B
CN110337041B (application CN201910627928.5A)
Authority
CN
China
Prior art keywords
video
keyword
playing
application
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910627928.5A
Other languages
Chinese (zh)
Other versions
CN110337041A (en)
Inventor
蒋伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910627928.5A priority Critical patent/CN110337041B/en
Publication of CN110337041A publication Critical patent/CN110337041A/en
Application granted granted Critical
Publication of CN110337041B publication Critical patent/CN110337041B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47217End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content

Abstract

The application relates to a video playing method and apparatus, a computer device, and a storage medium. The method comprises the following steps: playing reading audio, wherein the reading audio is voice audio of a target article read by a human voice; acquiring a first keyword in the target article, wherein the first keyword is a keyword corresponding to the playing progress of the reading audio; and sending a video playing instruction to a video application according to the first keyword, wherein the video playing instruction is used for instructing the video application to play a video corresponding to the first keyword. The method and the apparatus create the scenes in the target article in video form, thereby expanding the ways of creating the mood of an article and improving the expression effect of the scenes described by the text.

Description

Video playing method and device, computer equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of audio and video application, in particular to a video playing method and device, computer equipment and a storage medium.
Background
The audio electronic book is a new electronic book reading application, which can play the text content in the article in the form of real voice or electronic synthesized voice.
In the related art, a talking e-book usually relies on vocal techniques, such as varying the pace of speech and inserting pauses, to evoke the scene described by the text and engage the listener. For example, a provider of a talking e-book has a professional voice actor read an article aloud and records the voice actor's reading to obtain the reading audio.
However, in the solutions shown in the related art, a scene can only be constructed through the voice actor's delivery during reading. The range of means for creating the mood corresponding to an article is therefore narrow, which results in a poor expression effect for the scenes described by the text.
Disclosure of Invention
The embodiments of the application provide a video playing method and apparatus, a computer device, and a storage medium, which can expand the ways of creating the mood of an article and improve the expression effect of the scenes described by the text. The technical scheme is as follows:
in one aspect, a video playing method is provided, where the method is performed by a computer device, and the method includes:
playing reading audio, wherein the reading audio is voice audio of a target article read by human voice;
acquiring a first keyword in the target article, wherein the first keyword is a keyword corresponding to the playing progress of the reading audio;
and sending a video playing instruction to a video application according to the first keyword, wherein the video playing instruction is used for indicating the video application to play a video corresponding to the first keyword.
In another aspect, a video playing method is provided, where the method is performed by a computer device, and the method includes:
playing reading audio, wherein the reading audio is voice audio of a target article read by human voice;
displaying a text interface, wherein the text interface comprises a text corresponding to the playing progress of the reading audio;
when receiving a video playing triggering operation executed on a text displayed in the text interface, triggering a video application to play a video corresponding to a first keyword in the target article, wherein the first keyword is a keyword corresponding to the playing progress of the reading audio;
and pausing the playing of the reading audio in the process that the video application plays the video corresponding to the first keyword.
In another aspect, a video playing apparatus is provided, which is used in a computer device, and includes:
the audio playing module is used for playing reading audio, where the reading audio is voice audio of a target article read by a human voice;
the keyword acquisition module is used for acquiring a first keyword in the target article, wherein the first keyword is a keyword corresponding to the playing progress of the reading audio;
and the instruction sending module is used for sending a video playing instruction to a video application according to the first keyword, wherein the video playing instruction is used for indicating the video application to play a video corresponding to the first keyword.
Optionally, the keyword obtaining module is configured to,
acquiring a current playing time stamp of the reading audio;
and acquiring the first keyword corresponding to the current playing time stamp.
Optionally, when obtaining the first keyword corresponding to the current playing time stamp, the keyword obtaining module is configured to,
acquiring an appointed playing time period according to the current playing time stamp, wherein the appointed playing time period is a time period after the current playing time stamp whose distance from the current playing time stamp is an appointed time length;
and acquiring the keywords of the target article, the reading time of which is within the appointed playing time period, as the first keywords.
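This optional lookup can be sketched as below; the per-keyword reading times and the ten-second appointed time length are assumed example values, not part of the disclosure.

```python
# Sketch of selecting keywords whose reading time falls inside the
# appointed playing time period: a window of LOOKAHEAD seconds starting
# at the current playing timestamp. Data and names are illustrative.
LOOKAHEAD = 10.0  # appointed time length, in seconds

# Hypothetical (reading_time, keyword) pairs pre-marked for the target article.
ARTICLE_KEYWORDS = [(8.0, "forest"), (34.0, "storm"), (41.5, "battle")]

def first_keywords(current_ts: float, lookahead: float = LOOKAHEAD) -> list[str]:
    """Return keywords whose reading time lies in [current_ts, current_ts + lookahead)."""
    start, end = current_ts, current_ts + lookahead
    return [kw for t, kw in ARTICLE_KEYWORDS if start <= t < end]

print(first_keywords(33.0))  # -> ['storm', 'battle']
```

Looking ahead of the current timestamp gives the search and download steps time to finish before the marked passage is actually read aloud.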
Optionally, the keyword obtaining module is configured to,
determining a target paragraph in the target article corresponding to the playing progress of the reading audio;
and acquiring the keywords contained in the target paragraph as the first keyword.
Optionally, the keyword obtaining module is configured to,
displaying a text interface, wherein the text interface comprises a text corresponding to the playing progress of the reading audio;
and when receiving a video playing triggering operation executed on the text displayed in the text interface, acquiring the first keyword corresponding to the video playing triggering operation.
Optionally, the instruction sending module is configured to,
Sending a first playing instruction containing the first keyword to the video application, wherein the first playing instruction is used for indicating the video application to search for and play videos corresponding to the first keyword;
alternatively,
searching the video corresponding to the first keyword to obtain a search result; and sending a second playing instruction containing the search result to the video application, wherein the second playing instruction is used for indicating the video application to play the video corresponding to the first keyword according to the search result.
Optionally, before searching for the video corresponding to the first keyword and obtaining the search result, the instruction sending module is further configured to,
acquiring a second keyword in the target article, wherein the second keyword is a keyword in the context of the paragraph where the first keyword is located;
when searching the video corresponding to the first keyword and obtaining the search result, the instruction sending module is used for,
searching by taking the first keyword as a search keyword to obtain at least one matched video;
and screening the at least one video according to the second keyword to obtain a video corresponding to the first keyword.
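The two-stage search and screening just described can be sketched as follows. The in-memory `VIDEO_LIBRARY` and its tag fields are illustrative stand-ins for a real video search backend.

```python
# Sketch of the two-stage search: first search by the first keyword to
# obtain at least one matched video, then screen the candidates using the
# second (context) keyword. All data and names are illustrative.
VIDEO_LIBRARY = [
    {"title": "battle of five armies", "tags": ["battle", "war", "fantasy"]},
    {"title": "naval battle documentary", "tags": ["battle", "sea"]},
    {"title": "cooking show", "tags": ["food"]},
]

def search_videos(first_keyword: str) -> list[dict]:
    """Stage 1: search with the first keyword as the search keyword."""
    return [v for v in VIDEO_LIBRARY if first_keyword in v["tags"]]

def screen_videos(candidates: list[dict], second_keyword: str) -> list[dict]:
    """Stage 2: screen the candidates by the context (second) keyword."""
    return [v for v in candidates if second_keyword in v["tags"]]

hits = screen_videos(search_videos("battle"), "fantasy")
print([v["title"] for v in hits])  # -> ['battle of five armies']
```

Screening by the context keyword disambiguates a first keyword that matches many unrelated videos, so the played video better fits the scene of the paragraph.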
Optionally, the first playing instruction further includes a second keyword, where the second keyword is a keyword in the context of the paragraph where the first keyword is located.
Optionally, the apparatus further comprises:
and the pause module is used for pausing the playing of the reading audio in the process that the video application plays the video corresponding to the first keyword.
Optionally, the apparatus further comprises:
the triggering mode obtaining module is used for obtaining the triggering mode of the first keyword before the pause module pauses the playing of the reading audio, and the triggering mode comprises user operation triggering or automatic triggering;
the pause module is used for executing the step of pausing the playing of the reading audio in the process of playing the video corresponding to the first keyword by the video application when the triggering mode is the user operation triggering.
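The trigger-mode condition on the pause module can be sketched as below; the mode constants and the `AudioPlayer` stub are assumed names for illustration.

```python
# Sketch of the pause logic: the reading audio is paused while the video
# plays only when the first keyword was triggered by a user operation,
# not when it was triggered automatically. Names are illustrative.
USER_TRIGGER, AUTO_TRIGGER = "user", "auto"

class AudioPlayer:
    """Stand-in for the reading-audio player."""
    def __init__(self) -> None:
        self.paused = False
    def pause(self) -> None:
        self.paused = True

def on_video_playback_started(trigger_mode: str, player: AudioPlayer) -> None:
    # Pause the reading audio only for user-operation triggers.
    if trigger_mode == USER_TRIGGER:
        player.pause()

p = AudioPlayer()
on_video_playback_started(AUTO_TRIGGER, p)
print(p.paused)  # -> False
on_video_playback_started(USER_TRIGGER, p)
print(p.paused)  # -> True
```

The asymmetry reflects intent: a user who explicitly asked for the video wants to watch it, while an automatically triggered video should not interrupt background listening.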
Optionally, the computer device is a terminal, and the video application is an application in the terminal;
alternatively,
the video application is an application installed in a terminal other than the terminal.
In another aspect, a computer device is provided, the computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the video playback method as described above.
In another aspect, a computer-readable storage medium is provided, in which at least one instruction, at least one program, a set of codes, or a set of instructions is stored, which is loaded and executed by a processor to implement the video playback method as described above.
The technical scheme provided by the application can comprise the following beneficial effects:
by acquiring the first keyword corresponding to the playing progress of the reading audio when the reading audio is played and indicating the video application to play the video corresponding to the first keyword, the scene in the target article is created in a video form, so that the method for creating the scene of the article is expanded, and the expression effect of the scene described by the characters is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a system architecture diagram of a video playback system to which various embodiments of the present application relate;
FIG. 2 is a flow diagram illustrating a method of video playback in accordance with an exemplary embodiment;
FIG. 3 is a schematic diagram of a video playing flow according to the embodiment shown in FIG. 2;
FIG. 4 is a flow diagram illustrating a method of video playback in accordance with an exemplary embodiment;
fig. 5 is a schematic view of a video playing flow according to the embodiment shown in fig. 4;
FIG. 6 is a schematic diagram of the playing logic of a video and the reading audio according to the embodiment shown in FIG. 4;
FIG. 7 is a flow diagram illustrating a method of video playback in accordance with an exemplary embodiment;
fig. 8 is a schematic view of a video playing flow according to the embodiment shown in fig. 7;
FIG. 9 is a schematic diagram illustrating an audio playback flow of a talking ebook application, according to an illustrative embodiment;
fig. 10 is a block diagram showing the structure of a video playback apparatus according to an exemplary embodiment;
FIG. 11 is a block diagram illustrating a computer device in accordance with an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The embodiment of the application provides a scheme for playing videos in the process of playing reading audio. The scheme can, according to the reading progress of the reading audio, control the playing of a video matched with the current reading progress, so as to create a mood that is more intuitive than sound alone.
Referring to fig. 1, a system architecture diagram of a video playback system according to various embodiments of the present application is shown. As shown in fig. 1, the video playing system includes a first terminal 110 and a server 120.
The first terminal 110 may be a terminal device having a sound playing function. For example, the first terminal 110 may be a mobile phone, a tablet computer, an electronic book reader, smart glasses, a smart watch, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a laptop portable computer, a desktop computer, and the like; alternatively, the first terminal 110 may also be a smart home device, such as a smart television or a smart television set-top box.
The first terminal 110 contains a talking e-book application.
Optionally, the first terminal 110 further includes a video application.
Optionally, the video playing system further includes a second terminal 130 connected to the first terminal 110 through a communication network. The second terminal 130 contains a video application.
The second terminal 130 and the first terminal 110 may be the same type of terminal, for example, the first terminal 110 and the second terminal 130 may both be mobile phones. Alternatively, the second terminal 130 and the first terminal 110 may be different types of terminals; for example, the first terminal 110 may be a mobile phone, and the second terminal 130 may be a smart television or a smart television set-top box.
The server 120 may be a server, a server cluster composed of several servers, a virtualization platform, or a cloud computing service center.
Optionally, the server 120 includes a server corresponding to the talking e-book application.
Optionally, the server 120 includes a server corresponding to the video application.
The first terminal 110 and the server 120 are connected through a communication network. Optionally, the communication network is a wired network or a wireless network.
Optionally, the wireless network or wired network described above uses standard communication techniques and/or protocols. The network is typically the Internet, but may be any network, including but not limited to a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile, wireline, or wireless network, a private network, or any combination of virtual private networks. In some embodiments, data exchanged over the network is represented using techniques and/or formats including Hypertext Markup Language (HTML), Extensible Markup Language (XML), and the like. All or some of the links may also be encrypted using conventional encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), or Internet Protocol Security (IPsec). In other embodiments, custom and/or dedicated data communication techniques may also be used in place of, or in addition to, the data communication techniques described above.
People perceive mainly through hearing and vision. Existing talking e-books present article content to users through hearing (reading audio) rather than vision (a text interface). Hearing can enrich a person's imagination: while listening to the reading audio, a user can picture the scenes in a book and become immersed in them. Many times, however, visual impact can embody the scene of an article more vividly than the user's imagination.
The scheme shown in the subsequent embodiments of the application combines hearing (reading audio) and vision (video). In most usage scenes, a user can imagine the scenes in an article through hearing alone. For some scenes, however, such as the Battle of the Five Armies in The Lord of the Rings, it is difficult to achieve visual impact through the user's own imagination and the cadence of the reading. In this case, the terminal can optimize the reading experience through video: for example, by searching videos with the text, a movie clip of the Battle of the Five Armies, or another war clip with high impact, can be found and played, so that the user deepens their understanding of the text and the expression effect of the scene described by the article is improved. The hearing + vision product form provided by the application enriches the user's sensory experience in an all-round way, brings the user a product experience superior to competing products, and can expand the influence and market share of the product.
Fig. 2 is a flowchart illustrating a video playing method according to an exemplary embodiment, where the video playing method may be executed by a computer device, for example, the method may be executed by a terminal (e.g., a first terminal) in the system shown in fig. 1, or the method may be executed by a server in the system shown in fig. 1, or the method may be executed by both the terminal and the server in the system shown in fig. 1. In the embodiment of the present application, the video playing method is executed by a terminal as an example, as shown in fig. 2, the method may include the following steps:
and step 21, playing the reading audio, wherein the reading audio is the voice audio of the target article read by the human voice.
Step 22, obtaining a first keyword in the target article, where the first keyword is a keyword corresponding to the playing progress of the reading audio.
And step 22, sending a video playing instruction to the video application according to the first keyword, wherein the video playing instruction is used for instructing the video application to play the video corresponding to the first keyword.
In the embodiment of the present application, the video application may be a video application installed in a terminal (such as the first terminal in fig. 1) that plays read audio, or the video application may also be an application installed in another terminal (such as the second terminal in fig. 1) besides the terminal.
For example, in one possible implementation manner, an application interface a of the talking e-book application is shown on the screen of the terminal, and the talking e-book application plays the reading audio under application interface a. After the talking e-book application acquires a first keyword corresponding to the playing progress of the current reading audio, it can send an instruction to a video application installed in the terminal. The screen of the terminal then jumps to display an application interface b of the video application, and the video application plays the video corresponding to the first keyword in application interface b.
Alternatively, in another possible implementation manner, the terminal may instruct a video application installed in another terminal to play the video corresponding to the first keyword. For example, the talking e-book application in terminal A plays the reading audio under application interface a. After the talking e-book application acquires a first keyword corresponding to the playing progress of the current reading audio, it sends an instruction to the video application in terminal B. The screen of terminal B jumps to display an application interface c of the video application, and the video application plays the video corresponding to the first keyword in application interface c.
For example, please refer to fig. 3, which shows a schematic diagram of a video playing process according to an embodiment of the present application. When the video application is installed in a terminal other than the terminal playing the reading audio, taking a smart television as an example of the other terminal, the scheme shown in fig. 3 may be as follows:
31) In the process of playing the reading audio, the terminal judges whether it is connected to the smart television; if so, the process enters step 32); otherwise, the process ends.
32) The terminal judges whether video search is triggered; if so, the process enters step 33); otherwise, the process ends.
33) The video application in the smart television searches for the video; if a video is found, the process enters step 34); otherwise, the process ends.
34) The video application in the smart television plays the found video.
For example, in the process of listening to a book, the talking e-book application automatically discovers a video application installed in a smart television in the living room. When the user feels that plain narration is not enough for a certain reading scene, or when the terminal automatically triggers video playing, the operation of searching for a video based on the text can be triggered. The talking e-book application sends an instruction to the video application side in the smart television, and the video application displays the video that best fits the scene according to the instruction, for the user to enjoy.
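The decision flow in steps 31) through 34) can be sketched as follows; the connection check, trigger check, and search are stubbed with plain values and a callable, purely for illustration.

```python
# Sketch of the decision flow in steps 31)-34): check the smart-TV
# connection, check whether video search is triggered, search, then play.
# All checks are stubbed for illustration.
def run_flow(tv_connected: bool, search_triggered: bool, search_fn) -> str:
    if not tv_connected:          # step 31): smart TV connected?
        return "ended: no smart TV"
    if not search_triggered:      # step 32): video search triggered?
        return "ended: search not triggered"
    video = search_fn()           # step 33): video application searches
    if video is None:
        return "ended: no video found"
    return f"playing {video}"     # step 34): play the found video

print(run_flow(True, True, lambda: "battle clip"))  # -> playing battle clip
print(run_flow(True, False, lambda: None))          # -> ended: search not triggered
```

Each failed check simply ends the process and leaves the reading audio playing undisturbed.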
To sum up, in the scheme shown in the embodiment of the present application, when the terminal plays the reading audio, the terminal may obtain the first keyword corresponding to the playing progress of the reading audio and instruct the video application to play the video corresponding to the first keyword. This creates the scenes in the target article in video form, thereby expanding the ways of creating the mood of an article and improving the expression effect of the scenes described by the text.
In the scheme shown in the embodiment shown in fig. 2, the corresponding video needs to be searched according to the first keyword, and the process of searching the corresponding video according to the first keyword can be executed by the video application, that is, the video application searches for the corresponding video according to the first keyword and plays the video; alternatively, the process of searching for the corresponding video according to the first keyword may be executed outside the video application (for example, executed by an audio electronic book application), that is, after the terminal searches for the corresponding video according to the first keyword, the terminal notifies the video application of the search result, and the video application plays the video. The following embodiments of the present application will respectively describe implementation manners of these two video searches.
Fig. 4 is a flowchart illustrating a video playing method according to an exemplary embodiment, where the video playing method may be executed by a computer device, for example, the method may be executed by a terminal (e.g., a first terminal) in the system shown in fig. 1, or the method may be executed by a server of the system shown in fig. 1, or the method may be executed by both the terminal and the server in the system shown in fig. 1. In the embodiment of the present application, the video playing method is executed by a terminal, and a process of searching for a corresponding video according to a first keyword is executed by a video application as an example, as shown in fig. 4, the method may include the following steps:
step 401, playing a reading audio, where the reading audio is a voice audio of a human voice reading target article.
In this embodiment of the application, a talking e-book application may be installed in the terminal. When the talking e-book application runs in the foreground, the user may select, in an application interface of the talking e-book application, the reading audio of the target article to play. The terminal may then obtain the reading audio of the target article selected by the user, either locally or from a server corresponding to the talking e-book application, and play it.
Step 402, obtaining a first keyword in the target article, where the first keyword is a keyword corresponding to the playing progress of the reading audio.
In the embodiment of the application, when using the talking e-book, the user can manually trigger the video search step (that is, trigger acquisition of the first keyword in the target article) whenever the user wants to watch a video, thereby triggering video playing to create the mood of the article. Many times, however, the user listens to the book while doing other things, and a scheme in which video playing is triggered entirely by manual user operation cannot meet the user's needs. To meet different user needs, the scheme shown in the embodiment of the application may provide the user with a switch for automatically triggering video playing. When the user turns the switch on, the terminal can automatically trigger the video search process through pre-marked text points (namely keywords) or automatically extracted text points, and thus automatically display the played video to the user; when the user turns the switch off, video playing is actively triggered by the user.
The method for the terminal to obtain the first keyword may include, but is not limited to, the following three methods:
First, the terminal can directly acquire the corresponding first keyword according to the playing time stamp.
For example, in the process of playing the reading audio, the terminal may obtain a current playing time stamp of the reading audio; and obtaining the first keyword corresponding to the current playing time stamp.
In a possible implementation manner, when operation and maintenance personnel of the audio e-book application mark relevant keywords in the target article in advance, each marked keyword can be bound to the playing timestamp at which a video search should be triggered by that keyword. When the timestamp currently being played by the terminal is a timestamp bound to a keyword, the terminal can directly acquire the keyword bound to the current playing timestamp as the first keyword.
When acquiring the keyword bound to the current playing timestamp as the first keyword, the terminal can query the first keyword according to a correspondence between timestamps and keywords stored locally in the terminal, where the correspondence may be downloaded from the server and stored locally when the audio e-book application downloads the audio file of the reading audio.
Alternatively, the correspondence between timestamps and keywords may be stored in a server of the audio e-book application; the terminal may send a query instruction containing the current timestamp to that server, and the server queries the first keyword from the correspondence and feeds it back to the terminal.
In another possible implementation manner, when obtaining the first keyword corresponding to the current playing timestamp, the terminal may obtain a specified playing time period according to the current playing timestamp, where the specified playing time period is a time period after the current playing timestamp whose start is separated from the current playing timestamp by a specified duration; the terminal then acquires, as the first keyword, a keyword of the target article whose reading time falls within the specified playing time period.
Since steps such as searching, downloading, and playing are generally required between triggering the video search and actually playing the video, a certain delay may exist. To match the playing progress with the played video as closely as possible, in the embodiment of the present application, when video playing is triggered automatically (for example, according to a preset time interval) or manually by the user, the terminal may obtain the first keyword from a period of time shortly after the current playing timestamp, thereby triggering the video search in advance, so that the situation created by the video matches the playing progress of the reading audio at the moment the video is played, improving the effect of creating the situation.
For example, when operation and maintenance personnel of the audio e-book application mark relevant keywords in the target article in advance, each marked keyword may be bound to the playing timestamp of the keyword in the reading audio. During playing of the reading audio, the terminal periodically obtains (for example, once every 10 seconds) the current playing timestamp through the audio e-book reader; assume the current playing timestamp is 00:56:20. The terminal then obtains, according to the current playing timestamp, a specified playing time period with a length of 10 s whose start is 5 s (the specified duration) after the current playing timestamp, that is, the period (00:56:25, 00:56:35), and acquires, as the first keyword, the keyword whose playing timestamp in the reading audio falls within (00:56:25, 00:56:35).
The above scheme is described taking as an example operation and maintenance personnel manually marking keywords in the target article in advance; optionally, the keywords in the target article may also be marked automatically by a pre-trained machine learning model.
The specified duration and the duration of the specified playing time period may be durations preset by developers or users.
Alternatively, the specified duration may be determined by the terminal according to its delay with the server of the audio e-book application.
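The timestamp-based lookup above can be sketched as follows. This is a minimal illustration only: the keyword table, the 5 s offset, and the 10 s window length are assumed example values matching the 00:56:20 example, not part of the patent's text.

```python
SPECIFIED_OFFSET = 5.0   # seconds between "now" and the start of the window
WINDOW_LENGTH = 10.0     # length of the specified playing time period

# Pre-embedded correspondence between playing timestamps (in seconds)
# and marked keywords, e.g. downloaded together with the audio file.
KEYWORD_TABLE = [
    (3386.0, "battlefield"),  # bound at 00:56:26
    (3392.0, "gun battle"),   # bound at 00:56:32
    (3400.0, "victory"),      # bound at 00:56:40
]

def first_keywords(current_ts: float) -> list[str]:
    """Return keywords whose reading time falls inside the specified
    playing time period (start, start + WINDOW_LENGTH)."""
    start = current_ts + SPECIFIED_OFFSET
    end = start + WINDOW_LENGTH
    return [kw for ts, kw in KEYWORD_TABLE if start < ts < end]

# Current playing timestamp 00:56:20 -> window (00:56:25, 00:56:35)
print(first_keywords(3380.0))  # → ['battlefield', 'gun battle']
```

Triggering the lookup a few seconds ahead of the playhead, as here, is what compensates for the search-and-download delay described above.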
Secondly, the terminal can obtain the corresponding keyword according to the current paragraph.
For example, in the process of playing the reading audio, the terminal may determine a target paragraph in the target article corresponding to the playing progress of the reading audio; and acquiring the keywords contained in the target paragraph as the first keyword.
In a possible implementation manner, when the audio e-book application plays the reading audio, if video playing is triggered automatically (for example, when the reading audio of a new paragraph starts to be played) or is triggered manually by the user, the terminal may determine the paragraph of the target article currently being played and obtain the first keyword from the determined paragraph.
The first keyword may be a keyword extracted from a paragraph by a terminal, or a keyword pre-marked in the paragraph by an operation and maintenance person of the audio electronic book application.
For example, in one possible implementation manner, after determining a paragraph corresponding to the current playing progress of the reading audio, the terminal extracts a text in the paragraph, and extracts the first keyword from the paragraph through a keyword extraction algorithm.
Alternatively, when operation and maintenance personnel of the audio e-book application mark relevant keywords in the target article in advance, each marked keyword may be bound to the mark of the paragraph in which it is located. After the terminal determines the paragraph corresponding to the current playing progress of the reading audio, the first keyword may be queried locally or from the server according to the paragraph mark corresponding to the current playing progress.
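The paragraph-based approach can be sketched as below. The paragraph boundaries, time ranges, and keyword bindings are hypothetical illustration data; in the actual scheme they would come from the e-book background's pre-embedded marks.

```python
# Each entry maps a paragraph mark to the audio time range in which that
# paragraph is read and to its pre-marked keywords (illustration data).
PARAGRAPHS = [
    {"mark": "p1", "start": 0.0,  "end": 95.0,  "keywords": ["harbor"]},
    {"mark": "p2", "start": 95.0, "end": 210.0, "keywords": ["storm", "shipwreck"]},
]

def keywords_for_progress(current_ts: float) -> list[str]:
    """Determine the target paragraph for the current playing progress
    and return its pre-marked keywords as the first keyword(s)."""
    for p in PARAGRAPHS:
        if p["start"] <= current_ts < p["end"]:
            return p["keywords"]
    return []

print(keywords_for_progress(120.0))  # → ['storm', 'shipwreck']
```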
Thirdly, the terminal can also determine the corresponding keyword according to a user operation.
For example, in the process of playing the reading audio, the terminal may display a text interface, where the text interface includes a text corresponding to the playing progress of the reading audio; and when receiving a video playing triggering operation executed on the text displayed in the text interface, acquiring the first keyword corresponding to the video playing triggering operation.
In a possible implementation manner, the first keyword may be selected by the user. For example, in the text interface, the portion marked as the keyword may be highlighted (for example, underlined or with a background color), at this time, if the user needs to trigger the video playing, the highlighted text may be clicked, and the terminal acquires the text corresponding to the user click operation as the first keyword.
Or, in another possible implementation manner, the user may directly select a text in the text interface through a text selection operation, for example, after the user selects the text in the text interface, a menu bar pops up, the menu bar may include a "search video" option, and if the user clicks the "search video" option, the terminal may acquire the selected text as the first keyword, or extract the first keyword from the selected text through a keyword extraction algorithm.
Step 403, sending a first playing instruction containing the first keyword to the video application, where the first playing instruction is used to instruct the video application to search for and play a video corresponding to the first keyword.
The video application is an application in the terminal that plays the reading audio; alternatively, the video application is an application installed in a terminal other than that terminal.
In the embodiment of the application, the video application can provide a calling interface to the outside. The audio e-book application can call this interface and send the first keyword to the video application through an instruction; after receiving the instruction, the video application searches for a matching video from the server corresponding to the video application according to the first keyword and plays the video.
Taking as an example that each keyword in the target article is pre-marked in the background, and that the video application is installed in a smart TV, the whole software framework for playing the video may include three parts, and the scheme disclosed in this application needs to interconnect all three. The functions of the three parts are mainly as follows:
1) E-book background: in the e-book background, operation and maintenance personnel manually, or a machine learning model automatically, mark the text content in the target article that the user may be interested in and that warrants a video search, pre-embedding keywords, for example, generating and storing the correspondence between keywords and timestamps/paragraph marks.
2) Audio e-book application: the audio e-book application receives the e-book content (including the reading audio of the target article) sent by the e-book background, and triggers the video search according to a manual operation of the user, or triggers it automatically.
3) Video application in the smart TV: the video application may receive a text-search-video command, perform the search, and display the video content for playback to the user.
Please refer to fig. 5, which shows a video playing flow diagram according to an embodiment of the present application. As shown in fig. 5, the e-book background provides the reading audio of the target article to the audio e-book application in the terminal (S51). The audio e-book application in the terminal starts playing the reading audio (S52). When the audio e-book application in the terminal triggers a video search, the first keyword is acquired (S53). The audio e-book application sends an instruction containing the first keyword to the video application in the smart TV (S54), and the video application searches for a video according to the first keyword (S55) and plays the video when one is found (S56).
The above scheme enriches the use scenarios of the smart TV in the living room while improving the audio reading experience, and creates a new traffic entrance for the smart TV: the video application is pulled up through the audio novel, and with a certain probability the user will continue to use the smart TV after watching the clip.
In the embodiment of the application, the video application in the smart TV can be extended to any video application product; for example, the video application products interfacing with the audio e-book application can be extended from the single living-room scenario to scenarios covering all video applications.
Optionally, the first playing instruction further includes a second keyword, where the second keyword is a keyword in the context of the paragraph where the first keyword is located. The video application may search for videos corresponding to the first keyword to obtain a search result, and then filter the at least one searched video according to the second keyword to obtain the video corresponding to the first keyword.
In the embodiment of the application, the search according to the first keyword may return multiple videos, or may return a video that corresponds to the first keyword but does not match the context of the target article. To improve the accuracy of the video search and of the situation creation, the scheme shown in the application may further send a second keyword corresponding to the context of the target article to the video application, which uses the second keyword to filter a suitable video from the search result of the first keyword for playing.
For example, assume that the first keyword is "battlefield" and the second keyword is "gun battle". The video application searches according to the first keyword "battlefield", and the obtained search result includes video clip 1 of a modern war and video clip 2 of a cold-weapon war; the video application then screens out video clip 1, through the second keyword "gun battle", as the video corresponding to the first keyword.
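The "battlefield"/"gun battle" filtering step can be sketched as follows. The video metadata and the tag-matching rule are invented for illustration; any metadata the video server exposes (titles, tags, descriptions) could serve as the match target for the second keyword.

```python
SEARCH_RESULTS = [
    {"id": "clip1", "title": "modern war", "tags": ["battlefield", "gun battle"]},
    {"id": "clip2", "title": "cold-weapon war", "tags": ["battlefield", "swords"]},
]

def filter_by_second_keyword(results: list[dict], second_kw: str) -> list[dict]:
    """Keep only videos whose metadata also matches the second keyword;
    fall back to the unfiltered results if nothing matches."""
    filtered = [v for v in results if second_kw in v["tags"]]
    return filtered or results

print([v["id"] for v in filter_by_second_keyword(SEARCH_RESULTS, "gun battle")])
# → ['clip1']
```

The fallback to the unfiltered list is a design choice of this sketch: a context mismatch then degrades to the plain first-keyword result instead of playing nothing.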
Optionally, the playing of the reading audio is paused while the video application plays the video corresponding to the first keyword.
In the embodiment of the application, to improve the mood-creating effect of the video, the terminal can pause playing the reading audio while the video is playing. Alternatively, the terminal can continue playing the reading audio while the video is playing.
The setting interface of the audio e-book application may include a switch controlling whether to pause the reading audio when a video is played. If the user turns the switch on, the terminal pauses the reading audio while the video is playing; if the user turns it off, the terminal continues playing the reading audio while the video is playing.
Optionally, before pausing the reading audio while the video application plays the video corresponding to the first keyword, the terminal may acquire the trigger mode of the first keyword, where the trigger mode is user-operation trigger or automatic trigger; when the trigger mode is user-operation trigger, the terminal executes the step of pausing the reading audio while the video application plays the video corresponding to the first keyword.
In another possible implementation manner, the audio e-book application in the terminal may also automatically decide, according to the trigger mode of the video playing, whether to pause the reading audio while the video is playing.
In the embodiment of the application, the logic processing during video search can be divided into automatic triggering and manual triggering, and the subsequent logic processing can be different for different search triggers.
For example, please refer to fig. 6, which illustrates a schematic diagram of the playing logic of video and audio according to an embodiment of the present application. As shown in fig. 6, the audio e-book application in the terminal plays the reading audio of the target article (S61). The audio e-book application judges whether the user has manually triggered a video search (S62); if so, playing of the reading audio is suspended (S63) and the video application is instructed to search for a video (S64). If the user has not triggered the search manually, the application judges whether the search is triggered automatically (S65); if so, the flow enters step S64, otherwise it returns to step S61. After S64, the video application judges whether a video is found (S66); if so, it acquires the video playing mode (that is, playing with sound or silent playing), plays the found video in that mode (S67), and returns to S61 after the video finishes; otherwise it returns to S61 directly.
In the above scheme, for a manual click trigger by the user, the terminal can pause the voice reading. Because the video content mainly serves to enrich the text product and create a three-dimensional, all-around experience, the searched video is silent by default and only its pictures are played, while the sound defaults to the reading audio of the app. The terminal can also provide options for the user to choose whether to turn on the video sound and whether to pause the text voice when the video is played, so that the user can make a personalized choice.
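The trigger-dependent branch of fig. 6 can be condensed into a small decision sketch: on a manual trigger the reading audio is paused before the search, while on an automatic trigger it keeps playing. The function and field names are stand-ins, not part of the patent.

```python
def handle_trigger(trigger_mode: str, pause_on_manual: bool = True) -> dict:
    """Return the actions the terminal takes for a given trigger mode.

    trigger_mode: 'manual' (user operation) or 'auto' (automatic trigger).
    pause_on_manual: user-configurable switch for pausing the reading audio.
    """
    actions = {"search_video": True, "pause_reading": False}
    if trigger_mode == "manual" and pause_on_manual:
        actions["pause_reading"] = True  # S63 in fig. 6
    return actions

print(handle_trigger("manual"))  # → {'search_video': True, 'pause_reading': True}
print(handle_trigger("auto"))    # → {'search_video': True, 'pause_reading': False}
```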
The above scheme is not only applicable to voiced reading; it is also a good supplementary product scheme for ordinary reading. In normal reading, the user needs to concentrate and expend considerable visual and mental effort; by providing an entrance at such a moment, the user can quickly search for a short video that matches the context, which provides a better product experience.
To sum up, in the scheme shown in the embodiment of the present application, when the terminal plays the reading audio, the terminal may obtain the first keyword corresponding to the playing progress of the reading audio, and instruct the video application to search for and play the video corresponding to the first keyword, thereby creating the scene of the target article in video form, expanding the ways of creating the scene of an article, and improving the expressive effect of the scene described by the text.
Fig. 7 is a flowchart illustrating a video playing method according to an exemplary embodiment, where the video playing method may be executed by a computer device, for example, the method may be executed by a terminal (e.g., a first terminal) in the system shown in fig. 1, or the method may be executed by a server of the system shown in fig. 1, or the method may be executed by both the terminal and the server in the system shown in fig. 1. In the embodiment of the present application, the video playing method is executed by a terminal, and a process of searching for a corresponding video according to a first keyword is executed outside a video application, as shown in fig. 7, the video playing method may include the following steps:
Step 701, playing a reading audio, wherein the reading audio is voice audio of a human voice reading the target article.
Step 702, obtaining a first keyword in the target article, where the first keyword is a keyword corresponding to the playing progress of the reading audio.
The execution process of step 701 and step 702 may refer to the description under step 401 and step 402 in the embodiment shown in fig. 3, and is not described herein again.
Step 703, searching for the video corresponding to the first keyword to obtain a search result.
In the embodiment of the present application, the server of the audio e-book application may host or connect to a video database, and the video database may collect and store videos suitable for being triggered while the audio e-book application plays reading audio, for example, videos whose playing duration falls within a specified range, so that a played video is neither too short for the user to visually perceive the scene of the article, nor so long as to interfere with the voiced reading of the subsequent text.
After the audio electronic book application acquires the first keyword, a video corresponding to the first keyword can be searched in a video database corresponding to a server of the audio electronic book application, and a search result is obtained.
The search result may include the searched video file, or the search result may also include an identifier of the searched video, or the search result may also include a network address of the searched video, such as a Uniform Resource Locator (URL).
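The duration constraint on the video database can be sketched as a simple admission filter. The 15 s and 90 s bounds are assumed example values; the patent only specifies that the playing time lies within some specified range.

```python
MIN_DURATION, MAX_DURATION = 15.0, 90.0  # assumed bounds, in seconds

def admit_to_database(candidate_clips: list[dict]) -> list[dict]:
    """Keep only clips whose playing duration lies within the specified range,
    so triggered videos are neither too short nor too long."""
    return [c for c in candidate_clips
            if MIN_DURATION <= c["duration"] <= MAX_DURATION]

clips = [{"id": "a", "duration": 8.0},
         {"id": "b", "duration": 45.0},
         {"id": "c", "duration": 300.0}]
print([c["id"] for c in admit_to_database(clips)])  # → ['b']
```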
Optionally, before searching for the video corresponding to the first keyword and obtaining the search result, the terminal may further obtain a second keyword in the target article, where the second keyword is a keyword in the context of the paragraph where the first keyword is located.
When searching for the video corresponding to the first keyword and obtaining a search result, the terminal takes the first keyword as a search keyword to search for obtaining at least one matched video; and screening the at least one video according to the second keyword to obtain a video corresponding to the first keyword.
Step 704, sending a second playing instruction containing the search result to the video application, where the second playing instruction is used to instruct the video application to play the video corresponding to the first keyword according to the search result.
In the embodiment of the application, the audio e-book application can directly send the search result to the video application, and the video application plays the video corresponding to the first keyword according to the search result.
When the search result contains the video searched by the sound electronic book application, the video application can directly play the video.
Or, when the search result includes an identifier of a video searched by the audio electronic book application, the video application may query and acquire a video corresponding to the first keyword from a server of the audio electronic book application according to the identifier of the video, and play the acquired video.
Or, when the search result includes the URL of the video searched by the audio electronic book application, the video application may pull the video corresponding to the first keyword from the server of the audio electronic book application according to the URL of the video, and play the acquired video.
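The three forms of search result just described (a video file, a video identifier, or a URL) imply a simple dispatch in the video application, sketched below. The dictionary keys and the play/fetch descriptions are hypothetical placeholders.

```python
def resolve_search_result(result: dict) -> str:
    """Return which playback path the video application takes for a result."""
    if "file" in result:
        return "play local file"      # the result contains the video itself
    if "video_id" in result:
        return "query server by id"   # fetch the video by identifier, then play
    if "url" in result:
        return "pull stream from URL" # pull the video from its URL, then play
    raise ValueError("empty search result")

print(resolve_search_result({"url": "https://example.com/clip.mp4"}))
# → pull stream from URL
```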
The manner in which the video application plays the video corresponding to the first keyword may refer to the description in step 403 in the embodiment shown in fig. 4, and is not described herein again.
For example, taking the audio e-book application as an application in the terminal and the video application as an application in a smart TV, please refer to fig. 8, which shows a video playing flow diagram according to an embodiment of the present application. As shown in fig. 8, the e-book background provides the reading audio of the target article to the audio e-book application in the terminal (S81). The audio e-book application in the terminal starts playing the reading audio (S82). The audio e-book application judges whether automatic triggering of video playing is enabled (S83); if so, it acquires the first keyword when video playing is triggered automatically (S84); otherwise, it acquires the first keyword when the user manually triggers video playing (S85). The audio e-book application judges whether a video corresponding to the first keyword is found (S86); if so, it sends an instruction containing the search result to the video application in the smart TV (S87), and the video application plays the video according to the search result (S88); otherwise, the flow returns to S82.
To sum up, in the scheme shown in the embodiment of the present application, when the terminal plays the read audio, the terminal may obtain the first keyword corresponding to the playing progress of the read audio, search for the video corresponding to the first keyword according to the first keyword, and instruct the video application to play the video, thereby implementing creation of a scene in a target article in a video form, expanding a manner of creating the scene of the article, and improving an expression effect of the scene described in the text.
The schemes shown in fig. 2, fig. 4, or fig. 7 can achieve the following user experience: when the user hears a wonderful passage, a short video depicting the textual scene of that moment is accurately played for the user through text-based video retrieval, so that the user can better experience the text content; for example, if the user hears a fighting scene, a wonderful short fighting video is displayed. This enriches the user's senses, and some short videos may also attract the user to watch the complete episode of the video, achieving the purpose of traffic diversion.
For example, please refer to fig. 9, which is a schematic diagram illustrating an audio playing flow of an audio e-book application according to an exemplary embodiment. The audio playing process may be executed by a computer device, for example, the process may be executed by a terminal (e.g., a first terminal) in the system shown in fig. 1, or the process may be executed by a server of the system shown in fig. 1, or the process may be executed by both the terminal and the server in the system shown in fig. 1. In the embodiment of the present application, the audio playing process is executed by the terminal as an example, as shown in fig. 9, when the reading audio of the target article is played, the interface display and operation process of the audio e-book application may be as follows:
step 901, playing a reading audio, where the reading audio is a voice audio of a human reading target article.
Step 902, displaying a text interface, wherein the text interface includes a text corresponding to the playing progress of the reading audio.
Step 903, when receiving a video playing triggering operation executed on the text displayed in the text interface, triggering the video application to play a video corresponding to a first keyword in the target article, where the first keyword is a keyword corresponding to the playing progress of the reading audio.
Step 904, pausing the playing of the read audio while the video application plays the video corresponding to the first keyword.
In the application embodiment, a scenario in which the user manually triggers video playing is taken as an example to introduce the interface and operation of the audio e-book application. For example, while the audio e-book application in the user's mobile phone plays the reading audio, it also displays the text interface corresponding to the text currently being read. When the user wants to see the scene of the current text through a video, the user can trigger the corresponding keyword in the text interface; the audio e-book application then pauses the reading audio, while the video application in a smart TV in the living room connected to the phone, or the video application installed in the terminal itself, starts playing the video corresponding to the keyword. After the video finishes playing, the audio e-book application in the phone resumes playing the reading audio.
Fig. 10 is a block diagram illustrating a structure of a video playback apparatus according to an exemplary embodiment. The video playing device can be used in a computer device to execute all or part of the steps in the above method embodiments of the present application. The video playing apparatus may be used in a terminal (such as the first terminal) in the system shown in fig. 1, or the video playing apparatus may be used in a server of the system shown in fig. 1, or the video playing apparatus may also be distributed in the terminal and the server of the system shown in fig. 1. The video playback apparatus may include:
the audio playing module 1001 is configured to play a reading audio, where the reading audio is a voice audio of a human voice reading target article;
a keyword obtaining module 1002, configured to obtain a first keyword in the target article, where the first keyword is a keyword corresponding to a playing progress of the reading audio;
an instruction sending module 1003, configured to send a video playing instruction to a video application according to the first keyword, where the video playing instruction is used to instruct the video application to play a video corresponding to the first keyword.
Optionally, the keyword obtaining module 1002 is configured to,
acquiring a current playing time stamp of the reading audio;
and acquiring the first keyword corresponding to the current playing time stamp.
Optionally, when acquiring the first keyword corresponding to the current playing timestamp, the keyword obtaining module 1002 is configured to,
acquiring a specified playing time period according to the current playing timestamp, wherein the specified playing time period is a time period after the current playing timestamp whose start is separated from the current playing timestamp by a specified duration;
and acquiring, as the first keyword, a keyword of the target article whose reading time is within the specified playing time period.
Optionally, the keyword obtaining module 1002 is configured to,
determining a target paragraph in the target article corresponding to the playing progress of the reading audio;
and acquiring the keywords contained in the target paragraph as the first keyword.
Optionally, the keyword obtaining module 1002 is configured to,
displaying a text interface, wherein the text interface comprises a text corresponding to the playing progress of the reading audio;
and when receiving a video playing triggering operation executed on the text displayed in the text interface, acquiring the first keyword corresponding to the video playing triggering operation.
Optionally, the instruction sending module 1003 is configured to,
Sending a first playing instruction containing the first keyword to the video application, wherein the first playing instruction is used for indicating the video application to search for and play videos corresponding to the first keyword;
or,
searching the video corresponding to the first keyword to obtain a search result; and sending a second playing instruction containing the search result to the video application, wherein the second playing instruction is used for indicating the video application to play the video corresponding to the first keyword according to the search result.
Optionally, before searching for the video corresponding to the first keyword to obtain the search result, the instruction sending module 1003 is further configured to,
acquiring a second keyword in the target article, wherein the second keyword is a keyword in the context of the paragraph where the first keyword is located;
when searching for a video corresponding to the first keyword to obtain a search result, the instruction sending module 1003 is configured to,
searching by taking the first keyword as a search keyword to obtain at least one matched video;
and screening the at least one video according to the second keyword to obtain a video corresponding to the first keyword.
Optionally, the first playing instruction further includes a second keyword, where the second keyword is a keyword in the context of the paragraph where the first keyword is located.
Optionally, the apparatus further comprises:
and the pause module is used for pausing the playing of the reading audio in the process that the video application plays the video corresponding to the first keyword.
Optionally, the apparatus further comprises:
the triggering mode obtaining module is used for obtaining the triggering mode of the first keyword before the pause module pauses the playing of the reading audio, and the triggering mode comprises user operation triggering or automatic triggering;
the pause module is used for executing the step of pausing the playing of the reading audio in the process of playing the video corresponding to the first keyword by the video application when the triggering mode is the user operation triggering.
Optionally, the video application is an application in the terminal;
or,
the video application is an application installed in a terminal other than the terminal.
To sum up, in the scheme shown in the embodiment of the present application, when the terminal plays the reading audio, the terminal may obtain the first keyword corresponding to the playing progress of the reading audio, and instruct the video application to play the video corresponding to the first keyword, so as to implement creating a scene in a target article in a video form, thereby expanding a manner of creating the scene of the article and improving an expression effect of the scene described in the text.
Fig. 11 shows a block diagram of a computer device 1100 provided in an exemplary embodiment of the present application. The computer device 1100 may be the first terminal, the second terminal, or the server in the system shown in fig. 1.
Generally, the computer device 1100 includes: a processor 1101 and a memory 1102.
Processor 1101 may include one or more processing cores, such as a 4-core or 8-core processor. The processor 1101 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). Processor 1101 may also include a main processor and a coprocessor. In some embodiments, the processor 1101 may be integrated with a GPU (Graphics Processing Unit), and may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
Memory 1102 may include one or more computer-readable storage media, which may be non-transitory. Memory 1102 may also include high-speed random access memory, as well as non-volatile memory such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in memory 1102 is used to store at least one instruction for execution by processor 1101 to implement all or part of the steps of the above-described method embodiments of the present application.
In some embodiments, the computer device 1100 may also optionally include: a peripheral interface 1103 and at least one peripheral. The processor 1101, memory 1102 and peripheral interface 1103 may be connected by a bus or signal lines. Various peripheral devices may be connected to the peripheral interface 1103 by buses, signal lines, or circuit boards. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1104, touch display screen 1105, camera 1106, audio circuitry 1107, positioning component 1108, and power supply 1109.
The peripheral interface 1103 may be used to connect at least one peripheral associated with I/O (Input/Output) to the processor 1101 and the memory 1102.
The radio frequency circuit 1104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. Optionally, the radio frequency circuit 1104 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 1104 may communicate with other computer devices via at least one wireless communication protocol. In some embodiments, the radio frequency circuit 1104 may further include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 1105 is used to display a UI (User Interface). When the display screen 1105 is a touch display screen, the display screen 1105 also has the ability to capture touch signals on or over the surface of the display screen 1105.
Camera assembly 1106 is used to capture images or video. In some embodiments, camera assembly 1106 may also include a flash.
The audio circuitry 1107 may include a microphone and a speaker. In some embodiments, the audio circuitry 1107 may also include a headphone jack.
The positioning component 1108 is used to locate the current geographic location of the computer device 1100 for navigation or LBS (Location Based Service).
The power supply 1109 is used to provide power to the various components within the computer device 1100.
In some embodiments, the computer device 1100 also includes one or more sensors 1110. The one or more sensors 1110 include, but are not limited to: acceleration sensor 1111, gyro sensor 1112, pressure sensor 1113, fingerprint sensor 1114, optical sensor 1115, and proximity sensor 1116.
Those skilled in the art will appreciate that the configuration illustrated in FIG. 11 does not constitute a limitation of the computer device 1100, and may include more or fewer components than those illustrated, or may combine certain components, or may employ a different arrangement of components.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as a memory storing a computer program (instructions), is also provided; the instructions are executable by a processor of a computer device to perform all or part of the steps of the various method embodiments of the present application. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (15)

1. A video playback method, the method being performed by a computer device, the method comprising:
playing reading audio, wherein the reading audio is voice audio of a target article read by human voice;
acquiring a first keyword in the target article, wherein the first keyword is a keyword corresponding to the playing progress of the reading audio;
sending a video playing instruction to a video application according to the first keyword, wherein the video playing instruction is used for indicating the video application to play a video corresponding to the first keyword; the video application is a different application than the application playing the read audio.
2. The method of claim 1, wherein the obtaining the first keyword in the target article comprises:
acquiring a current playing timestamp of the reading audio;
and acquiring the first keyword corresponding to the current playing time stamp.
3. The method of claim 2, wherein the obtaining the first keyword corresponding to the current playing timestamp comprises:
acquiring a specified playing time period according to the current playing timestamp, wherein the specified playing time period is a time period after the current playing timestamp, separated from the current playing timestamp by a specified duration;
and acquiring, as the first keyword, a keyword of the target article whose reading time falls within the specified playing time period.
4. The method of claim 1, wherein the obtaining the first keyword in the target article comprises:
determining a target paragraph in the target article corresponding to the playing progress of the reading audio;
and acquiring the keywords contained in the target paragraph as the first keywords.
5. The method of claim 1, wherein the obtaining the first keyword in the target article comprises:
displaying a text interface, wherein the text interface comprises a text corresponding to the playing progress of the reading audio;
and when receiving a video playing triggering operation executed on the text displayed in the text interface, acquiring the first keyword corresponding to the video playing triggering operation.
6. The method of claim 1, wherein sending video playing instructions to a video application according to the first keyword comprises:
sending a first playing instruction containing the first keyword to the video application, wherein the first playing instruction is used for instructing the video application to search for a video corresponding to the first keyword and play the video;
alternatively,
searching videos corresponding to the first key words to obtain search results; and sending a second playing instruction containing the search result to the video application, wherein the second playing instruction is used for indicating the video application to play the video corresponding to the first keyword according to the search result.
7. The method of claim 6, wherein before the searching for a video corresponding to the first keyword to obtain a search result, the method further comprises:
acquiring a second keyword in the target article, wherein the second keyword is a keyword in the context of the paragraph where the first keyword is located;
the searching for the video corresponding to the first keyword to obtain a search result comprises:
searching by taking the first keyword as a search keyword to obtain at least one matched video;
and screening the at least one video according to the second keyword to obtain a video corresponding to the first keyword.
8. The method according to claim 6, wherein the first playing instruction further includes a second keyword, and the second keyword is a keyword in the context of the paragraph where the first keyword is located.
9. The method of claim 1, further comprising:
and pausing the playing of the reading audio in the process that the video application plays the video corresponding to the first keyword.
10. The method according to claim 9, wherein before the pausing of the playing of the reading audio during the playing of the video corresponding to the first keyword by the video application, the method further comprises:
acquiring a triggering mode of the first keyword, wherein the triggering mode comprises user operation triggering or automatic triggering;
and when the triggering mode is user operation triggering, executing the step of pausing the playing of the reading audio during the playing of the video corresponding to the first keyword by the video application.
11. The method according to any of claims 1 to 10, wherein the computer device is a terminal,
the video application is an application in the terminal;
alternatively,
the video application is an application installed in a terminal other than the terminal.
12. A video playback method, the method being performed by a computer device, the method comprising:
playing reading audio, wherein the reading audio is voice audio of a target article read by human voice;
displaying a text interface, wherein the text interface comprises a text corresponding to the playing progress of the reading audio;
when receiving a video playing triggering operation executed on a text displayed in the text interface, triggering a video application to play a video corresponding to a first keyword in the target article, wherein the first keyword is a keyword corresponding to the playing progress of the reading audio; the video application is a different application than the application playing the read audio;
and pausing the playing of the reading audio in the process that the video application plays the video corresponding to the first keyword.
13. A video playback apparatus, the apparatus being used in a computer device, the apparatus comprising:
the audio playing module is used for playing reading audio which is voice audio of a human voice reading target article;
the keyword acquisition module is used for acquiring a first keyword in the target article, wherein the first keyword is a keyword corresponding to the playing progress of the reading audio;
the instruction sending module is used for sending a video playing instruction to a video application according to the first keyword, wherein the video playing instruction is used for indicating the video application to play a video corresponding to the first keyword; the video application is a different application than the application playing the read audio.
14. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the video playback method as claimed in any one of claims 1 to 12.
15. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement a video playback method as claimed in any one of claims 1 to 12.
CN201910627928.5A 2019-07-12 2019-07-12 Video playing method and device, computer equipment and storage medium Active CN110337041B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910627928.5A CN110337041B (en) 2019-07-12 2019-07-12 Video playing method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910627928.5A CN110337041B (en) 2019-07-12 2019-07-12 Video playing method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110337041A CN110337041A (en) 2019-10-15
CN110337041B true CN110337041B (en) 2020-11-17

Family

ID=68146577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910627928.5A Active CN110337041B (en) 2019-07-12 2019-07-12 Video playing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110337041B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347273A (en) * 2020-11-05 2021-02-09 北京字节跳动网络技术有限公司 Audio playing method and device, electronic equipment and storage medium
CN113391866A (en) * 2021-06-15 2021-09-14 亿览在线网络技术(北京)有限公司 Interface display method
CN113987221A (en) * 2021-10-28 2022-01-28 腾讯科技(深圳)有限公司 Information search processing method, information search processing apparatus, information search processing device, storage medium, and program product

Citations (3)

Publication number Priority date Publication date Assignee Title
CN104683875A (en) * 2015-02-11 2015-06-03 华为技术有限公司 Method and device for presenting digital medium contents
CN105609096A (en) * 2015-12-30 2016-05-25 小米科技有限责任公司 Text data output method and device
CN106991469A (en) * 2017-04-17 2017-07-28 长安大学 A kind of music player and method for following paper book reading position

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
KR101702659B1 (en) * 2009-10-30 2017-02-06 삼성전자주식회사 Appratus and method for syncronizing moving picture contents and e-book contents and system thereof
KR101753588B1 (en) * 2011-05-25 2017-07-04 엘지전자 주식회사 Mobile terminal and method for controlling thereof
CN105893434A (en) * 2015-12-10 2016-08-24 乐视网信息技术(北京)股份有限公司 Lightspot prompt method and device based on search keyword
US20170169594A1 (en) * 2015-12-15 2017-06-15 Le Holdings (Beijing) Co., Ltd. Method and electronic device for implementing video recommendation
CN106331878A (en) * 2016-08-30 2017-01-11 北京奇艺世纪科技有限公司 Video clip and electronic book chip switching display method and apparatus
CN108520048A (en) * 2018-03-30 2018-09-11 掌阅科技股份有限公司 Activity description method for pushing based on e-book and electronic equipment
CN109828711A (en) * 2019-01-25 2019-05-31 努比亚技术有限公司 A kind of reading management method, mobile terminal and the storage medium of mobile terminal

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN104683875A (en) * 2015-02-11 2015-06-03 华为技术有限公司 Method and device for presenting digital medium contents
CN105609096A (en) * 2015-12-30 2016-05-25 小米科技有限责任公司 Text data output method and device
CN106991469A (en) * 2017-04-17 2017-07-28 长安大学 A kind of music player and method for following paper book reading position

Also Published As

Publication number Publication date
CN110337041A (en) 2019-10-15

Similar Documents

Publication Publication Date Title
CN109167950B (en) Video recording method, video playing method, device, equipment and storage medium
CN109729420B (en) Picture processing method and device, mobile terminal and computer readable storage medium
CN110337041B (en) Video playing method and device, computer equipment and storage medium
WO2019114514A1 (en) Method and apparatus for displaying pitch information in live broadcast room, and storage medium
CN107864410B (en) Multimedia data processing method and device, electronic equipment and storage medium
JP4621758B2 (en) Content information reproducing apparatus, content information reproducing system, and information processing apparatus
CN112653902B (en) Speaker recognition method and device and electronic equipment
CN107948702B (en) Synchronous method, device, terminal and the storage medium of Application Status
KR20220148915A (en) Audio processing methods, apparatus, readable media and electronic devices
CN112188267B (en) Video playing method, device and equipment and computer storage medium
CN112995759A (en) Interactive service processing method, system, device, equipment and storage medium
CN112969093B (en) Interactive service processing method, device, equipment and storage medium
CN112115282A (en) Question answering method, device, equipment and storage medium based on search
CN112188228A (en) Live broadcast method and device, computer readable storage medium and electronic equipment
CN103927205A (en) Video playing method based on triggered vibrating of scripts and implemented by handheld device
CN113852767B (en) Video editing method, device, equipment and medium
US11297225B2 (en) Video producing method, apparatus, storage medium, and electronic device
CN104866477B (en) Information processing method and electronic equipment
CN113301372A (en) Live broadcast method, device, terminal and storage medium
CN112511889A (en) Video playing method, device, terminal and storage medium
US20230067387A1 (en) Method for music generation, electronic device, storage medium cross reference to related applications
CN110324702A (en) Information-pushing method and device in video display process
KR102268750B1 (en) Method and apparatus for connecting user terminals as a group and providing a service including content associated with the group
CN113473224B (en) Video processing method, video processing device, electronic equipment and computer readable storage medium
CN115633223A (en) Video processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant