CN108174269B - Visual audio playing method and device - Google Patents


Info

Publication number
CN108174269B
CN108174269B (application CN201711453171.XA)
Authority
CN
China
Prior art keywords
video
picture
plot
playing time
audio
Prior art date
Legal status
Active
Application number
CN201711453171.XA
Other languages
Chinese (zh)
Other versions
CN108174269A (en)
Inventor
张磊
Current Assignee
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN201711453171.XA priority Critical patent/CN108174269B/en
Publication of CN108174269A publication Critical patent/CN108174269A/en
Application granted granted Critical
Publication of CN108174269B publication Critical patent/CN108174269B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • G10L15/26 Speech to text systems (G PHYSICS; G10L Speech analysis or synthesis, speech recognition; G10L15/00 Speech recognition)
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H04N21/44213 Monitoring of end-user related data
    • H04N21/4532 Management of client data or end-user data involving end-user characteristics, e.g. viewer profile, preferences
    • H04N21/4667 Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
    • H04N21/482 End-user interface for program selection

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The disclosure relates to a visual audio playing method and device. The method comprises the following steps: when the play mode for a video is a drama-listening mode in which video frames are not displayed, playing the audio corresponding to the video; determining, according to the current playing time of the audio, a first scenario picture matched with the current playing time; and displaying the first scenario picture. According to the visual audio playing method and device provided by the embodiments of the disclosure, when the play mode for a video is the drama-listening mode, the audio corresponding to the video is played and a first scenario picture matched with the audio's current playing time is displayed. In the drama-listening mode, the user can determine the specific content of the video through the played audio and the displayed first scenario picture, so that both the auditory and the visual needs of the user are met.

Description

Visual audio playing method and device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for playing visual audio.
Background
With the progress of science and technology, video playback is expected to meet the different requirements of different users. In the related art, when a user who is playing a video online through a mobile phone, tablet computer, or other terminal device so chooses, or is using mobile data, the audio corresponding to the video can be played instead. However, when only the audio corresponding to the video is played for the user, the user can merely listen to it, and the user's visual needs cannot be met.
Disclosure of Invention
In view of this, the present disclosure provides a visual audio playing method and device, so as to meet the user's visual needs while the audio corresponding to a video is played for the user.
According to a first aspect of the present disclosure, there is provided a visual audio playing method, including:
when the play mode for the video is a drama-listening mode in which video frames are not displayed, playing the audio corresponding to the video;
determining, according to the current playing time of the audio, a first scenario picture matched with the current playing time;
and displaying the first scenario picture.
For the above method, in a possible implementation manner, determining, according to the current playing time of the audio, a first scenario picture matched with the current playing time includes:
obtaining a scenario picture corresponding to the cached audio data and a playing time interval corresponding to the scenario picture;
and determining the first scenario picture according to the current playing time and the playing time interval corresponding to each scenario picture.
For the above method, in a possible implementation manner, determining, according to the current playing time of the audio, a first scenario picture matched with the current playing time includes:
identifying the content of the audio, and determining a video scenario corresponding to the current playing time;
and determining the first scenario picture according to the video scenario corresponding to the current playing time.
For the above method, in a possible implementation manner, determining the first scenario picture according to the video scenario corresponding to the current playing time further includes:
and acquiring a first scenario picture matched with the historical behavior of the user according to the video scenario and the user's historical behavior.
For the above method, in one possible implementation, the first scenario picture includes any one of:
the video system comprises a video play, video frames of the video and pictures generated according to the video play.
According to a second aspect of the present disclosure, there is provided a visual audio playing device, comprising:
the audio playing module is configured to play the audio corresponding to the video when the play mode for the video is a drama-listening mode in which video frames are not displayed;
the picture determining module is configured to determine, according to the current playing time of the audio, a first scenario picture matched with the current playing time;
and the picture display module is configured to display the first scenario picture.
For the apparatus, in a possible implementation manner, the picture determining module includes:
the obtaining submodule is configured to obtain scenario pictures corresponding to the cached audio data and the playing time interval corresponding to each scenario picture;
and the first determining submodule is configured to determine the first scenario picture according to the current playing time and the playing time interval corresponding to each scenario picture.
For the apparatus, in a possible implementation manner, the picture determining module includes:
the scenario determining submodule is configured to identify the content of the audio and determine the video scenario corresponding to the current playing time;
and the second determining submodule is configured to determine the first scenario picture according to the video scenario corresponding to the current playing time.
For the apparatus, in a possible implementation manner, the picture determining module further includes:
and the third determining submodule is configured to acquire a first scenario picture matched with the historical behavior of the user according to the video scenario and the user's historical behavior.
With regard to the apparatus described above, in one possible implementation, the first scenario picture includes any one of:
the video system comprises a video play, video frames of the video and pictures generated according to the video play.
According to a third aspect of the present disclosure, there is provided a visual audio playing device, comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to execute the visual audio playing method.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the above-mentioned visual audio playing method.
According to the visual audio playing method and device provided by the embodiments of the present disclosure, when the play mode for a video is the drama-listening mode in which video frames are not displayed, the audio corresponding to the video is played and a first scenario picture matched with the audio's current playing time is displayed. In the drama-listening mode, the user can determine the specific content of the video through the played audio and the displayed first scenario picture, so that both the auditory and the visual needs of the user are met.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a flow diagram of a visual audio playing method according to an embodiment of the present disclosure;
fig. 2 shows a flowchart of step S12 in a visual audio playing method according to an embodiment of the present disclosure;
fig. 3 shows a flowchart of step S12 in a visual audio playing method according to an embodiment of the present disclosure;
fig. 4 shows a flowchart of step S12 in a visual audio playing method according to an embodiment of the present disclosure;
fig. 5 shows a schematic diagram of an application scenario of a visual audio playing method according to an embodiment of the present disclosure;
fig. 6 shows a block diagram of a visual audio playback device according to an embodiment of the present disclosure;
fig. 7 shows a block diagram of a visual audio playback device according to an embodiment of the present disclosure;
fig. 8 shows a block diagram of a visual audio playback device according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Fig. 1 shows a flow chart of a visual audio playing method according to an embodiment of the present disclosure. The method may be applied to terminal devices such as smart phones and tablet computers. As shown in Fig. 1, the method may include steps S11 through S13.
In step S11, when the play mode for the video is the drama-listening mode in which video frames are not displayed, the audio corresponding to the video is played.
In the present embodiment, whether the play mode for the video is the drama-listening mode may be determined according to network conditions, such as the network speed and whether a mobile data network is used, as well as the user's operation. For example, a drama-listening-mode button may be displayed in the window in which the video is played, and when a click on the button by the user is detected, the play mode for the video is determined to be the drama-listening mode. Alternatively, when the network speed is determined to be too low to support video playback, or the user is playing the video over a mobile data network, the user may be reminded to play the video in the drama-listening mode, and the play mode for the video is determined to be the drama-listening mode when the user selects it. It should be understood that the specific manner of determining whether the play mode for the video is the drama-listening mode may be set by those skilled in the art according to actual needs, and the present disclosure is not limited in this respect.
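The mode decision described above can be sketched as follows. This is an illustrative sketch only: the threshold, field names, and function names are assumptions for the example, not part of the disclosure.

```python
# Illustrative sketch of the drama-listening-mode decision described above.
# The threshold and all names are assumptions for this example.
MIN_VIDEO_KBPS = 500  # assumed minimum network speed for smooth video playback

def should_suggest_listening_mode(network_kbps: float, on_mobile_data: bool) -> bool:
    """Suggest audio-only playback when the network is too slow for video
    or the user is on a metered mobile-data connection."""
    return network_kbps < MIN_VIDEO_KBPS or on_mobile_data

def resolve_play_mode(network_kbps: float, on_mobile_data: bool,
                      user_accepts_suggestion: bool, user_clicked_button: bool) -> str:
    """Return 'listening' if the user clicked the in-player button, or if the
    terminal suggested the drama-listening mode and the user accepted it."""
    if user_clicked_button:
        return "listening"
    if should_suggest_listening_mode(network_kbps, on_mobile_data) and user_accepts_suggestion:
        return "listening"
    return "video"
```

Both triggers from the paragraph above are covered: an explicit button click, and a network-condition prompt that the user confirms.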
In step S12, a first scenario picture matching the current playing time is determined according to the current playing time of the audio.
In this embodiment, the first scenario picture represents the video scenario of the video at the audio's current playing time, so that while listening to the audio the user can determine the content of the video more accurately from the displayed first scenario picture. The video scenario may include the video roles appearing at the current playing time, such as characters, animals, and animated figures, the events occurring between the roles, the locations where the roles are, and the like, which the present disclosure does not limit.
In this embodiment, the first scenario picture matched with the current playing time may be obtained according to the current playing time of the audio, the video scenario corresponding to that time, the cached audio data, and the like. For example, the first scenario picture may be obtained in real time during playback according to the determined current playing time or video scenario. Alternatively, one or more scenario pictures corresponding to the cached audio data, together with their playing time intervals, may be obtained directly from the server, and during playback the scenario picture whose playing time interval includes the current playing time is determined as the first scenario picture. The specific manner of determining the first scenario picture matched with the current playing time may be set by those skilled in the art according to actual needs, and the disclosure is not limited thereto.
In one possible implementation, the first scenario picture may be a picture related to the video scenario of the video, such as a still of the video, a video frame of the video, or a picture generated according to the video scenario.
In this implementation, the picture generated according to the video scenario may be obtained by processing a plurality of video frames corresponding to the video scenario, for example by splicing, cropping, or adding captions. For example, if 100 video frames correspond to a certain video scenario, at least one video frame that best expresses the scenario may be selected and processed according to the scenario to generate the picture corresponding to that video scenario.
In this implementation, the picture generated from the video scenario may also be a picture generated from the video scenario and the historical behavior of the user. The preference of the user can be determined according to the historical behavior of the user, and then the picture is generated according to the video scenario and the preference of the user. For example, if it is determined that the user M likes the actor Q based on the historical behavior of the user M, when it is determined that the actor Q is included in the video scenario, a picture corresponding to the video scenario and capable of highlighting the actor Q is generated. The manner of highlighting actor Q may include increasing the brightness of the area where actor Q is located, increasing the chroma of the area where actor Q is located, centering the area where actor Q is located in the picture, dynamically displaying the area where actor Q is located, and enlarging the area where actor Q is located in the picture, which is not limited by this disclosure. It should be understood that, the specific manner of generating the pictures according to the video scenario may be set by those skilled in the art according to actual needs, and the present disclosure is not limited thereto.
In step S13, the first scenario picture is displayed.
In this embodiment, the first scenario picture may be displayed while the current playing time belongs to the playing time interval corresponding to the first scenario picture, which is not limited in this disclosure.
According to the visual audio playing method provided by the embodiments of the present disclosure, when the play mode for a video is the drama-listening mode in which video frames are not displayed, the audio corresponding to the video is played and a first scenario picture matched with the audio's current playing time is displayed. In the drama-listening mode, the user can determine the specific content of the video through the played audio and the displayed first scenario picture, so that both the auditory and the visual needs of the user are met.
Fig. 2 shows a flowchart of step S12 in a visual audio playing method according to an embodiment of the present disclosure.
In one possible implementation, as shown in fig. 2, step S12 may include step S121 and step S122.
In step S121, a scenario picture corresponding to the buffered audio data and a playing time interval corresponding to the scenario picture are acquired.
In this implementation, one or more scenario pictures may be obtained according to the cached audio data. Each scenario picture corresponds to a playing time interval and to the video scenario of the video within that interval, and the playing time intervals of all the obtained scenario pictures together may cover the playing duration of the cached audio data. For example, suppose audio W has a playing duration of 3 minutes and a size of 9 MB, and all its scenario pictures and corresponding playing time intervals are: scenario picture 1, 00:00–01:00; scenario picture 2, 01:00–02:00; scenario picture 3, 02:00–03:00. If 6 MB of audio W is cached, corresponding to a playing duration of 2 minutes, the scenario pictures and playing time intervals obtained for the cached audio data are: scenario picture 1, 00:00–01:00; and scenario picture 2, 01:00–02:00.
In this implementation, besides the scenario pictures and their playing time intervals, information representing the content of each scenario picture, such as its corresponding video scenario, may also be obtained according to the cached audio data, which the present disclosure does not limit.
In step S122, a first scenario picture is determined according to the current playing time and the playing time interval corresponding to the scenario picture.
In this implementation, the scenario picture whose playing time interval includes the current playing time is determined as the first scenario picture.
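Steps S121–S122 amount to an interval lookup. A minimal sketch, assuming the scenario pictures arrive as (start, end, id) triples — a data layout the disclosure does not specify:

```python
def pick_scenario_picture(pictures, current_time_s):
    """pictures: list of (start_s, end_s, picture_id) triples, one per cached
    scenario picture. Returns the id of the picture whose half-open interval
    [start_s, end_s) contains current_time_s, or None when nothing is cached
    for that playing time."""
    for start_s, end_s, picture_id in pictures:
        if start_s <= current_time_s < end_s:
            return picture_id
    return None
```

With the audio W example above, `pick_scenario_picture([(0, 60, 1), (60, 120, 2)], 90)` yields scenario picture 2, and a playing time past the cached data yields None.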
Fig. 3 shows a flowchart of step S12 in a visual audio playing method according to an embodiment of the present disclosure.
In one possible implementation, as shown in fig. 3, step S12 may include step S123 and step S124.
In step S123, the content of the audio is identified, and the video scenario corresponding to the current playing time is determined.
In this implementation, the video scenario may include the video roles appearing at the current playing time, such as characters, animals, and animated figures, the events occurring between the roles, the locations where the roles are, and the like, which the present disclosure does not limit. For example, the video scenario determined for the current playing time may be: character 1 and character 2 are quarreling in an xx convenience store. During audio playback, the played audio is recognized, for example by audio recognition, to determine its content, and the video scenario corresponding to the current playing time is determined according to the determined content. The specific manner of identifying the content of the audio may be set by those skilled in the art according to actual requirements, and the present disclosure does not limit this.
In step S124, a first scenario picture is determined according to the video scenario corresponding to the current playing time.
In this implementation, a scenario picture whose video scenario is the same as the one corresponding to the current playing time is obtained from the corresponding server and determined as the first scenario picture. When exactly one such scenario picture is obtained, it is determined as the first scenario picture. When several such scenario pictures are obtained, the one whose playing time interval includes the current playing time may further be determined, according to the current playing time, as the first scenario picture. For example, if the video scenario at the current playing time is "character 3 feeds a pigeon in xx square", and the obtained scenario pictures for this scenario are scenario picture 9 and scenario picture 10, scenario picture 9 is determined as the first scenario picture because its playing time interval includes the current playing time.
In this way, the first scenario picture is obtained during audio playback, which saves traffic and ensures the playback quality of the audio when the network speed is slow; and when the user plays the audio over mobile data, the user's mobile data traffic, and thus the user's traffic cost, is reduced.
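The matching logic of steps S123–S124 can be sketched as below. The recognition step is reduced to a plain string key for the recognized scenario, since the disclosure leaves the actual recognition technique open; the dict layout is likewise an assumption for the example.

```python
def select_by_scenario(candidates, scenario, current_time_s):
    """candidates: list of dicts with keys 'scenario', 'start_s', 'end_s', 'id'.
    Returns the single picture matching the recognized scenario, or, when
    several match, the one whose playing time interval covers the current
    playing time; None if nothing matches."""
    matches = [c for c in candidates if c["scenario"] == scenario]
    if len(matches) == 1:
        return matches[0]["id"]
    for c in matches:  # several candidates: fall back to the playing time interval
        if c["start_s"] <= current_time_s < c["end_s"]:
            return c["id"]
    return None
```

In the pigeon-feeding example, scenario pictures 9 and 10 share the same video scenario, and picture 9 is selected because its playing time interval contains the current playing time.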
Fig. 4 shows a flowchart of step S12 in a visual audio playing method according to an embodiment of the present disclosure.
In one possible implementation, as shown in fig. 4, step S12 may include step S125 in addition to step S123 described above.
In step S125, a first scenario picture matching the historical behavior of the user is obtained according to the video scenario and the historical behavior of the user.
In this implementation, scenario pictures whose video scenario is the same as the one corresponding to the current playing time are obtained according to that video scenario. When exactly one such scenario picture is obtained, it is determined as the first scenario picture. When several such scenario pictures are obtained, the user's preferences may be determined according to the user's historical behavior, and the scenario picture among them with the highest similarity to those preferences is determined as the first scenario picture.
In this implementation, after the first scenario picture is determined, it may be further processed according to the user's historical behavior, highlighting the area of the first scenario picture corresponding to content matching the user's preferences. For example, if it is determined from the user's historical behavior that the user likes actor O, the image of actor O may be highlighted. The manner of highlighting may include brightening, magnifying, and the like, which the present disclosure does not limit.
In this way, the first scenario picture displayed to the user better matches the user's preferences and meets the user's needs.
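The preference match of step S125 can be sketched with a simple tag-overlap similarity. The disclosure does not fix a similarity measure, so representing pictures and preferences as tag sets is purely an assumption for illustration.

```python
def pick_by_preference(pictures, preferred_tags):
    """pictures: list of (picture_id, tags) pairs whose video scenario already
    matches the current playing time. Returns the id whose tag set overlaps
    the user's preferred tags (derived from historical behavior) the most."""
    preferred = set(preferred_tags)
    return max(pictures, key=lambda p: len(set(p[1]) & preferred))[0]
```

For instance, if the user's history shows a preference for a particular actor, a candidate picture tagged with that actor wins over one that is not.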
It should be noted that although the above embodiments are described as examples of the visual audio playing method, those skilled in the art will understand that the disclosure is not limited thereto. Each step may be set flexibly according to personal preference and/or the actual application scenario, as long as it conforms to the technical solution of the disclosure.
Application example
An application example according to an embodiment of the present disclosure is given below, using "the user plays the audio corresponding to video A through a mobile phone" as an exemplary application scenario, to facilitate understanding of the flow of the visual audio playing method. Those skilled in the art will understand that the following application example is provided only to facilitate understanding of the embodiments of the present disclosure and should not be construed as limiting them.
Fig. 5 shows a schematic diagram of an application scenario of a visual audio playing method according to an embodiment of the present disclosure. As shown in fig. 5, the duration of video A is 30 minutes, and the duration of the corresponding audio A' is also 30 minutes. When it is determined, according to the user's selection, that the playing mode for video A is the drama listening mode in which no video picture is played, the audio A' corresponding to video A is played. According to the current playing time of audio A', a first scenario picture matching the current playing time is acquired from the server and displayed to the user.
In this application example, when audio A' starts playing, the first scenario picture 11 corresponding to the current playing time 00:00 is acquired and displayed from 00:00. When the current playing time reaches 05:00, the first scenario picture 12 corresponding to that time is acquired and displayed from 05:00. When the current playing time reaches 11:00, the first scenario picture 13 is acquired and displayed from 11:00, and so on. In this way, the first scenario picture 11, the first scenario picture 12, the first scenario picture 13, the first scenario picture 14, the first scenario picture 15, and the first scenario picture 16 are displayed in sequence at different playing times within the 30 minutes of audio A'. Each of these first scenario pictures represents the video scenario corresponding to audio A' in the time interval during which it is displayed.
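The time-to-picture mapping in this example can be sketched as an interval lookup over the playing time. The interval starts mirror the example times above (00:00, 05:00, 11:00); the picture names and the use of seconds as the time unit are illustrative assumptions.

```python
# Sketch of the interval lookup: each first scenario picture covers a
# playing-time interval of audio A'; given the current playing time,
# find the picture whose interval contains it.
import bisect

starts   = [0, 300, 660]                  # 00:00, 05:00, 11:00 in seconds
pictures = ["picture_11", "picture_12", "picture_13"]

def picture_for_time(seconds):
    i = bisect.bisect_right(starts, seconds) - 1   # last interval that has started
    return pictures[i]

print(picture_for_time(0))      # → picture_11
print(picture_for_time(400))    # → picture_12  (05:00 <= t < 11:00)
print(picture_for_time(700))    # → picture_13
```

The binary search keeps the lookup cheap even when an episode has many scenario pictures, which matches the "display picture N from its start time until the next start time" behavior described above.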
Therefore, when the user listens to audio A' corresponding to video A in the drama listening mode, the user can determine the video content corresponding to the current playing time of audio A' from the displayed first scenario picture, meeting the user's needs in both the auditory and visual aspects.
Fig. 6 shows a block diagram of a visual audio playback device according to an embodiment of the present disclosure. As shown in fig. 6, the device may include an audio playing module 61, a picture determination module 62, and a picture display module 63. The audio playing module 61 is configured to play the audio corresponding to the video when the playing mode for the video is a drama listening mode in which no video picture is played. The picture determination module 62 is configured to determine, according to the current playing time of the audio, a first scenario picture matching the current playing time. The picture display module 63 is configured to display the first scenario picture.
Fig. 7 shows a block diagram of a visual audio playback device according to an embodiment of the present disclosure.
In one possible implementation, as shown in fig. 7, the picture determining module 62 may include an obtaining sub-module 621 and a first determining sub-module 622. The obtaining sub-module 621 is configured to obtain a scenario picture corresponding to the buffered audio data and a playing time interval corresponding to the scenario picture. The first determining sub-module 622 is configured to determine the first scenario picture according to the current playing time and the playing time interval corresponding to the scenario picture.
In one possible implementation, as shown in fig. 7, the picture determination module 62 may include a scenario determination sub-module 623 and a second determination sub-module 624. The scenario determination sub-module 623 is configured to identify the content of the audio, and determine a video scenario corresponding to the current playing time. The second determining sub-module 624 is configured to determine the first scenario picture according to the video scenario corresponding to the current playing time.
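One way to sketch the scenario determination sub-module is keyword matching over a transcript of the audio near the current playing time. This is a simplification under stated assumptions: a real embodiment would use actual content recognition, and the keyword table, scenario labels, and function name here are all hypothetical.

```python
# Hypothetical sketch: map recognized words from the audio to a video
# scenario label. Speech recognition is assumed to have already produced
# a transcript; the keyword table is an illustrative assumption.
SCENARIO_KEYWORDS = {
    "chase":   {"run", "hurry", "catch"},
    "reunion": {"missed", "home", "family"},
}

def scenario_from_transcript(transcript):
    words = {w.strip(".,!?") for w in transcript.lower().split()}
    best = max(SCENARIO_KEYWORDS.items(),
               key=lambda kv: len(words & kv[1]))     # most keyword overlap
    return best[0] if words & best[1] else "unknown"  # no overlap at all

print(scenario_from_transcript("Run, we have to catch the train"))  # → chase
```

The resulting scenario label would then be handed to the second determining sub-module 624 to select the matching first scenario picture.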
In one possible implementation, as shown in fig. 7, the picture determination module 62 may further include a third determination sub-module 625. The third determining sub-module 625 is configured to obtain a first plot picture matching the historical behavior of the user according to the video plot and the historical behavior of the user.
In one possible implementation, the first scenario picture may include any one of: a movie of the video, video frames of the video, and pictures generated from the video scenario.
The visual audio playback device provided by the embodiments of the present disclosure plays the audio corresponding to a video when the playing mode for the video is the drama listening mode in which no video picture is played, and displays a first scenario picture matching the current playing time of the audio. In the drama listening mode, the user can determine the specific content of the video from the played audio and the displayed first scenario picture, meeting the user's auditory and visual needs.
Fig. 8 shows a block diagram of a visual audio playback device according to an embodiment of the present disclosure. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 8, the apparatus 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 may include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 806 provides power to the various components of the device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each of the front camera and the rear camera may be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed status of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor assembly 814 may also detect a change in the position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in the temperature of the device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the device 800 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), can be personalized by utilizing state information of the computer-readable program instructions, and this electronic circuitry may execute the computer-readable program instructions to implement various aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical application, or technical improvements over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (12)

1. A visual audio playing method, comprising:
when the playing mode for the video is a drama listening mode in which no video picture is played, playing audio corresponding to the video;
determining, according to a current playing time of the audio, a first plot picture matching the current playing time, wherein the first plot picture is a picture corresponding to the video plot of the video at the current playing time;
displaying the first plot picture;
wherein the method further comprises:
before the first plot picture is displayed, processing the first plot picture according to historical behavior of the user, so as to highlight an area of the first plot picture corresponding to content matching the user's preference.
2. The method of claim 1, wherein determining a first plot picture matching a current playing time of the audio according to the current playing time comprises:
obtaining a scenario picture corresponding to the cached audio data and a playing time interval corresponding to the scenario picture;
and determining the first plot picture according to the current playing time and the playing time interval corresponding to the plot picture.
3. The method of claim 1, wherein determining a first plot picture matching a current playing time of the audio according to the current playing time comprises:
identifying the content of the audio, and determining a video scenario corresponding to the current playing time;
and determining the first plot picture according to the video plot corresponding to the current playing time.
4. The method according to claim 3, wherein determining the first scenario picture according to the video scenario corresponding to the current playing time further comprises:
and acquiring a first plot picture matched with the historical behavior of the user according to the video plot and the historical behavior of the user.
5. The method of claim 1, wherein the first storyline picture comprises any of:
the video system comprises a video play, video frames of the video and pictures generated according to the video play.
6. A visual audio playback device, comprising:
an audio playing module, configured to play audio corresponding to the video when the playing mode for the video is a drama listening mode in which no video picture is played;
a picture determination module, configured to determine, according to a current playing time of the audio, a first plot picture matching the current playing time, wherein the first plot picture is a picture corresponding to the video plot of the video at the current playing time;
a picture display module, configured to display the first plot picture;
wherein the picture determination module is further configured to: before the first plot picture is displayed, process the first plot picture according to historical behavior of the user, so as to highlight an area of the first plot picture corresponding to content matching the user's preference.
7. The apparatus of claim 6, wherein the picture determination module comprises:
the obtaining sub-module is used for obtaining a scenario picture corresponding to the cached audio data and a playing time interval corresponding to the scenario picture;
and the first determining submodule determines the first plot picture according to the current playing time and the playing time interval corresponding to the plot picture.
8. The apparatus of claim 6, wherein the picture determination module comprises:
the plot determining submodule is used for identifying the content of the audio and determining a video plot corresponding to the current playing time;
and the second determining submodule determines the first plot picture according to the video plot corresponding to the current playing time.
9. The apparatus of claim 8, wherein the picture determination module further comprises:
and the third determining submodule acquires a first plot picture matched with the historical behavior of the user according to the video plot and the historical behavior of the user.
10. The apparatus of claim 6, wherein the first storyline picture comprises any of:
the video system comprises a video play, video frames of the video and pictures generated according to the video play.
11. A visual audio playback device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of any one of claims 1 to 5.
12. A non-transitory computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method of any of claims 1 to 5.
CN201711453171.XA 2017-12-28 2017-12-28 Visual audio playing method and device Active CN108174269B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711453171.XA CN108174269B (en) 2017-12-28 2017-12-28 Visual audio playing method and device


Publications (2)

Publication Number Publication Date
CN108174269A CN108174269A (en) 2018-06-15
CN108174269B true CN108174269B (en) 2021-02-26

Family

ID=62518862


Country Status (1)

Country Link
CN (1) CN108174269B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109905751B (en) * 2019-03-01 2020-12-15 四川长虹电器股份有限公司 Control method and system of intelligent picture frame
CN110390927B (en) * 2019-06-28 2021-11-23 北京奇艺世纪科技有限公司 Audio processing method and device, electronic equipment and computer readable storage medium
CN113316012B (en) * 2021-05-26 2022-03-11 深圳市沃特沃德信息有限公司 Audio and video frame synchronization method and device based on ink screen equipment and computer equipment
CN113391866A (en) * 2021-06-15 2021-09-14 亿览在线网络技术(北京)有限公司 Interface display method

Citations (7)

Publication number Priority date Publication date Assignee Title
CN102055845A (en) * 2010-11-30 2011-05-11 深圳市五巨科技有限公司 Mobile communication terminal and picture switching method of music player thereof
CN103024490A (en) * 2012-12-26 2013-04-03 北京奇艺世纪科技有限公司 Method and device supporting independent playing of audio and video
CN105979355A (en) * 2015-12-10 2016-09-28 乐视网信息技术(北京)股份有限公司 Method and device for playing video
CN106576151A (en) * 2014-10-16 2017-04-19 三星电子株式会社 Video processing apparatus and method
CN107147928A (en) * 2017-05-23 2017-09-08 努比亚技术有限公司 Play method, terminal and the computer-readable recording medium of video
CN107197362A (en) * 2016-03-15 2017-09-22 广州市动景计算机科技有限公司 A kind of method and device for playing multimedia messages
CN107221347A (en) * 2017-05-23 2017-09-29 维沃移动通信有限公司 Method and terminal that a kind of audio is played

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
KR101994291B1 (en) * 2014-10-14 2019-06-28 한화테크윈 주식회사 Method and Apparatus for providing combined-summary in an imaging apparatus




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (ref country code: HK; ref legal event code: DE; ref document number: 1253341)

TA01 Transfer of patent application right

Effective date of registration: 20200509

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: 100080 Beijing Haidian District city Haidian street A Sinosteel International Plaza No. 8 block 5 layer A, C

Applicant before: Youku network technology (Beijing) Co., Ltd

GR01 Patent grant