CN109634700A

CN109634700A - Method for displaying text content of audio, and terminal device

Info

Publication number: CN109634700A
Application number: CN201811419075.8A
Authority: CN
Inventors: 许午
Original assignee: Vivo Mobile Communication Co Ltd
Current assignee: Vivo Mobile Communication Co Ltd
Priority date: 2018-11-26
Filing date: 2018-11-26
Publication date: 2019-04-16

Abstract

Embodiments of the present invention provide a method for displaying the text content of audio, and a terminal device, which relate to the field of communication technology and address the problem that existing terminal devices locate audio to be played inefficiently. The method comprises: recognizing voice data of a target audio file to obtain target text content; and displaying the target text content on an audio playback interface, the audio playback interface being used to play the target audio file; wherein, when a user triggers the terminal device, through a first input, to locate first voice data in the target audio file, the text content of the first voice data is highlighted. The method can be applied to scenarios in which a terminal device is used to play audio files.

Description

Method for displaying text content of audio, and terminal device

Technical field

Embodiments of the present invention relate to the field of communication technology, and in particular to a method for displaying the text content of audio and a terminal device.

Background art

As terminal devices are used ever more widely, users can conveniently use them to play various kinds of audio (such as songs or recordings).

Take playing a recording on a terminal device as an example. Currently, a user can play a recording through an audio playback application on the terminal device and can control the playback progress of the recording. For example, the user can drag the playback progress bar to the position corresponding to a target voice, thereby triggering the terminal device to play that target voice through the audio playback application.

However, when the user controls the playback progress of the recording by dragging the playback progress bar as described above, the user may be unable to drag the progress bar accurately to the position corresponding to the target voice in a single attempt. For example, the user may drag the progress bar to somewhere near that position (before or after it) and then adjust the position of the progress bar while listening. The user may thus have to drag the progress bar several times before it reaches the position corresponding to the target voice, so the terminal device locates the audio to be played inefficiently.

Summary of the invention

Embodiments of the present invention provide a method for displaying the text content of audio, and a terminal device, to solve the problem that existing terminal devices locate audio to be played inefficiently.

In order to solve the above-mentioned technical problem, the present invention is implemented as follows:

In a first aspect, an embodiment of the present invention provides a method for displaying the text content of audio, applied to a terminal device. The method comprises: recognizing voice data of a target audio file to obtain target text content; and displaying the target text content on an audio playback interface, the audio playback interface being used to play the target audio file; wherein, when a user triggers the terminal device, through a first input, to locate first voice data in the target audio file, the text content of the first voice data is highlighted.

In a second aspect, an embodiment of the present invention provides a terminal device comprising a recognition module and a display module. The recognition module is configured to recognize voice data of a target audio file to obtain target text content. The display module is configured to display, on an audio playback interface, the target text content obtained by the recognition module, the audio playback interface being used to play the target audio file; wherein, when a user triggers the terminal device, through a first input, to locate first voice data in the target audio file, the text content of the first voice data is highlighted.

In a third aspect, an embodiment of the present invention provides a terminal device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the method for displaying the text content of audio in the first aspect.

In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for displaying the text content of audio in the first aspect.

In the embodiments of the present invention, the voice data of a target audio file can be recognized to obtain target text content, and the target text content can be displayed on an audio playback interface (the audio playback interface being used to play the target audio file); wherein, when a user triggers the terminal device, through a first input, to locate first voice data in the target audio file, the text content of the first voice data is highlighted. Taking a recording file as the target audio file, the embodiments of the present invention can obtain and display the text content of the recording file, so that the user can see that text content directly while listening to the recording. Moreover, when the user triggers the terminal device, through the first input, to locate the first voice data in the target audio file, the terminal device can highlight the text content of the first voice data. The user can therefore refer to the highlighted text content to quickly locate the voice segment he or she wishes to hear, which improves the efficiency with which the terminal device locates audio to be played.

Brief description of the drawings

Fig. 1 is a schematic architecture diagram of a possible Android operating system according to an embodiment of the present invention;

Fig. 2 is a first schematic diagram of the method for displaying the text content of audio according to an embodiment of the present invention;

Fig. 3 is a first schematic diagram of an interface to which the method for displaying the text content of audio according to an embodiment of the present invention is applied;

Fig. 4 is a second schematic diagram of an interface to which the method for displaying the text content of audio according to an embodiment of the present invention is applied;

Fig. 5 is a second schematic diagram of the method for displaying the text content of audio according to an embodiment of the present invention;

Fig. 6 is a third schematic diagram of the method for displaying the text content of audio according to an embodiment of the present invention;

Fig. 7 is a third schematic diagram of an interface to which the method for displaying the text content of audio according to an embodiment of the present invention is applied;

Fig. 8 is a fourth schematic diagram of an interface to which the method for displaying the text content of audio according to an embodiment of the present invention is applied;

Fig. 9 is a fourth schematic diagram of the method for displaying the text content of audio according to an embodiment of the present invention;

Fig. 10 is a fifth schematic diagram of an interface to which the method for displaying the text content of audio according to an embodiment of the present invention is applied;

Fig. 11 is a sixth schematic diagram of an interface to which the method for displaying the text content of audio according to an embodiment of the present invention is applied;

Fig. 12 is a first schematic structural diagram of a terminal device according to an embodiment of the present invention;

Fig. 13 is a second schematic structural diagram of a terminal device according to an embodiment of the present invention;

Fig. 14 is a third schematic structural diagram of a terminal device according to an embodiment of the present invention;

Fig. 15 is a fourth schematic structural diagram of a terminal device according to an embodiment of the present invention;

Fig. 16 is a schematic hardware diagram of a terminal device according to an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

The term "and/or" herein describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. The symbol "/" herein indicates an "or" relationship between the associated objects; for example, A/B means A or B.

The terms "first", "second", and the like in the description and claims of the present invention are used to distinguish between different objects, not to describe a particular order of the objects. For example, the first input and the second input are used to distinguish between different inputs, not to describe a particular order of inputs.

In the embodiments of the present invention, words such as "illustratively" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "illustrative" or "for example" in the embodiments of the present invention should not be construed as preferred over, or more advantageous than, other embodiments or designs. Rather, such words are intended to present a related concept in a concrete manner.

In the description of the embodiments of the present invention, unless otherwise specified, "a plurality of" means two or more; for example, a plurality of processing units means two or more processing units.

Embodiments of the present invention provide a method for displaying the text content of audio, and a terminal device. The voice data of a target audio file can be recognized to obtain target text content, and the target text content can be displayed on an audio playback interface (the audio playback interface being used to play the target audio file); wherein, when a user triggers the terminal device, through a first input, to locate first voice data in the target audio file, the text content of the first voice data is highlighted. Taking a recording file as the target audio file, the embodiments of the present invention can obtain and display the text content of the recording file, so that the user can see that text content directly while listening to the recording. Moreover, when the user triggers the terminal device, through the first input, to locate the first voice data in the target audio file, the terminal device can highlight the text content of the first voice data. The user can therefore refer to the highlighted text content to quickly locate the voice segment he or she wishes to hear, which improves the efficiency with which the terminal device locates audio to be played.

The terminal device in the embodiments of the present invention may be a terminal device with an operating system. The operating system may be the Android operating system, the iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present invention.

Taking the Android operating system as an example, the following describes the software environment to which the method for displaying the text content of audio according to the embodiments of the present invention is applied.

Fig. 1 is a schematic architecture diagram of a possible Android operating system according to an embodiment of the present invention. In Fig. 1, the architecture of the Android operating system comprises four layers: an application layer, an application framework layer, a system runtime library layer, and a kernel layer (which may specifically be a Linux kernel layer).

The application layer includes the applications in the Android operating system (including system applications and third-party applications).

The application framework layer is the framework of applications. Developers can develop applications based on the application framework layer, provided that they follow the development principles of the application framework.

The system runtime library layer includes libraries (also called system libraries) and the Android runtime environment. The libraries mainly provide the various resources that the Android operating system needs. The Android runtime environment provides the software environment for the Android operating system.

The kernel layer is the operating system layer of the Android operating system and is the lowest layer of the Android software stack. Based on the Linux kernel, the kernel layer provides core system services and hardware-related drivers for the Android operating system.

Taking the Android operating system as an example, in the embodiments of the present invention a developer can, based on the system architecture of the Android operating system shown in Fig. 1, develop a software program that implements the method for displaying the text content of audio according to the embodiments of the present invention, so that the method can run on the Android operating system shown in Fig. 1. That is, a processor or a terminal device can implement the method by running the software program in the Android operating system.

The terminal device in the embodiments of the present invention may be a mobile terminal or a non-mobile terminal. Illustratively, the mobile terminal may be a mobile phone, a tablet computer, a laptop computer, a palmtop computer, a vehicle-mounted terminal, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), or the like; the non-mobile terminal may be a personal computer (PC), a television (TV), an automatic teller machine, a self-service machine, or the like. The embodiments of the present invention are not specifically limited in this respect.

The method for displaying the text content of audio according to the embodiments of the present invention may be executed by the above terminal device, or by a functional module and/or functional entity in the terminal device that is capable of implementing the method. This can be determined according to actual usage requirements and is not limited in the embodiments of the present invention. The method is described below by way of example with the terminal device as the executing entity.

As shown in Fig. 2, an embodiment of the present invention provides a method for displaying the text content of audio, which may include the following steps S200 and S201.

S200: The terminal device recognizes the voice data of a target audio file to obtain target text content.

In the embodiments of the present invention, the target audio file may be an audio file that does not include text content (such as subtitles). Illustratively, the target audio file may be a recording file, such as a lecture recording, an interview recording, or a song recording, or an audio file of any other possible form. This can be determined according to actual usage requirements and is not limited in the embodiments of the present invention.

In the embodiments of the present invention, after obtaining the target audio file, the terminal device can use a conventional speech recognition technique to recognize the voice data of the target audio file and, from that voice data, obtain the target text content of the target audio file.

It should be noted that, for a description of conventional speech recognition techniques, reference may be made to the related descriptions of speech recognition in the prior art; details are not repeated here.

S201: The terminal device displays the target text content on an audio playback interface; wherein, when a user triggers the terminal device, through a first input, to locate first voice data in the target audio file, the text content of the first voice data is highlighted.

The audio playback interface can be used to play the target audio file.

In the embodiments of the present invention, after recognizing the voice data of the target audio file and obtaining its target text content, the terminal device can display the target text content on the audio playback interface.

Optionally, the user can perform an input on the terminal device (for example, an input on the target audio file) to trigger the terminal device to display the audio playback control information of the target audio file on the audio playback interface. The terminal device can then display that audio playback control information on the audio playback interface, together with the target text content of the target audio file, so that the user can view the target text content of the target audio file.

Alternatively, while the terminal device displays the target audio file on the audio playback interface, the user can perform an input on the terminal device (for example, an input on the "Play" control on the audio playback interface) to trigger the terminal device to play the target audio file. The terminal device can then play the target audio file and display its target text content on the audio playback interface, so that the user can view the target text content while listening to the voice of the target audio file.

Illustratively, as shown in Fig. 3, the terminal device can display the audio playback control information of audio file 1 (i.e., the target audio file) in the audio playback area 31 of the audio playback interface 30, and display the target text content of the target audio file in the audio text area 32 of the audio playback interface 30. The audio playback control information in the audio playback area 31 may include a playback progress bar 33 (the rectangular bar 33 shown in Fig. 3), a "Play" control, a "Pause" control, a "Stop" control, a "Save" control, and so on.

In the prior art, because an audio file does not include text content, a user listening to the audio file cannot view its text content. By contrast, the embodiments of the present invention can obtain and display the text content of the audio file, so that the user can view it at any time according to actual usage requirements, which improves the user's experience of using the terminal device.

Further, in the embodiments of the present invention, the user can trigger the terminal device, through the first input, to locate the first voice data in the target audio file. Optionally, the first input may be the user's input on the playback progress bar on the audio playback interface, or the user's input on the text content displayed on the audio playback interface. This can be determined according to actual usage requirements and is not limited in the embodiments of the present invention.

Optionally, in the embodiments of the present invention, the user's first input may be a click input (such as a single-click or double-click input), a drag input, or an input of any other possible form. This can be determined according to actual usage requirements and is not limited in the embodiments of the present invention.

Illustratively, taking a drag input as the first input, the terminal device can, in response to the user's drag input on the playback progress bar of the target audio file, locate the first voice data in the target audio file. As shown in Fig. 3, assume that the terminal device is playing audio file 1 on the audio playback interface. The user can then drag the playback progress bar 33 in the audio playback area 31 of the audio playback interface 30 from position 1 to position 2, triggering the terminal device to locate the voice data in audio file 1 that corresponds to position 2 of the playback progress bar 33 (i.e., the first voice data in the target audio file). In response to the user's input, the terminal device can locate (for example, play) the voice data corresponding to position 2 of the playback progress bar 33.

Illustratively, taking a click input as the first input, the terminal device can, in response to the user's click input on the text content of the target audio file, locate the first voice data in the target audio file. As shown in Fig. 3, assume that the terminal device is playing audio file 1 on the audio playback interface. The user can then click any piece of text content (for example, "intelligent behavior") among the text content displayed in the audio text area 32 of the audio playback interface 30, triggering the terminal device to locate the voice data corresponding to the text content "intelligent behavior" (i.e., the first voice data in the target audio file). In response to the user's input, the terminal device can locate (for example, play) the voice segment corresponding to the text content "intelligent behavior".
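By way of illustration only, the click case can be sketched as a lookup from the clicked piece of text to the start of its voice data on the time axis. The names `TextSegment` and `seek_position_for_click` and the millisecond timings are assumptions made for this sketch; the embodiments do not prescribe a particular implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSegment:
    text: str       # recognized text shown in the audio text area
    start_ms: int   # where the matching voice data begins on the time axis


def seek_position_for_click(segments, clicked_index):
    """Return the playback position (ms) of the voice data for the clicked text."""
    if not 0 <= clicked_index < len(segments):
        raise IndexError("clicked text is not part of the displayed text content")
    return segments[clicked_index].start_ms


# Hypothetical timings; only the two phrases come from the Fig. 3 and Fig. 4 examples.
segments = [TextSegment("intelligent behavior", 0), TextSegment("reasoning", 1800)]
target_ms = seek_position_for_click(segments, 1)  # the user clicks "reasoning"
```

A player would then seek to `target_ms` and highlight the clicked segment.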

Further, in the embodiments of the present invention, when the user triggers the terminal device, through the first input, to locate the first voice data in the target audio file, the terminal device can highlight the text content of the first voice data. This makes it easier for the user to refer to the highlighted text content and quickly locate the voice segment he or she wishes to hear.

Optionally, the highlighting may take the form of a high-brightness display, a dynamic display, or any other possible form of highlighting. This can be determined according to actual usage requirements and is not limited in the embodiments of the present invention.

Illustratively, assume that the terminal device displays audio file 1 on the audio playback interface 30, and that the voice data corresponding to position 2 of the playback progress bar 33 on the audio playback interface 30 corresponds to the text content "reasoning". As shown in Fig. 4, if the user drags the playback progress bar 33 from position 1 to position 2, triggering the terminal device to locate the voice data in audio file 1 that corresponds to position 2 of the playback progress bar, the terminal device can, in response to the user's input, locate (for example, play) the voice data corresponding to position 2 of the playback progress bar 33 and highlight (for example, by enlarging and bolding the font) the text content "reasoning" of that voice data. The user can thus refer to the highlighted text content to quickly locate the voice segment he or she wishes to hear.

As a further illustration, with reference to Fig. 4, if the user clicks the text content "reasoning" among the text content displayed on the audio playback interface 30, triggering the terminal device to locate the voice data in audio file 1 that corresponds to the text content "reasoning", the terminal device can, in response to the user's input, locate (for example, play) the voice data corresponding to the text content "reasoning" (and may display a cursor at position 2 of the playback progress bar 33) and highlight (for example, by enlarging and bolding the font) the text content "reasoning". The user can thus refer to the highlighted text content to quickly locate the voice segment he or she wishes to hear.

In the prior art, because an audio file does not include text content, a user listening to the audio file cannot view its text content. If the user wishes to hear a particular voice segment in the audio file, the user can only drag the playback progress bar by trial and error, adjusting its position while listening. The user may therefore have to drag the progress bar several times before it reaches the position corresponding to the voice segment, so the terminal device locates the audio to be played inefficiently. By contrast, the embodiments of the present invention can obtain and display the text content of the audio file, and, when the user triggers the terminal device, through the first input, to locate the first voice data in the target audio file, the terminal device can highlight the text content of the first voice data. The user can therefore refer to the highlighted text content to quickly locate the voice segment he or she wishes to hear, which improves the efficiency with which the terminal device locates audio to be played.

With the method for displaying the text content of audio according to the embodiments of the present invention, the voice data of a target audio file can be recognized to obtain target text content, and the target text content can be displayed on an audio playback interface (the audio playback interface being used to play the target audio file); wherein, when a user triggers the terminal device, through a first input, to locate first voice data in the target audio file, the text content of the first voice data is highlighted. Taking a recording file as the target audio file, the embodiments of the present invention can obtain and display the text content of the recording file, so that the user can see that text content directly while listening to the recording. Moreover, when the user triggers the terminal device, through the first input, to locate the first voice data in the target audio file, the terminal device can highlight the text content of the first voice data. The user can therefore refer to the highlighted text content to quickly locate the voice segment he or she wishes to hear, which improves the efficiency with which the terminal device locates audio to be played.

Optionally, when the user triggers the terminal device, through the first input, to locate the first voice data in the target audio file, the method for displaying the text content of audio according to the embodiments of the present invention may further include the following step S202.

S202: The terminal device starts to play the voice content of the first voice data.

In the embodiments of the present invention, when the user triggers the terminal device, through the first input, to locate the first voice data in the target audio file, the terminal device can highlight the text content of the first voice data and start playing the first voice data. The user can thus refer to the highlighted text content while listening to the first voice data and quickly locate the voice segment he or she wishes to hear, which improves the efficiency with which the terminal device locates audio to be played.

Optionally, as shown in Fig. 5, after S200 the method for displaying the text content of audio according to the embodiments of the present invention may further include the following steps S203 and S204.

S203: The terminal device saves the target text content in correspondence with the time axis of the target audio file.

S204: Each time the target audio file is played, the terminal device highlights, according to the correspondence between the time axis and the target text content and according to the playback progress of the target audio file, the text content in the target text content that corresponds to the playback progress.

The playback progress can be used to indicate the position, on the time axis, of the current playback position of the target audio file.

In the embodiments of the present invention, after obtaining the target text content of the target audio file, the terminal device can establish a correspondence between the target text content and the time axis of the target audio file, and save the correspondence. Illustratively, the terminal device can save the correspondence in the form of a linked list (for example, a lookup table).

Optionally, in the embodiments of the present invention, the terminal device can establish a correspondence between each piece of text in the target text content of the target audio file and a time on the time axis of the target audio file. It should be noted that the embodiments of the present invention do not limit the specific method of saving the target text content in correspondence with the time axis of the target audio file. The terminal device may use any other method that meets actual needs to save the target text content in correspondence with the time axis of the target audio file; this can be determined according to actual usage requirements and is not limited in the embodiments of the present invention.
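By way of illustration only, the correspondence between each piece of text and the time axis could be held in a lookup table such as the one sketched below. The tuple layout `(text, start_ms, end_ms)`, the names `TimedText` and `build_lookup_table`, and the sample timings are assumptions made for this sketch; the embodiments do not prescribe a data structure or a recognizer output format.

```python
from typing import NamedTuple


class TimedText(NamedTuple):
    text: str
    start_ms: int   # start of the matching voice data on the time axis
    end_ms: int     # end of the matching voice data on the time axis


def build_lookup_table(recognized):
    """Turn (text, start_ms, end_ms) tuples into a table ordered by start time."""
    table = [TimedText(*item) for item in recognized]
    for entry in table:
        if entry.end_ms < entry.start_ms:
            raise ValueError(f"negative duration for {entry.text!r}")
    table.sort(key=lambda entry: entry.start_ms)
    return table


# Hypothetical recognizer output, deliberately out of order.
recognized = [("reasoning", 1800, 2600), ("intelligent", 0, 900), ("behavior", 900, 1800)]
table = build_lookup_table(recognized)
```

Keeping the table sorted by start time means a later lookup by playback position can use a binary search rather than a full scan.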

Further, in the embodiments of the present invention, the terminal device can intelligently segment the target text content into sentences according to the time interval between adjacent pieces of text in the target text content, so that sentences are separated from one another. For example, the terminal device can add a space, a comma, or a full stop between sentences, or insert a line break, to achieve the segmentation. The terminal device can then save the lookup table including the punctuation.
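By way of illustration only, gap-based segmentation could look like the sketch below, which inserts a comma or a full stop depending on how long the pause between adjacent words is. The function name `punctuate` and both thresholds (300 ms and 700 ms) are assumptions chosen for this sketch; the embodiments do not specify particular values.

```python
def punctuate(words, comma_gap_ms=300, period_gap_ms=700):
    """Join (text, start_ms, end_ms) words, choosing separators from the pauses.

    `words` must be sorted by start time. The thresholds are illustrative only.
    """
    if not words:
        return ""
    pieces = []
    for i, (text, _start, end) in enumerate(words):
        pieces.append(text)
        if i + 1 < len(words):
            gap = words[i + 1][1] - end  # silence before the next word
            if gap >= period_gap_ms:
                pieces.append(". ")
            elif gap >= comma_gap_ms:
                pieces.append(", ")
            else:
                pieces.append(" ")
    return "".join(pieces) + "."


# Hypothetical word timings with pauses of 50, 400, 50 and 900 ms.
words = [
    ("hello", 0, 400),
    ("world", 450, 900),
    ("this", 1300, 1500),
    ("works", 1550, 2000),
    ("next", 2900, 3200),
]
sentence = punctuate(words)
```

Replacing `". "` with a line break would give the line-break variant mentioned above.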

In the embodiments of the present invention, the time axis of the target audio file is the time information of the target audio file within its total playback duration. Positions on the time axis of the target audio file can correspond one-to-one with positions on the playback progress bar.
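By way of illustration only, the one-to-one correspondence between time-axis positions and progress-bar positions can be sketched as a linear mapping. The function names, the pixel-based bar width, and the clamping behavior are assumptions made for this sketch.

```python
def time_to_bar_x(position_ms, duration_ms, bar_width_px):
    """Map a time-axis position to a horizontal offset on the progress bar."""
    fraction = min(max(position_ms / duration_ms, 0.0), 1.0)
    return round(fraction * bar_width_px)


def bar_x_to_time(x_px, duration_ms, bar_width_px):
    """Map a horizontal offset on the progress bar back to a time-axis position."""
    fraction = min(max(x_px / bar_width_px, 0.0), 1.0)
    return round(fraction * duration_ms)


# Hypothetical 2-minute recording shown on a 400-pixel-wide progress bar.
x = time_to_bar_x(30_000, 120_000, 400)
t = bar_x_to_time(x, 120_000, 400)
```

Values outside the bar are clamped, so a drag that overshoots the end of the bar maps to the end of the recording.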

Illustratively, each time the target audio file is played, if the current playback position (i.e., the playback progress) of the target audio file is position 1 on the time axis, and position 1 on the time axis corresponds to position 2 of the playback progress bar (see Fig. 4 above), the terminal device can determine, according to the correspondence between the time axis of the target audio file and the target text content, that the text content corresponding to position 1 on the time axis is "reasoning" (see Fig. 4 above), and highlight the text content "reasoning".

With the method for displaying the text content of audio according to the embodiments of the present invention, the text content in the target text content that corresponds to the playback progress can be highlighted according to the above correspondence. The user can thus, while listening to the audio file, directly see the text content that corresponds to the playback progress, which helps the user refer to the highlighted text content and quickly locate the voice segment he or she wishes to hear, and thereby improves the efficiency with which the terminal device locates audio to be played.
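By way of illustration only, finding the text that corresponds to the current playback progress can be sketched as a binary search over the saved start times. The function name `text_index_at` and the sample timings are assumptions made for this sketch; `bisect_right` comes from the Python standard library.

```python
from bisect import bisect_right


def text_index_at(start_times_ms, position_ms):
    """Return the index of the last entry starting at or before position_ms.

    `start_times_ms` must be sorted in ascending order. Returns None when the
    position lies before the first entry.
    """
    i = bisect_right(start_times_ms, position_ms) - 1
    return i if i >= 0 else None


# Hypothetical table of (text, start_ms); only "reasoning" is taken from Fig. 4.
table = [("intelligent", 0), ("behavior", 900), ("reasoning", 1800)]
starts = [start for _text, start in table]

index = text_index_at(starts, 2000)  # playback progress at 2.0 s
highlighted = table[index][0]
```

A player could call `text_index_at` on every progress update and re-render only when the returned index changes.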

Optionally, with reference to Fig. 2, as shown in Fig. 6, after the above S201, the text content display method of audio provided in the embodiment of the present invention may further include the following S205-S208.

S205, the terminal device receives a second input from the user on a first control on the audio broadcast interface.

S206, in response to the second input, the terminal device displays an editing sub-interface on the audio broadcast interface.

The editing sub-interface can be used to annotate the text content, in the target text content, that corresponds to a first play position of the target audio file. The first play position can be the position at which the terminal device is playing the target audio file when the terminal device receives the second input.

Optionally, in the embodiment of the present invention, the editing sub-interface may include an annotation function, which may include at least one of the following: adding annotation information, modifying annotation information, and deleting annotation information.

Optionally, in the embodiment of the present invention, the annotation information may include at least one of the following: text information, picture information, video information, voice information, and doodle information. In response to the user's input on the annotation information, the terminal device can add, modify, and/or delete annotation information such as text, pictures, video, voice, or doodles in the editing sub-interface.
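The annotation function can be pictured as a small store that supports the three operations listed above. The sketch below is only one possible arrangement; the class name `AnnotationStore`, the integer identifiers, and the `KINDS` set are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass, field
from itertools import count

# Annotation kinds named in the description; the set itself is an assumption.
KINDS = {"text", "picture", "video", "voice", "doodle"}


@dataclass
class AnnotationStore:
    items: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def add(self, kind, payload):
        """Add annotation information and return its identifier."""
        if kind not in KINDS:
            raise ValueError(f"unsupported annotation kind: {kind}")
        note_id = next(self._ids)
        self.items[note_id] = (kind, payload)
        return note_id

    def modify(self, note_id, payload):
        """Replace the payload of an existing annotation, keeping its kind."""
        kind, _ = self.items[note_id]
        self.items[note_id] = (kind, payload)

    def delete(self, note_id):
        """Remove an annotation."""
        del self.items[note_id]
```

Since the embodiment states that the listed kinds are not exhaustive, a real implementation would likely make the set of kinds extensible rather than fixed.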

It can be understood that the annotation information listed above is only an example; that is, the embodiment of the present invention includes but is not limited to the annotation information listed above. In actual implementation, the annotation information may also include any other possible annotation information, which can be determined according to actual use requirements and is not limited in the embodiment of the present invention.

It should be noted that the embodiment of the present invention is described by taking an editing sub-interface that includes an annotation function as an example. It can be understood that the editing sub-interface in the embodiment of the present invention may also include any other possible editing function, such as modifying the text content. The functions of the editing sub-interface can be determined according to actual use requirements and are not limited in the embodiment of the present invention.

Optionally, in the embodiment of the present invention, the second input of the user may be a click input (such as a single-click or double-click input), or an input in any other possible form, which can be determined according to actual use requirements and is not limited in the embodiment of the present invention. Illustratively, the second input may be the user's click input on the first control (for example, an "annotation" control) on the audio broadcast interface.

S207, the terminal device receives a third input from the user in the editing sub-interface.

S208, in response to the third input, the terminal device displays an annotation mark at a target position in the audio broadcast interface.

The target position may include at least one of the following: the position of a first text content, and the position on the playing progress bar in the audio broadcast interface that corresponds to the first play position. The first text content can be the text content, in the target text content displayed on the audio broadcast interface, that corresponds to the first play position. The playing progress bar can be used to indicate the playback progress of the target audio file.

Illustratively, as shown in Fig. 7, while audio file 1 is being played, assume that when the terminal device receives the user's second input on the "annotation" control, the position at which the terminal device is playing audio file 1 (i.e. the first play position) is position 3 on the playing progress bar. In response to the second input, the terminal device can then display an editing sub-interface 34 on the audio broadcast interface 30. The user can make an input in the editing sub-interface 34 to trigger the terminal device to annotate the text content, in the target text content, that corresponds to the first play position of the target audio file. As shown in Fig. 8, in response to the user's third input in the editing sub-interface 34 (such as adding annotation information), the terminal device can display an annotation mark 35 at the position on the playing progress bar 33 in the audio broadcast interface 30 that corresponds to the first play position (i.e. position 3 on the playing progress bar), and display an annotation mark 36 at the position of the text content corresponding to the first play position ("thinking" as shown in Fig. 8).
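The two target positions in this example can be derived from the first play position alone. The sketch below assumes the same hypothetical table of (start time in seconds, text) pairs sorted by start time and a positive total duration; `mark_positions` and the returned keys are illustrative names, not part of the embodiment.

```python
def mark_positions(table, total_duration, play_position):
    """Return where the two annotation marks would be drawn.

    `play_position` is the first play position in seconds, i.e. the position
    being played when the second input was received.
    """
    # Mark on the playing progress bar, as a fraction of the bar's length.
    bar_fraction = play_position / total_duration
    # Mark on the text: the last entry that starts at or before the position.
    text_index = 0
    for i, (start, _) in enumerate(table):
        if start <= play_position:
            text_index = i
    return {"bar_fraction": bar_fraction, "text_index": text_index}
```

Storing the play position in seconds, rather than the pixel position of the mark, keeps both marks consistent if the progress bar or the text is laid out differently later.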

Optionally, in the embodiment of the present invention, the terminal device can stop displaying the editing sub-interface in response to the user's input on an "OK" control in the editing sub-interface. Alternatively, if the terminal device receives no input from the user within a preset duration, the terminal device stops displaying the editing sub-interface once the preset duration has elapsed. The specific way in which the terminal device stops displaying the editing sub-interface can be determined according to actual use requirements and is not limited in the embodiment of the present invention.
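The two ways of dismissing the editing sub-interface can be sketched with an explicit clock value passed in by the caller. The class name `EditPanel`, the 5-second default, and the choice to restart the idle timer on every input are assumptions; the embodiment only states that a preset duration without input leads to dismissal.

```python
class EditPanel:
    def __init__(self, opened_at, timeout=5.0):
        self.visible = True
        self.timeout = timeout        # preset duration in seconds (assumed value)
        self.last_input = opened_at   # time of the most recent user input

    def on_input(self, now, confirm=False):
        """Record a user input; an input on the "OK" control closes the panel."""
        self.last_input = now
        if confirm:
            self.visible = False

    def tick(self, now):
        """Called periodically; closes the panel once it has been idle too long."""
        if self.visible and now - self.last_input >= self.timeout:
            self.visible = False
```

Passing `now` explicitly instead of reading a system clock keeps the logic deterministic and easy to exercise in a test.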

In the embodiment of the present invention, after the terminal device stops displaying the editing sub-interface, the terminal device can continue to display the annotation mark. In response to the user's input on the annotation mark, the terminal device can display the editing sub-interface, which includes the annotation information corresponding to that annotation mark. The user can then modify or delete the annotation information in the editing sub-interface, or add further annotation information, to trigger the terminal device to edit the annotation information.

Optionally, with reference to Fig. 2, as shown in Fig. 9, after the above S201, the text content display method of audio provided in the embodiment of the present invention may further include the following S209-S212.

S209, the terminal device receives a fourth input from the user on a second control on the audio broadcast interface.

S210, in response to the fourth input, the terminal device displays an interception control on the audio broadcast interface.

The interception control is used to intercept a voice data segment in the target audio file.

In the embodiment of the present invention, the fourth input of the user may be a click input (such as a single-click or double-click input), or an input in any other possible form, which can be determined according to actual use requirements and is not limited in the embodiment of the present invention. Illustratively, the fourth input may be the user's click input on the second control (for example, a "save" control or a "share" control) on the audio broadcast interface. That is, through an input on the "save" control or the "share" control on the audio broadcast interface, the user can trigger the terminal device to display the interception control on the audio broadcast interface, and the terminal device can then save or share the intercepted voice data segment.

Optionally, in the embodiment of the present invention, as shown in Fig. 10, the interception control may include a first interception child control 37 and a second interception child control 38. The terminal device can display the first interception child control 37 and the second interception child control 38 on the playing progress bar 33 in the audio broadcast interface 30, and also display the first interception child control 37 and the second interception child control 38 at the corresponding positions in the text content in the audio broadcast interface 30. A voice data segment can thus be intercepted according to the user's actual needs, which improves the flexibility and convenience of human-computer interaction.

S211, the terminal device receives a fifth input from the user on the interception control.

S212, in response to the fifth input, the terminal device saves a target voice data segment whose interception is triggered by the fifth input, a target text content, and a target annotation mark.

Here, the target text content is the text content of the target voice data segment, and the target annotation mark may include at least one of the following: the annotation mark that corresponds to the target voice data segment and is displayed on the playing progress bar, and the annotation mark displayed at the position of the target text content.

In the embodiment of the present invention, the fifth input of the user may be a drag input, or an input in any other possible form, which can be determined according to actual use requirements and is not limited in the embodiment of the present invention. Illustratively, as shown in Fig. 10, the fifth input may include a drag input on the first interception child control 37, a drag input on the second interception child control 38, or drag inputs on both the first interception child control 37 and the second interception child control 38. In response to the user's input, the terminal device can then intercept and save the voice data segment between the first interception child control 37 and the second interception child control 38, the text content of that voice data segment, and the annotation mark corresponding to that voice data segment.
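The interception between the two child controls can be sketched as a filter over the saved correspondence and the annotation marks. The sketch assumes that both child controls have already been converted to positions in seconds; `intercept`, the table layout, and the returned dictionary are hypothetical, and the audio itself is represented only by its time range.

```python
def intercept(table, marks, start, end):
    """Cut the clip between two handle positions given in seconds.

    `table` is a list of (start_seconds, text) pairs sorted by start time;
    `marks` is a list of (seconds, note) pairs for annotation marks.
    Times in the result are shifted so that the clip starts at 0.
    """
    if start > end:  # the handles may have been dragged past each other
        start, end = end, start
    text = [(t - start, words) for t, words in table if start <= t < end]
    kept = [(t - start, note) for t, note in marks if start <= t < end]
    return {"audio_range": (start, end), "text": text, "marks": kept}
```

This simplified version keeps only text entries that begin inside the selected range; an entry that begins before the first handle but is still being spoken at that point is dropped, which a fuller implementation might handle differently.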

Further, Fig. 11 is a schematic diagram of an audio broadcast interface 40 of the terminal device after the interception is performed. The audio broadcast interface 40 may include a playing progress bar 41 of the intercepted voice data segment, the text content 42 of the voice data segment, an annotation mark 43 that corresponds to the voice data segment and is displayed on the playing progress bar, and an annotation mark 44 displayed at the position of the text content. The text content display method of audio provided in the embodiment of the present invention also applies to a voice segment of audio file 1; that is, the voice segment of audio file 1 can be edited a second time. For details, refer to the detailed description of the above method embodiments, which is not repeated here.

It should be noted that each of the drawings in the above embodiments of the present invention (such as Fig. 6 and Fig. 9) is illustrated in combination with Fig. 2 above; in specific implementation, each drawing may also be implemented in combination with any other drawing with which it can be combined.

As shown in Fig. 12, the embodiment of the present invention provides a terminal device, which may include an identification module 701 and a display module 702. The identification module 701 is configured to identify the voice data of a target audio file to obtain a target text content. The display module 702 is configured to display, on an audio broadcast interface, the target text content obtained by the identification module 701, where the audio broadcast interface is used to play the target audio file. In the case where the user, through a first input, triggers the terminal device to locate first voice data in the target audio file, the text content of the first voice data is highlighted.

Optionally, with reference to Fig. 12, as shown in Fig. 13, the terminal device provided in the embodiment of the present invention may further include a saving module 703. The saving module 703 is configured to, after the identification module 701 obtains the target text content, save the target text content in correspondence with the time axis of the target audio file. The display module 702 is further configured to, each time the target audio file is played, highlight the text content in the target text content that corresponds to the playback progress of the target audio file, according to the correspondence between the time axis and the target text content saved by the saving module 703. The playback progress is used to indicate the position of the current play position of the target audio file on the time axis.

Optionally, with reference to Fig. 12, as shown in Fig. 14, the terminal device provided in the embodiment of the present invention may further include a receiving module 704. The receiving module 704 is configured to receive, after the display module 702 displays the target text content on the audio broadcast interface, a second input from the user on a first control on the audio broadcast interface. The display module 702 is further configured to display, in response to the second input received by the receiving module 704, an editing sub-interface on the audio broadcast interface. The editing sub-interface is used to annotate the text content, in the target text content, that corresponds to a first play position of the target audio file, and the first play position is the position at which the terminal device is playing the target audio file when the terminal device receives the second input. The receiving module 704 is further configured to receive a third input from the user in the editing sub-interface displayed by the display module 702. The display module 702 is further configured to display, in response to the third input received by the receiving module 704, an annotation mark at a target position in the audio broadcast interface. The target position may include at least one of the following: the position of a first text content, and the position on the playing progress bar in the audio broadcast interface that corresponds to the first play position. The first text content is the text content, in the target text content displayed on the audio broadcast interface, that corresponds to the first play position, and the playing progress bar is used to indicate the playback progress of the target audio file.

Optionally, in the embodiment of the present invention, the receiving module 704 is further configured to receive, after the display module 702 displays the target text content on the audio broadcast interface, a fourth input from the user on a second control on the audio broadcast interface. The display module 702 is further configured to display, in response to the fourth input received by the receiving module 704, an interception control on the audio broadcast interface, where the interception control is used to intercept a voice data segment in the target audio file. The receiving module 704 is further configured to receive a fifth input from the user on the interception control displayed by the display module 702. The saving module 703 is further configured to save, in response to the fifth input received by the receiving module 704, a target voice data segment whose interception is triggered by the fifth input, a target text content, and a target annotation mark. Here, the target text content is the text content of the target voice data segment, and the target annotation mark includes at least one of the following: the annotation mark that corresponds to the target voice data segment and is displayed on the playing progress bar, and the annotation mark displayed at the position of the target text content.

Optionally, with reference to Fig. 14, as shown in Fig. 15, the terminal device provided in the embodiment of the present invention may further include a playing module 705. The playing module 705 can be configured to start playing the voice content of the first voice data in the case where the user, through the first input, triggers the terminal device to locate the first voice data in the target audio file.

The terminal device provided in the embodiment of the present invention can implement each process implemented by the terminal device in the above method embodiments. To avoid repetition, details are not described here again.

The terminal device provided in the embodiment of the present invention can identify the voice data of a target audio file to obtain a target text content, and display the target text content on an audio broadcast interface (the audio broadcast interface is used to play the target audio file). In the case where the user, through a first input, triggers the terminal device to locate first voice data in the target audio file, the text content of the first voice data is highlighted. Taking a recording file as the target audio file, the embodiment of the present invention can obtain the text content of the recording file and display it, so that the user can intuitively see the text content of the recording file while listening to it. When the user, through the first input, triggers the terminal device to locate the first voice data in the target audio file, the terminal device can highlight the text content of the first voice data. The user can therefore refer to the highlighted text content to quickly locate the voice segment that the user wants to listen to, which improves the efficiency with which the terminal device locates the audio to be played.

Fig. 16 is a schematic diagram of the hardware structure of a terminal device that implements the embodiments of the present invention. As shown in Fig. 16, the terminal device 800 includes, but is not limited to, a radio frequency unit 801, a network module 802, an audio output unit 803, an input unit 804, a sensor 805, a display unit 806, a user input unit 807, an interface unit 808, a memory 809, a processor 810, a power supply 811, and other components. Those skilled in the art will understand that the terminal device structure shown in Fig. 16 does not constitute a limitation on the terminal device; the terminal device may include more or fewer components than illustrated, combine certain components, or arrange the components differently. In the embodiment of the present invention, the terminal device includes, but is not limited to, a mobile phone, a tablet computer, a laptop, a palmtop computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.

The processor 810 is configured to identify the voice data of a target audio file to obtain a target text content. The display unit 806 is configured to display, on an audio broadcast interface, the target text content obtained by the processor 810, where the audio broadcast interface is used to play the target audio file. In the case where the user, through a first input, triggers the terminal device to locate first voice data in the target audio file, the text content of the first voice data is highlighted.

The embodiment of the present invention provides a terminal device that can identify the voice data of a target audio file to obtain a target text content, and display the target text content on an audio broadcast interface (the audio broadcast interface is used to play the target audio file). In the case where the user, through a first input, triggers the terminal device to locate first voice data in the target audio file, the text content of the first voice data is highlighted. Taking a recording file as the target audio file, the embodiment of the present invention can obtain the text content of the recording file and display it, so that the user can intuitively see the text content of the recording file while listening to it. When the user, through the first input, triggers the terminal device to locate the first voice data in the target audio file, the terminal device can highlight the text content of the first voice data. The user can therefore refer to the highlighted text content to quickly locate the voice segment that the user wants to listen to, which improves the efficiency with which the terminal device locates the audio to be played.

It should be understood that, in the embodiment of the present invention, the radio frequency unit 801 can be used to receive and send signals in the course of sending and receiving information or during a call. Specifically, after receiving downlink data from a base station, the radio frequency unit 801 passes the data to the processor 810 for processing; in addition, it sends uplink data to the base station. Generally, the radio frequency unit 801 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, and a duplexer. In addition, the radio frequency unit 801 can also communicate with a network and other devices through a wireless communication system.

The terminal device 800 provides the user with wireless broadband Internet access through the network module 802, for example helping the user send and receive e-mail, browse web pages, and access streaming video.

The audio output unit 803 can convert audio data received by the radio frequency unit 801 or the network module 802, or stored in the memory 809, into an audio signal and output it as sound. Moreover, the audio output unit 803 can also provide audio output related to a specific function performed by the terminal device 800 (for example, a call signal reception sound or a message reception sound). The audio output unit 803 includes a loudspeaker, a buzzer, a receiver, and the like.

The input unit 804 is used to receive audio or video signals. The input unit 804 may include a graphics processing unit (GPU) 8041 and a microphone 8042. The graphics processor 8041 processes the image data of static pictures or video obtained by an image capture apparatus (such as a camera) in a video capture mode or an image capture mode. The processed image frames can be displayed on the display unit 806. The image frames processed by the graphics processor 8041 can be stored in the memory 809 (or another storage medium) or sent via the radio frequency unit 801 or the network module 802. The microphone 8042 can receive sound and process it into audio data. In a telephone call mode, the processed audio data can be converted into a format that can be sent to a mobile communication base station via the radio frequency unit 801, and then output.

The terminal device 800 further includes at least one sensor 805, such as an optical sensor, a motion sensor, and other sensors. Specifically, the optical sensor includes an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 8061 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 8061 and/or the backlight when the terminal device 800 is moved to the ear. As one kind of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in all directions (generally three axes), and can detect the magnitude and direction of gravity when stationary; it can be used to identify the posture of the terminal device (such as switching between landscape and portrait, related games, and magnetometer posture calibration) and for vibration-recognition functions (such as a pedometer or tapping). The sensor 805 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecule sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and the like, which are not described here.

The display unit 806 is used to display information input by the user or information provided to the user. The display unit 806 may include a display panel 8061, which can be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.

The user input unit 807 can be used to receive input numeric or character information, and to generate key signal input related to user settings and function control of the terminal device. Specifically, the user input unit 807 includes a touch panel 8071 and other input devices 8072. The touch panel 8071, also called a touch screen, collects touch operations by the user on or near it (for example, operations performed by the user on or near the touch panel 8071 with a finger, a stylus, or any other suitable object or accessory). The touch panel 8071 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection apparatus, converts it into contact coordinates, sends them to the processor 810, and receives and executes commands sent by the processor 810. In addition, the touch panel 8071 can be implemented in multiple types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 8071, the user input unit 807 may also include other input devices 8072. Specifically, the other input devices 8072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and a switch key), a trackball, a mouse, and a joystick, which are not described here.

Further, the touch panel 8071 can be overlaid on the display panel 8061. When the touch panel 8071 detects a touch operation on or near it, it transmits the operation to the processor 810 to determine the type of the touch event, and the processor 810 then provides a corresponding visual output on the display panel 8061 according to the type of the touch event. Although in Fig. 16 the touch panel 8071 and the display panel 8061 are two independent components that implement the input and output functions of the terminal device, in some embodiments the touch panel 8071 and the display panel 8061 can be integrated to implement the input and output functions of the terminal device, which is not limited here.

The interface unit 808 is an interface through which an external device is connected to the terminal device 800. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 808 can be used to receive input (for example, data information or power) from an external device and transfer the received input to one or more elements in the terminal device 800, or can be used to transmit data between the terminal device 800 and an external device.

The memory 809 can be used to store software programs and various data. The memory 809 may mainly include a program storage area and a data storage area. The program storage area can store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like; the data storage area can store data created according to the use of the mobile phone (such as audio data or a phone book), and the like. In addition, the memory 809 may include a high-speed random access memory, and may also include a non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.

The processor 810 is the control centre of the terminal device. It connects all parts of the entire terminal device through various interfaces and lines, and performs the various functions of the terminal device and processes data by running or executing the software programs and/or modules stored in the memory 809 and calling the data stored in the memory 809, thereby monitoring the terminal device as a whole. The processor 810 may include one or more processing units; optionally, the processor 810 can integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 810.

The terminal device 800 may also include a power supply 811 (such as a battery) that supplies power to each component. Optionally, the power supply 811 can be logically connected to the processor 810 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.

In addition, the terminal device 800 includes some functional modules that are not shown, which are not described here.

Optionally, the embodiment of the present invention also provides a terminal device, including the processor 810 and the memory 809 shown in Fig. 16, and a computer program that is stored in the memory 809 and can run on the processor 810. When executed by the processor 810, the computer program implements each process of the above embodiments of the text content display method of audio and can achieve the same technical effect. To avoid repetition, details are not described here again.

The embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above embodiments of the text content display method of audio and can achieve the same technical effect. To avoid repetition, details are not described here again. The computer-readable storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.

It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes that element.

From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present invention.

The embodiments of the present invention are described above with reference to the drawings, but the present invention is not limited to the specific embodiments described above, which are merely illustrative rather than restrictive. Under the inspiration of the present invention, those of ordinary skill in the art can devise many other forms without departing from the purpose of the present invention and the scope protected by the claims, all of which fall within the protection of the present invention.

Claims

1. A text content display method of audio, applied to a terminal device, characterized in that the method comprises:

identifying the voice data of a target audio file to obtain a target text content;

displaying the target text content on an audio broadcast interface, the audio broadcast interface being used to play the target audio file;

wherein, in the case where a user, through a first input, triggers the terminal device to locate first voice data in the target audio file, the text content of the first voice data is highlighted.

2. The method according to claim 1, characterized in that, after the target text content is obtained, the method further comprises:

Saving the target text content in correspondence with the time axis of the target audio file;

Each time the target audio file is played, highlighting, according to the correspondence between the time axis and the target text content and according to the playback progress of the target audio file, the text content in the target text content that corresponds to the playback progress;

Wherein the playback progress indicates the position of the current play position of the target audio file on the time axis.
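One way to picture the saved correspondence of claim 2 is a sorted list of start times, each paired with a piece of text, that is stored once and looked up on every playback. The sketch below is an assumption made for illustration: the JSON layout and the names `save_timeline`, `load_timeline`, and `text_at` are hypothetical, and the claim does not prescribe any storage format or search technique.

```python
import bisect
import json
from typing import List, Optional, Tuple


def save_timeline(entries: List[Tuple[float, str]]) -> str:
    """Serialize (start time in seconds, text) pairs, sorted by start time."""
    ordered = sorted(entries)
    return json.dumps([{"start_s": s, "text": t} for s, t in ordered])


def load_timeline(blob: str) -> Tuple[List[float], List[str]]:
    """Restore the parallel lists of start times and texts."""
    records = json.loads(blob)
    return [r["start_s"] for r in records], [r["text"] for r in records]


def text_at(starts: List[float], texts: List[str],
            progress_s: float) -> Optional[str]:
    """Return the text whose start time is the latest one <= progress_s."""
    index = bisect.bisect_right(starts, progress_s) - 1
    return texts[index] if index >= 0 else None


# Saved once after recognition, then reused each time the file is played.
blob = save_timeline([(2.0, "second"), (0.0, "first"), (5.0, "third")])
starts, texts = load_timeline(blob)
print(text_at(starts, texts, 4.9))
```

Because the start times are kept sorted, the binary search finds the text for the current playback progress in logarithmic time, which keeps the lookup cheap when it runs on every progress update.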

3. The method according to claim 2, characterized in that, after the target text content is displayed on the audio broadcast interface, the method further comprises:

Receiving a second input by the user on a first control on the audio broadcast interface;

In response to the second input, displaying an editing sub-interface on the audio broadcast interface, the editing sub-interface being used to annotate the text content in the target text content that corresponds to a first play position of the target audio file, the first play position being the position at which the terminal device is playing the target audio file when the terminal device receives the second input;

Receiving a third input by the user in the editing sub-interface;

In response to the third input, displaying an annotation mark at a target position in the audio broadcast interface, the target position including at least one of the following: the position of a first text content, and the position on a playing progress bar in the audio broadcast interface that corresponds to the first play position, wherein the first text content is the text content, in the target text content displayed on the audio broadcast interface, that corresponds to the first play position, and the playing progress bar is used to indicate the playback progress of the target audio file.
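The two possible target positions named in claim 3 can be derived from the first play position alone. The following sketch assumes a sorted list of segment start times and a known total duration; `AnnotationMark` and `annotate` are hypothetical names introduced here, not elements of the patent.

```python
import bisect
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AnnotationMark:
    """Where an annotation mark is shown (hypothetical structure)."""
    note: str                 # annotation entered in the editing sub-interface
    play_position_s: float    # first play position, in seconds
    segment_index: int        # text segment at which the mark is shown
    progress_fraction: float  # position on the progress bar, from 0.0 to 1.0


def annotate(segment_starts: List[float], duration_s: float,
             play_position_s: float, note: str) -> AnnotationMark:
    """Build the mark for the position being played when the input arrived."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    # Latest segment that starts at or before the play position.
    index = max(bisect.bisect_right(segment_starts, play_position_s) - 1, 0)
    # Clamp so the mark always stays on the progress bar.
    fraction = min(max(play_position_s / duration_s, 0.0), 1.0)
    return AnnotationMark(note, play_position_s, index, fraction)


mark = annotate([0.0, 2.0, 5.0], 10.0, 2.5, "check this figure")
print(mark.segment_index, mark.progress_fraction)
```

Storing both the segment index and the progress fraction lets the interface draw the mark at either position, or at both, without recomputing them.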

4. The method according to claim 3, characterized in that, after the target text content is displayed on the audio broadcast interface, the method further comprises:

Receiving a fourth input by the user on a second control on the audio broadcast interface;

In response to the fourth input, displaying an interception control on the audio broadcast interface, the interception control being used to intercept a voice data segment in the target audio file;

Receiving a fifth input by the user on the interception control;

In response to the fifth input, saving the target voice data segment whose interception is triggered by the fifth input, together with target text content and a target annotation mark;

Wherein the target text content is the text content of the target voice data segment, and the target annotation mark includes at least one of the following: the annotation mark corresponding to the target voice data segment displayed on the playing progress bar, and the annotation mark displayed at the position of the target text content.
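The saving step of claim 4 can be sketched as cutting three things over the same time range: the audio samples, the text, and the annotation marks. The sketch assumes the audio is available as a flat list of samples with a known sample rate; the `intercept` function and its dictionary layout are hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Tuple


def intercept(samples: List[int], sample_rate: int,
              segments: List[Tuple[float, str]],
              annotations: List[Tuple[float, str]],
              start_s: float, end_s: float) -> Dict[str, list]:
    """Collect the samples, text, and annotations inside [start_s, end_s)."""
    if not 0 <= start_s < end_s:
        raise ValueError("expected 0 <= start_s < end_s")
    lo = int(start_s * sample_rate)
    hi = int(end_s * sample_rate)
    return {
        "samples": samples[lo:hi],
        # Simplification: a text segment is kept only if it starts in range.
        "text": [text for s, text in segments if start_s <= s < end_s],
        "annotations": [(p, n) for p, n in annotations
                        if start_s <= p < end_s],
    }


clip = intercept(
    samples=list(range(40)),  # 10 s of placeholder audio at 4 samples/s
    sample_rate=4,
    segments=[(0.0, "first"), (2.0, "second"), (5.0, "third")],
    annotations=[(2.5, "check this figure"), (7.0, "follow up")],
    start_s=2.0,
    end_s=5.0,
)
print(len(clip["samples"]), clip["text"], clip["annotations"])
```

Keeping the text and annotations alongside the clipped samples means the saved segment can later be displayed with its text and marks without consulting the original file.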

5. The method according to any one of claims 1 to 4, characterized in that, in a case where the user triggers, through the first input, the terminal device to locate the first voice data in the target audio file, the method further comprises:

Starting to play the voice content of the first voice data.

6. A terminal device, characterized in that the terminal device comprises an identification module and a display module;

The identification module is configured to recognize voice data of a target audio file to obtain target text content;

The display module is configured to display, on an audio broadcast interface, the target text content obtained by the identification module, the audio broadcast interface being used to play the target audio file;

7. The terminal device according to claim 6, characterized in that the terminal device further comprises a preserving module;

The preserving module is configured to save, after the identification module obtains the target text content, the target text content in correspondence with the time axis of the target audio file;

The display module is further configured to, each time the target audio file is played, highlight, according to the correspondence between the time axis and the target text content saved by the preserving module and according to the playback progress of the target audio file, the text content in the target text content that corresponds to the playback progress; wherein the playback progress indicates the position of the current play position of the target audio file on the time axis.

8. The terminal device according to claim 7, characterized in that the terminal device further comprises a receiving module;

The receiving module is configured to receive, after the display module displays the target text content on the audio broadcast interface, a second input by the user on a first control on the audio broadcast interface;

The display module is further configured to display, in response to the second input received by the receiving module, an editing sub-interface on the audio broadcast interface, the editing sub-interface being used to annotate the text content in the target text content that corresponds to a first play position of the target audio file, the first play position being the position at which the terminal device is playing the target audio file when the terminal device receives the second input;

The receiving module is further configured to receive a third input by the user in the editing sub-interface displayed by the display module;

The display module is further configured to display, in response to the third input received by the receiving module, an annotation mark at a target position in the audio broadcast interface, the target position including at least one of the following: the position of a first text content, and the position on a playing progress bar in the audio broadcast interface that corresponds to the first play position, wherein the first text content is the text content, in the target text content displayed on the audio broadcast interface, that corresponds to the first play position, and the playing progress bar is used to indicate the playback progress of the target audio file.

9. The terminal device according to claim 8, characterized in that:

The receiving module is further configured to receive, after the display module displays the target text content on the audio broadcast interface, a fourth input by the user on a second control on the audio broadcast interface;

The display module is further configured to display, in response to the fourth input received by the receiving module, an interception control on the audio broadcast interface, the interception control being used to intercept a voice data segment in the target audio file;

The receiving module is further configured to receive a fifth input by the user on the interception control displayed by the display module;

The preserving module is further configured to save, in response to the fifth input received by the receiving module, the target voice data segment whose interception is triggered by the fifth input, together with target text content and a target annotation mark;

10. The terminal device according to any one of claims 6 to 9, characterized in that the terminal device further comprises a playing module;

The playing module is configured to start playing the voice content of the first voice data in a case where the user triggers, through the first input, the terminal device to locate the first voice data in the target audio file.
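The module division in the device claims can be pictured as a small composition of classes. The sketch below covers only the identification, display, and playing modules, and it assumes a recognizer supplied as a plain callable; all class names, method names, and the fake recognizer are hypothetical and do not reflect an actual implementation by the applicant.

```python
from typing import Callable, List, Optional, Tuple

Segment = Tuple[float, str]  # (start time in seconds, recognized text)


class IdentificationModule:
    """Wraps a recognizer; the recognizer itself is an assumed dependency."""

    def __init__(self, recognizer: Callable[[bytes], List[Segment]]) -> None:
        self._recognizer = recognizer

    def identify(self, audio: bytes) -> List[Segment]:
        return sorted(self._recognizer(audio))


class DisplayModule:
    """Holds what the audio broadcast interface would show."""

    def __init__(self) -> None:
        self.lines: List[str] = []
        self.highlighted: Optional[int] = None

    def show(self, segments: List[Segment]) -> None:
        self.lines = [text for _, text in segments]

    def highlight(self, index: int) -> None:
        self.highlighted = index


class PlayingModule:
    """Tracks the position from which playback starts."""

    def __init__(self) -> None:
        self.position_s = 0.0

    def play_from(self, position_s: float) -> None:
        self.position_s = position_s


class TerminalDevice:
    def __init__(self, recognizer: Callable[[bytes], List[Segment]]) -> None:
        self.identification = IdentificationModule(recognizer)
        self.display = DisplayModule()
        self.playing = PlayingModule()
        self.segments: List[Segment] = []

    def open(self, audio: bytes) -> None:
        """Recognize the audio and display its text content."""
        self.segments = self.identification.identify(audio)
        self.display.show(self.segments)

    def on_first_input(self, index: int) -> None:
        """Locate a segment: highlight its text and start playing it."""
        start_s, _ = self.segments[index]
        self.display.highlight(index)
        self.playing.play_from(start_s)


# A fake recognizer stands in for real speech recognition.
device = TerminalDevice(lambda audio: [(2.0, "second"), (0.0, "first")])
device.open(b"")
device.on_first_input(1)
print(device.display.lines, device.display.highlighted,
      device.playing.position_s)
```

Passing the recognizer in as a callable keeps the sketch independent of any particular speech recognition library and makes each module testable on its own.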