CN109325147A - A kind of information processing method and device - Google Patents

A kind of information processing method and device

Info

Publication number
CN109325147A
Authority
CN
China
Prior art keywords
audio
text
output
content
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811162230.2A
Other languages
Chinese (zh)
Inventor
赵海娇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201811162230.2A priority Critical patent/CN109325147A/en
Publication of CN109325147A publication Critical patent/CN109325147A/en
Pending legal-status Critical Current

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

This application discloses an information processing method and device. The method includes: determining content to be output; and playing the content to be output according to a determined play mode. The play mode includes at least a first mode and a second mode. In the first mode, a first audio corresponding to the content to be output is output when the content to be output is played. In the second mode, a second audio is output when the content to be output is played, the second audio being obtained by processing the first audio.

Description

An information processing method and device
Technical field
This application relates to the technical field of information processing, and in particular to an information processing method and device.
Background technique
Currently, when playing an audio or video file, an electronic device supports only a single play mode based on the original audio or video. Because the user has no other choice, confusion can easily arise; for example, a person with a language impairment or a learner may find it laborious to understand the original audio or video.
Summary of the invention
In view of this, the application provides an information processing method and device that can play different audios according to different play modes, helping to improve the user experience.
To achieve the above objectives, the technical solutions of the application are implemented as follows:
An embodiment of the present application provides an information processing method, comprising:
determining content to be output; and
playing the content to be output according to a determined play mode;
wherein the play mode includes at least a first mode and a second mode;
in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; and
in the second mode, a second audio is output when the content to be output is played, the second audio being an audio obtained by processing the first audio.
In the above scheme, optionally, outputting the second audio when playing the content to be output in the second mode comprises:
recognizing the first audio to obtain a first text, or obtaining the first text from the content to be output;
processing the first text to obtain a second text; and
generating the second audio according to the second text.
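The three-step flow above (recognize the first audio into a first text, transform it into a second text, synthesize the second audio) can be sketched as a small pipeline. This is a minimal illustration, not the patented implementation: `process_text` and `generate_second_audio` are hypothetical stand-ins for a real transformation rule set and a real TTS engine.

```python
# Sketch of the second-mode pipeline: first text -> second text -> second audio.
# All names are illustrative assumptions; string/byte stand-ins replace real
# ASR and TTS components.

def process_text(first_text: str) -> str:
    # Placeholder transformation: contract "I am" -> "I'm", one of the
    # language-skill examples given later in the description.
    return first_text.replace("I am", "I'm")

def generate_second_audio(second_text: str) -> bytes:
    # Stand-in for a TTS engine: encode the text as bytes.
    return second_text.encode("utf-8")

def second_mode_output(first_text: str) -> bytes:
    second_text = process_text(first_text)      # first text -> second text
    return generate_second_audio(second_text)   # second text -> second audio

print(second_mode_output("I am a student"))  # -> b"I'm a student"
```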
In the above scheme, optionally, processing the first text to obtain the second text comprises:
the first text corresponding to a first language parameter and the second text corresponding to a second language parameter;
or
the first text and the second text corresponding to the same language parameter, with the language-skill parameter of the first text differing from that of the second text.
In the above scheme, optionally, processing the first text to obtain the second text comprises:
recognizing the first audio to obtain the first text, the first audio corresponding to a first language parameter;
and generating the second audio according to the second text comprises:
the first text and the second text being identical; and
generating the second audio according to the second text, the second audio corresponding to a second language parameter.
In the above scheme, optionally, recognizing the first audio to obtain the first text comprises:
obtaining the first text corresponding to the content of the first audio, and obtaining auxiliary information of the first audio;
and generating the second audio according to the second text comprises:
generating the second audio according to the second text using the auxiliary information.
In the above scheme, optionally, the method further includes:
determining a target object in the content to be output, wherein the target object is an object that needs to be interpreted;
extracting a target audio and/or a target text corresponding to the target object;
determining at least one related text that matches the target text, and/or at least one related audio that matches the target audio;
obtaining a correspondence to be output based on the target audio, the target text, and the at least one related text, and/or based on the target text, the target audio, and the at least one related audio; and
outputting the correspondence in a preset form when an output condition is met.
In the above scheme, optionally, outputting the correspondence in a preset form when the output condition is met comprises:
outputting the correspondence in a preset form when a trigger operation is received; or
automatically outputting the correspondence in a preset form when playback of the content to be output ends.
An embodiment of the present application provides an information processing device, comprising:
a memory for storing content to be output; and
a processor for playing the content to be output according to a determined play mode; wherein the play mode includes at least a first mode and a second mode; in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; and in the second mode, a second audio is output when the content to be output is played, the second audio being an audio obtained by processing the first audio.
In the above scheme, optionally, the processor is further configured to: determine a target object in the content to be output, wherein the target object is an object that needs to be interpreted; extract a target audio and a target text corresponding to the target object; determine at least one related text that matches the target text, and/or at least one related audio that matches the target audio; and obtain a correspondence to be output based on the target audio, the target text, and the at least one related text, and/or the target text, the target audio, and the at least one related audio.
The device further includes:
an output device for outputting the correspondence in a preset form when an output condition is met.
An embodiment of the present application provides an information processing device, comprising:
a determination unit for determining content to be output; and
a playback unit for playing the content to be output according to a determined play mode;
wherein the play mode includes at least a first mode and a second mode;
in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; and
in the second mode, a second audio is output when the content to be output is played, the second audio being an audio obtained by processing the first audio.
An embodiment of the present application also provides a computer storage medium storing computer-executable instructions, the computer-executable instructions being used to execute the information processing method described in the embodiments of the present application.
With the technical solutions of the embodiments of the present application, content to be output is determined and played according to a determined play mode, where the play mode includes at least a first mode and a second mode: in the first mode, a first audio corresponding to the content to be output is output when the content is played; in the second mode, a second audio, obtained by processing the first audio, is output when the content is played. In this way, dual-mode playback is supported and different audios can be played according to different play modes, which helps to improve the user experience.
Detailed description of the invention
Fig. 1 is a first schematic flowchart of an information processing method provided by an embodiment of the present application;
Fig. 2 is a second schematic flowchart of an information processing method provided by an embodiment of the present application;
Fig. 3 is a first schematic structural diagram of an information processing device provided by an embodiment of the present application;
Fig. 4 is a second schematic structural diagram of an information processing device provided by an embodiment of the present application.
Specific embodiment
In order to more fully understand the features and technical content of the application, the implementation of the application is described in detail below with reference to the accompanying drawings, which are provided for reference and illustration only and are not intended to limit the application.
The technical solutions of the application are further elaborated below with reference to the drawings and specific embodiments.
Embodiment one
This embodiment provides an information processing method. As shown in Fig. 1, the method mainly includes the following steps:
Step 101: determining content to be output.
In this embodiment, the content to be output may refer to audio information to be output or to video information to be output.
In this embodiment, the content to be output may be preset, or may be determined in real time according to playback progress.
In some alternative embodiments, determining the content to be output comprises:
buffering in advance the content of a preset file within a preset time period; and
determining the content within the preset time period as the content to be output.
The preset file may be either a video file or an audio file.
Here, the duration corresponding to the preset time period may be preset, for example set to T; the specific value of T may also be set or adjusted according to actual usage, user demand, or design requirements. That is, after the starting time point of T is determined, the content of the preset file within the time range of duration T is buffered.
Illustratively, a playback application that supports multi-mode playback is installed on a terminal. When the playback application receives a user request to play a certain preset file, for example when the user clicks a shortcut key for that preset file in the playback application, the content to be output is determined based on that operation.
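Assuming the preset file can be modeled as timestamped chunks, the buffering step above reduces to selecting the chunks that fall inside a window of duration T starting at the chosen time point. The data layout is an illustrative assumption, not the patented format.

```python
# Minimal sketch: the preset file as (start_second, chunk) pairs; the content
# to be output is whatever falls inside the window [t0, t0 + T).

def buffer_window(chunks, t0, T):
    return [c for start, c in chunks if t0 <= start < t0 + T]

file_chunks = [(0, "a"), (5, "b"), (10, "c"), (15, "d")]
print(buffer_window(file_chunks, 5, 10))  # -> ['b', 'c']
```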
Step 102: play the content to be output according to the determined play mode; wherein the play mode includes at least a first mode and a second mode; in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; in the second mode, a second audio is output when the content to be output is played, the second audio being an audio obtained by processing the first audio.
Here, the first mode can be understood as a normal mode in which the original audio corresponding to the preset file is played.
That is, in the first mode, the original audio corresponding to the content to be output is not modified.
Here, the second mode can be understood as a special mode in which an audio obtained by processing the original audio corresponding to the preset file is played.
That is, in the second mode, the original audio corresponding to the content to be output first needs to be modified, and the modified audio is then played.
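A minimal sketch of the mode dispatch just described, with "audio" modeled as a plain string and the second-mode processing reduced to a placeholder transformation (the names are illustrative, not from the patent):

```python
# First mode: return the original audio unchanged.
# Second mode: transform the original audio first, then return it.

def transform(audio: str) -> str:
    # Placeholder for "processing the first audio to obtain the second audio".
    return audio.upper()

def play(content_audio: str, mode: str) -> str:
    if mode == "first":
        return content_audio             # original audio, unmodified
    if mode == "second":
        return transform(content_audio)  # modified audio
    raise ValueError(f"unknown mode: {mode}")

print(play("hello", "first"), play("hello", "second"))  # -> hello HELLO
```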
In some optional embodiments, outputting the second audio when playing the content to be output in the second mode comprises:
recognizing the first audio to obtain a first text, or obtaining the first text from the content to be output;
processing the first text to obtain a second text; and
generating the second audio according to the second text.
That is, there are at least two ways to obtain the first text: one is to obtain it by recognizing the first audio, and the other is to obtain it directly from the content to be output.
In a first class of optional embodiments, processing the first text to obtain the second text includes:
the first text corresponding to a first language parameter and the second text corresponding to a second language parameter.
That is, through a certain processing method such as translation, the language changes, for example from English to French.
Illustratively, the language of the first text is Chinese and the language of the second text is English, and the first audio corresponding to the first text is a Chinese audio. Through translation, the first text in Chinese is first translated into the second text in English, and an English audio is then generated according to the second text.
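The translation-based processing in this first class can be sketched with a toy phrase table standing in for a real machine-translation service; the table and function name are hypothetical, for illustration only.

```python
# Toy sketch of the first class: the first text (Chinese) is mapped to the
# second text (English), from which an English audio would then be generated.

PHRASE_TABLE = {"你好": "hello", "谢谢": "thank you"}  # illustrative only

def translate(first_text: str) -> str:
    # Fall back to the input when no translation is known.
    return PHRASE_TABLE.get(first_text, first_text)

print(translate("你好"))  # -> hello
```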
In a second class of optional embodiments, processing the first text to obtain the second text includes:
the first text and the second text corresponding to the same language parameter, with the language-skill parameter of the first text differing from that of the second text.
In practical application, the language-skill parameter is a conversion parameter characterized from simple to complex, or from complex to simple.
"Simple" or "complex" here is considered from the user's perspective of understanding; it can of course also be considered from other perspectives such as the user's memorization.
Illustratively, the language-skill parameter includes, but is not limited to, reading-aloud skills such as linking (liaison), contracted reading, or paraphrased reading, as well as spelling skills such as abbreviation, spelling, or word variation.
That is, with the language unchanged, a certain processing skill such as splitting, merging, or word variation changes the representation of the text, for example changing the spelling and thereby changing the way it is read aloud.
Illustratively, the languages of both the first text and the second text are Chinese, and the first audio corresponding to the first text is a Chinese audio. Through a word-adjustment method, the first text in Chinese is first converted into the second text in Chinese, and a Chinese audio is then generated according to the second text.
Illustratively, the languages of both the first text and the second text are English, and the first audio corresponding to the first text is an English audio. Through a spelling-unification method, the first text in English is first converted into the second text in English, and an English audio is then generated according to the second text.
In the following, the language-skill parameter being a conversion parameter characterized from simple to complex is taken as an example.
For example, "I am" in the first text is expressed as "I'm" in the second text after processing; that is, the two syllables are contracted in spelling, so that linking is needed when reading the phrase aloud. Obviously, going from "I am" to "I'm" is, from the perspective of the user's understanding, a transformation from simple to complex.
Again for example, "going to" in the first text is expressed as "gonna" in the second text after processing; it is abbreviated in spelling, so that contracted reading is needed when reading the phrase aloud. Obviously, going from "going to" to "gonna" is, from the perspective of the user's understanding, a transformation from simple to complex.
Again for example, "felid" in the first text is expressed as "carnivorous animal of the cat family" in the second text after processing; a word variation is made in the description, so that paraphrased reading is used when reading the phrase aloud. Obviously, this too is, from the perspective of the user's understanding, a transformation from simple to complex.
In the following, the language-skill parameter being a conversion parameter characterized from complex to simple is taken as an example.
For example, "sapphire blue" in the first text is expressed as "blue" in the second text after processing; that is, a word variation is made in the description, so that paraphrased reading is used when reading the phrase aloud. Taking the first text and the second text both being English texts as an example, "sapphire blue" is obviously more complex than "blue"; from the perspective of the user's understanding, this is a transformation from complex to simple.
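Both conversion directions described in this second class can be sketched as small rewrite-rule tables applied to the first text; the rules shown mirror the examples above and are illustrative, not exhaustive.

```python
# Language-skill rewriting with the language unchanged: one rule table per
# conversion direction.

SIMPLE_TO_COMPLEX = [("I am", "I'm"), ("going to", "gonna")]
COMPLEX_TO_SIMPLE = [("sapphire blue", "blue")]

def apply_rules(text, rules):
    # Apply each (source, replacement) rewrite in order.
    for src, dst in rules:
        text = text.replace(src, dst)
    return text

print(apply_rules("I am going to school", SIMPLE_TO_COMPLEX))  # -> I'm gonna school
print(apply_rules("a sapphire blue sky", COMPLEX_TO_SIMPLE))   # -> a blue sky
```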
In a third class of optional embodiments, processing the first text to obtain the second text comprises:
recognizing the first audio to obtain the first text, the first audio corresponding to a first language parameter;
correspondingly, generating the second audio according to the second text comprises:
the first text and the second text being identical; and
generating the second audio according to the second text, the second audio corresponding to a second language parameter.
That is, the text does not change, but the first language parameter corresponding to the first text differs from the second language parameter corresponding to the second text (which is in practice the first text).
In practical application, the language parameter may refer to different languages of the same country.
For example, the first language parameter is Mandarin, i.e., standard Chinese, and the second language parameter is a dialect such as Henan dialect.
Again for example, the first language parameter is the official Chinese language, i.e., standard Chinese, and the second language parameter is a Chinese minority language such as Uyghur.
In practical application, the language parameter may also refer to different languages of different countries.
For example, the first language parameter is Chinese-style Chinese and the second language parameter is British English.
That is, through a certain processing method such as direct translation based on the first text, the language changes, for example from Chinese to American English.
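In this third class the text is unchanged and only the synthesis voice differs, which can be sketched as selecting a different voice identifier for the same text. The voice names are hypothetical placeholders for a real TTS engine's voices.

```python
# Same text, different language parameter: select a voice per parameter and
# tag the output (a string stand-in for real synthesized audio).

VOICES = {"mandarin": "zh-CN-standard", "henan": "zh-CN-henan"}  # hypothetical ids

def synthesize(text: str, language_parameter: str) -> str:
    return f"[{VOICES[language_parameter]}] {text}"

print(synthesize("hello", "mandarin"))  # -> [zh-CN-standard] hello
```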
In some optional embodiments, recognizing the first audio to obtain the first text comprises:
obtaining the first text corresponding to the content of the first audio, and obtaining auxiliary information of the first audio;
and generating the second audio according to the second text comprises:
generating the second audio according to the second text using the auxiliary information.
Here, the auxiliary information includes tone information, prosody information, and the like.
Here, the first text and the second text may be different texts corresponding to the same language, or texts corresponding to different languages.
In this way, the second audio has the same tone or intonation as the first audio.
That is, the reading parameters of the first audio (the original audio) are simulated to read the second text (the updated text content) aloud.
Illustratively, the language of the first text is Chinese and the language of the second text is Korean; the first audio corresponding to the first text is a Chinese audio and the second audio corresponding to the second text is a Korean audio. The reading parameters of the first audio (the Chinese audio) are simulated to read the second text (the Korean text) aloud, so that the second audio has the same tone or intonation as the first audio.
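Reusing the auxiliary information can be sketched as carrying tone and prosody parameters measured from the first audio into the synthesis of the second audio. The two-field parameter set below is a simplifying assumption; real engines expose richer prosody controls.

```python
from dataclasses import dataclass

# Auxiliary information extracted from the first audio.
@dataclass
class AuxInfo:
    pitch: float  # tone information (e.g. base frequency in Hz)
    rate: float   # prosody: speaking rate multiplier

def synthesize_with_aux(second_text: str, aux: AuxInfo) -> str:
    # String stand-in for synthesis driven by the reused parameters.
    return f"{second_text} (pitch={aux.pitch}, rate={aux.rate})"

aux = AuxInfo(pitch=220.0, rate=1.0)  # measured from the first audio
print(synthesize_with_aux("annyeong", aux))  # -> annyeong (pitch=220.0, rate=1.0)
```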
With the information processing technical solution of the embodiments of the present application, content to be output is determined and played according to a determined play mode, where the play mode includes at least a first mode and a second mode: in the first mode, a first audio corresponding to the content to be output is output when the content is played; in the second mode, a second audio, obtained by processing the first audio, is output when the content is played. In this way, dual-mode playback is supported and the user has more choices; for the content to be output, different audios can be played according to different play modes, which helps to improve the user experience.
Embodiment two
Based on the technical solution described in Embodiment One, as shown in Fig. 2, the information processing method may further include the following steps:
Step 103: determine a target object in the content to be output, wherein the target object is an object that needs to be interpreted.
Here, the target object can be all or part of the content to be output.
In some optional embodiments, when the target object is text, determining the target object in the content to be output comprises:
buffering text information in the content to be output; and
determining a target text in the content to be output based on the text information.
In some optional embodiments, when the target object is audio, determining the target object in the content to be output comprises:
buffering audio information in the content to be output; and
determining a target audio in the content to be output based on the audio information.
Step 104: extract a target audio and/or a target text corresponding to the target object.
In some optional embodiments, extracting the target audio corresponding to the target object comprises:
after determining the target text in the content to be output, cutting the original audio based on the target text to obtain the target audio.
In this way, the target audio corresponding to the target text can be obtained, providing data support for subsequently generating the correspondence.
In some optional embodiments, extracting the target text corresponding to the target object comprises:
after determining the target audio in the content to be output, decomposing the original text based on the target audio to obtain the target text.
In this way, the target text corresponding to the target audio can be obtained, providing data support for subsequently generating the correspondence.
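Assuming the original audio carries subtitle-style timestamps, cutting the target audio from the target text reduces to selecting the segments whose caption contains the target text. The (start, end, text) layout is an illustrative assumption, not the patented format.

```python
# Cut the original audio (modeled as timestamped caption segments) to the
# segments matching a target text; the returned (start, end) pairs stand in
# for the extracted target audio.

def cut_target_audio(segments, target_text):
    return [(start, end) for start, end, text in segments if target_text in text]

segments = [(0.0, 2.0, "I am going to school"), (2.0, 4.0, "see you")]
print(cut_target_audio(segments, "going to"))  # -> [(0.0, 2.0)]
```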
Step 105: determine at least one related text that matches the target text, and/or at least one related audio that matches the target audio.
In some optional embodiments, determining the at least one related text that matches the target text comprises:
searching a local database, or searching through third-party software, for at least one related text adapted to the target text, the at least one related text also being adapted to the target audio.
Here, a related text is obtained by processing the target text with a certain processing skill.
For example, the processing skill includes splitting, merging, changing, or translating words or sentences in the text.
For example, the related text may be obtained by translating the target text, or by transforming the describing mode of words or sentences in the target text.
In some optional embodiments, determining the at least one related audio that matches the target audio comprises:
searching a local database, or searching through third-party software, for at least one related audio adapted to the target audio, the at least one related audio also being adapted to the target text.
Here, a related audio is obtained by processing the target audio with a certain processing skill.
For example, the processing skill includes linking, contracted reading, paraphrased reading, or translation of words or sentences in the text.
For example, the related audio may be obtained by translating the target audio, or by transforming the reading-aloud mode of words or sentences in the target text.
Step 106: obtain a correspondence to be output based on the target audio, the target text, and the at least one related text, and/or based on the target text, the target audio, and the at least one related audio.
In this way, through the correspondence the user can not only learn the original correspondence between the target text and the target audio, but also learn at least one related text relevant to the target text, which facilitates a deep understanding of the target text. Similarly, the user can not only learn the original correspondence between the target text and the target audio, but also learn at least one related audio relevant to the target audio, which facilitates a deep understanding of the target audio.
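The correspondence to be output can be sketched as one record grouping the target text, its target audio, and the related texts and audios gathered in step 105; the field names are illustrative assumptions.

```python
# One correspondence record per target object.

def build_correspondence(target_text, target_audio, related_texts=(), related_audios=()):
    return {
        "target_text": target_text,
        "target_audio": target_audio,
        "related_texts": list(related_texts),
        "related_audios": list(related_audios),
    }

rec = build_correspondence("going to", "audio#1", related_texts=["gonna"])
print(rec["related_texts"])  # -> ['gonna']
```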
Further, the method also includes:
obtaining an alternate audio corresponding to the target text; and
replacing the target audio with the alternate audio at the corresponding position while playing the content to be output.
For example, the target audio is an audio read aloud with the first language parameter, and the alternate audio is an audio read aloud with the second language parameter.
In this way, an audio that better matches the text of a video or audio file can be output.
Further, the method also includes:
obtaining a substitute text corresponding to the target audio; and
replacing the target text with the substitute text at the corresponding position while playing the content to be output.
For example, the target text is a text written in a first language, and the substitute text is a text written in a second language.
In this way, a text that better matches the audio of a video or audio file can be output.
Step 107: when an output condition is met, output the correspondence in a preset form.
In some alternative embodiments, outputting the correspondence in a preset form when the output condition is met comprises:
outputting the correspondence in a preset form when a trigger operation is received.
Here, the trigger operation refers to an operation input by the user that indicates the output of the correspondence, such as the user clicking a shortcut key for outputting the correspondence or the user inputting a voice instruction for outputting the correspondence.
Illustratively, a first application that supports multi-mode playback is installed on a terminal. During playback of a preset file with the first application, after playback, or even before playback, when the first application receives a user-input request indicating the output of the correspondence, for example when the user clicks a shortcut key in the first application for outputting the correspondence, the correspondence is output based on that operation. Of course, the correspondences output at different stages of playback may be identical or different.
Optionally, the preset form includes, but is not limited to, a table form or a slide form.
Here, the correspondence relates to knowledge-point summaries or analyses, obtained automatically by the system for the user to consult and learn from.
In some alternative embodiments, outputting the correspondence in a preset form when the output condition is met comprises:
automatically outputting the correspondence in a preset form when playback of the content to be output ends.
In this way, the correspondence is output automatically after the preset file finishes playing, providing the user with a better knowledge summary and making it easy to obtain a knowledge-point summary while watching a video or listening to audio.
For example, the caption information in a video is first buffered in advance, the information in the captions is then extracted automatically, and linkings, voiceless-consonant omissions, phrases, slang, and the like are extracted; the audio is cut by caption and matched with the caption images; after playback ends, the result is presented to the viewer as a summary list, and part of the summary is parsed. In this way, people learning a foreign language by watching videos are given a better knowledge summary; this scheme can also be used in teaching, building the best context for students and making learning enjoyable.
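The summary-list output at the end of playback can be sketched as rendering the collected correspondences one knowledge point per line, a plain-text stand-in for the table or slide forms mentioned above.

```python
# Render the collected correspondence records as a summary list.

def summarize(correspondences):
    return "\n".join(
        f"{c['target_text']} -> {', '.join(c['related_texts'])}"
        for c in correspondences
    )

notes = [
    {"target_text": "going to", "related_texts": ["gonna"]},
    {"target_text": "I am", "related_texts": ["I'm"]},
]
print(summarize(notes))
```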
With the information processing technical solution of the embodiments of the present application, a target object that needs to be interpreted is determined in the content to be output; a target audio and a target text corresponding to the target object are extracted; at least one related text that matches the target text and/or at least one related audio that matches the target audio is determined; a correspondence to be output is obtained based on the target audio, the target text, and the at least one related text, and/or the target text, the target audio, and the at least one related audio; and when an output condition is met, the correspondence is output in a preset form. In this way, knowledge points can be summarized for the user, so that the user can easily obtain a system-generated knowledge-point summary while watching a video or listening to audio, making learning enjoyable and helping to improve the user experience.
Embodiment three
This embodiment provides an information processing device applied to an electronic device. As shown in Fig. 3, the device comprises:
a memory 10 for storing content to be output; and
a processor 20 for playing the content to be output according to a determined play mode; wherein the play mode includes at least a first mode and a second mode; in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; and in the second mode, a second audio is output when the content to be output is played, the second audio being an audio obtained by processing the first audio.
As an alternative embodiment, the processor 20 is also used to:
recognize the first audio to obtain a first text, or obtain the first text from the content to be output;
process the first text to obtain a second text; and
generate the second audio according to the second text.
In some optional embodiments, the first text corresponds to a first language parameter and the second text corresponds to a second language parameter.
In some optional embodiments, the first text and the second text correspond to the same language parameter, and the language-skill parameter of the first text differs from that of the second text.
As an alternative embodiment, the processor 20 is also used to:
recognize the first audio to obtain the first text, the first audio corresponding to a first language parameter, the first text and the second text being identical; and
generate the second audio according to the second text, the second audio corresponding to a second language parameter.
As an optional implementation, the processor 20 is further configured to:
obtain auxiliary information of the first audio, the content of the first audio corresponding to the first text; and
generate the second audio according to the second text using the auxiliary information.
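The patent does not specify the form of the auxiliary information, so as an invented illustration one can picture it as per-segment timestamps captured while recognizing the first audio and reused during generation, so each piece of the second audio stays aligned with the original playback:

```python
# Hypothetical sketch: auxiliary information (here, segment timestamps
# captured while recognizing the first audio) is reused when generating
# the second audio so it stays aligned with the content being played.

def recognize_with_aux(segments):
    """Stand-in ASR returning text plus auxiliary timing information.
    `segments` is a list of (start_sec, end_sec, text) tuples."""
    first_text = " ".join(text for _, _, text in segments)
    aux = [(start, end) for start, end, _ in segments]
    return first_text, aux

def generate_second_audio(second_words, aux):
    """Stand-in TTS: pair each processed word with the timing window
    of the corresponding original segment."""
    return [
        {"start": start, "end": end, "speech": word}
        for (start, end), word in zip(aux, second_words)
    ]

segments = [(0.0, 1.0, "hello"), (1.0, 2.2, "world")]
first_text, aux = recognize_with_aux(segments)
second_words = first_text.upper().split()   # stand-in "processing" step
out = generate_second_audio(second_words, aux)
print(out[0])  # {'start': 0.0, 'end': 1.0, 'speech': 'HELLO'}
```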
In the above solution, optionally, the processor 20 is further configured to: determine a target object in the content to be output, where the target object is an object that needs to be interpreted; extract a target audio and a target text corresponding to the target object; determine at least one related text that matches the target text, and/or at least one related audio that matches the target audio; and obtain a correspondence to be output based on the target audio, the target text, and the at least one related text, and/or the target text, the target audio, and the at least one related audio.
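The correspondence described above can be pictured as a mapping from each interpreted target object to its target text/audio and the related material matched to it. The sketch below is purely illustrative: the keyword-overlap matcher and all names are invented for demonstration, not taken from the patent.

```python
# Hypothetical sketch of building the "correspondence to be output":
# each target object (a knowledge point needing interpretation) maps to
# its target text/audio plus related texts matched from a library.

def build_correspondence(target_objects, related_library):
    correspondence = {}
    for obj in target_objects:
        target_text = obj["target_text"]
        # Toy matcher: a library entry is "related" if it shares a keyword.
        related_texts = [
            entry for entry in related_library
            if any(word in entry for word in target_text.lower().split())
        ]
        correspondence[obj["name"]] = {
            "target_text": target_text,
            "target_audio": obj["target_audio"],
            "related_texts": related_texts,
        }
    return correspondence

targets = [{"name": "photosynthesis",
            "target_text": "Photosynthesis converts light",
            "target_audio": "clip_017.wav"}]
library = ["light reactions occur in chloroplasts", "mitosis has four phases"]
result = build_correspondence(targets, library)
print(result["photosynthesis"]["related_texts"])
# ['light reactions occur in chloroplasts']
```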
Further, the apparatus further includes:
an outputter 30, configured to output the correspondence in a preset form when an output condition is met.
As an optional implementation, the outputter 30 is further configured to:
output the correspondence in a preset form when a trigger operation is received; or
automatically output the correspondence in a preset form when playback of the content to be output ends.
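The two output conditions above (an explicit trigger operation, or the end of playback) amount to a simple check before emitting the correspondence. In this invented sketch the event names and the `preset_form` renderer are assumptions, not terms from the patent:

```python
# Hypothetical sketch of the output condition: the correspondence is
# emitted either when a trigger operation arrives or, automatically,
# when playback of the content to be output ends.

def maybe_output(event, correspondence, preset_form=str):
    """Return the correspondence rendered in a preset form if an output
    condition is met; otherwise return None."""
    if event in ("trigger_operation", "playback_ended"):
        return preset_form(correspondence)  # e.g. render as a summary card
    return None

corr = {"knowledge_point": "second audio = processed first audio"}
assert maybe_output("seek", corr) is None       # no output condition met
print(maybe_output("playback_ended", corr))     # rendered correspondence
```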
The information processing apparatus described in this embodiment can support dual-mode playback and can play different audio according to different play modes, which helps improve the user experience; it can also output the correspondence, summarizing knowledge points for the user so that the user can easily obtain the systematically summarized knowledge points while watching a video or listening to audio.
Embodiment Four
This embodiment provides an information processing apparatus applied to an electronic device. As shown in Figure 4, the apparatus includes:
a determination unit 41, configured to determine content to be output; and
a play unit 42, configured to play the content to be output according to a determined play mode;
wherein the play mode includes at least a first mode and a second mode,
in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; and
in the second mode, a second audio is output when the content to be output is played, the second audio being an audio obtained by processing the first audio.
As an optional implementation, the play unit 42 is further configured to:
recognize the first audio to obtain a first text, or obtain the first text from the content to be output;
process the first text to obtain a second text; and
generate the second audio according to the second text.
In some optional embodiments, the first text corresponds to a first language parameter, and the second text corresponds to a second language parameter.
In some optional embodiments, the first text and the second text correspond to the same language parameter, and the language skill parameter of the first text differs from the language skill parameter of the second text.
As an optional implementation, the play unit 42 is further configured to:
recognize the first audio to obtain the first text, the first audio corresponding to the first language parameter, the first text and the second text being identical; and
generate the second audio according to the second text, the second audio corresponding to the second language parameter.
As an optional implementation, the play unit 42 is further configured to:
obtain auxiliary information of the first audio, the content of the first audio corresponding to the first text; and
generate the second audio according to the second text using the auxiliary information.
Further, the apparatus further includes a control unit 43, the control unit 43 being configured to:
determine a target object in the content to be output, wherein the target object is an object that needs to be interpreted;
extract a target audio and/or a target text corresponding to the target object;
determine at least one related text that matches the target text, and/or at least one related audio that matches the target audio;
obtain a correspondence to be output based on the target audio, the target text, and the at least one related text, and/or based on the target text, the target audio, and the at least one related audio; and
output the correspondence in a preset form when an output condition is met.
As an optional implementation, the control unit 43 is further configured to:
output the correspondence in a preset form when a trigger operation is received; or
automatically output the correspondence in a preset form when playback of the content to be output ends.
It should be noted that, when the information processing apparatus provided in the above embodiment performs playback control, the division into the above program modules is used only as an example; in practical applications, the above processing may be allocated to different program modules as needed, that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the information processing apparatus provided in the above embodiment and the information processing method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, which is not repeated here.
In this embodiment, the determination unit 41, the play unit 42, and the control unit 43 in the information processing apparatus may, in practical applications, be implemented by a central processing unit (CPU, Central Processing Unit), a digital signal processor (DSP, Digital Signal Processor), a microcontroller unit (MCU, Microcontroller Unit), or a field-programmable gate array (FPGA, Field-Programmable Gate Array) in the apparatus or in the electronic device.
The information processing apparatus described in this embodiment can support dual-mode playback and can play different audio according to different play modes, which helps improve the user experience; it can also output the correspondence, summarizing knowledge points for the user so that the user can easily obtain the systematically summarized knowledge points while watching a video or listening to audio.
Embodiment Five
This embodiment provides a computer storage medium having computer instructions stored thereon, which, when executed by a processor, implement: determining content to be output; and playing the content to be output according to a determined play mode; wherein the play mode includes at least a first mode and a second mode; in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; in the second mode, a second audio is output when the content to be output is played, the second audio being obtained by processing the first audio.
Those skilled in the art can understand that the functions of the programs in the computer storage medium of this embodiment can be understood with reference to the related description of the information processing method in the foregoing embodiments, and details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed server and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of the units is only a division of logical functions; there may be other division manners in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may all be integrated into one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
Those of ordinary skill in the art can understand that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions; the foregoing program may be stored in a computer-readable storage medium, and when the program is executed, the steps of the above method embodiments are performed. The foregoing storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
Alternatively, if the above integrated unit of this application is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence, or the part contributing to the prior art, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of this application. The foregoing storage medium includes various media that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.
The above are only specific embodiments of this application, but the protection scope of this application is not limited thereto. Any person familiar with the technical field can easily think of changes or replacements within the technical scope disclosed in this application, which should all be covered within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (10)

1. An information processing method, comprising:
determining content to be output; and
playing the content to be output according to a determined play mode;
wherein the play mode comprises at least a first mode and a second mode,
in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; and
in the second mode, a second audio is output when the content to be output is played, the second audio being an audio obtained by processing the first audio.
2. The method according to claim 1, wherein, in the second mode, outputting the second audio when playing the content to be output comprises:
recognizing the first audio to obtain a first text, or obtaining the first text from the content to be output;
processing the first text to obtain a second text; and
generating the second audio according to the second text.
3. The method according to claim 2, wherein processing the first text to obtain the second text comprises:
the first text corresponding to a first language parameter and the second text corresponding to a second language parameter;
or
the first text and the second text corresponding to the same language parameter, the language skill parameter of the first text differing from the language skill parameter of the second text.
4. The method according to claim 2, wherein processing the first text to obtain the second text comprises:
recognizing the first audio to obtain the first text, the first audio corresponding to a first language parameter;
and generating the second audio according to the second text comprises:
the first text and the second text being identical; and
generating the second audio according to the second text, the second audio corresponding to a second language parameter.
5. The method according to claim 2, wherein
recognizing the first audio to obtain the first text comprises:
obtaining auxiliary information of the first audio, the content of the first audio corresponding to the first text;
and generating the second audio according to the second text comprises:
generating the second audio according to the second text using the auxiliary information.
6. The method according to claim 1, wherein the method further comprises:
determining a target object in the content to be output, wherein the target object is an object that needs to be interpreted;
extracting a target audio and/or a target text corresponding to the target object;
determining at least one related text that matches the target text, and/or at least one related audio that matches the target audio;
obtaining a correspondence to be output based on the target audio, the target text, and the at least one related text, and/or based on the target text, the target audio, and the at least one related audio; and
outputting the correspondence in a preset form when an output condition is met.
7. The method according to claim 6, wherein outputting the correspondence in a preset form when the output condition is met comprises:
outputting the correspondence in a preset form when a trigger operation is received; or
automatically outputting the correspondence in a preset form when playback of the content to be output ends.
8. An information processing apparatus, comprising:
a memory, configured to store content to be output; and
a processor, configured to play the content to be output according to a determined play mode; wherein the play mode comprises at least a first mode and a second mode; in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; in the second mode, a second audio is output when the content to be output is played, the second audio being an audio obtained by processing the first audio.
9. The apparatus according to claim 8, wherein
the processor is further configured to: determine a target object in the content to be output, wherein the target object is an object that needs to be interpreted; extract a target audio and a target text corresponding to the target object; determine at least one related text that matches the target text, and/or at least one related audio that matches the target audio; and obtain a correspondence to be output based on the target audio, the target text, and the at least one related text, and/or the target text, the target audio, and the at least one related audio;
the apparatus further comprising:
an outputter, configured to output the correspondence in a preset form when an output condition is met.
10. An information processing apparatus, comprising:
a determination unit, configured to determine content to be output; and
a play unit, configured to play the content to be output according to a determined play mode;
wherein the play mode comprises at least a first mode and a second mode,
in the first mode, a first audio corresponding to the content to be output is output when the content to be output is played; and
in the second mode, a second audio is output when the content to be output is played, the second audio being an audio obtained by processing the first audio.
CN201811162230.2A 2018-09-30 2018-09-30 A kind of information processing method and device Pending CN109325147A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811162230.2A CN109325147A (en) 2018-09-30 2018-09-30 A kind of information processing method and device


Publications (1)

Publication Number Publication Date
CN109325147A true CN109325147A (en) 2019-02-12

Family

ID=65266512


Country Status (1)

Country Link
CN (1) CN109325147A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226947A (en) * 2013-03-27 2013-07-31 广东欧珀移动通信有限公司 Mobile terminal-based audio processing method and device
CN104244081A (en) * 2014-09-26 2014-12-24 可牛网络技术(北京)有限公司 Video provision method and device
CN104252861A (en) * 2014-09-11 2014-12-31 百度在线网络技术(北京)有限公司 Video voice conversion method, video voice conversion device and server
CN106156012A (en) * 2016-06-28 2016-11-23 乐视控股(北京)有限公司 A kind of method for generating captions and device



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination