CN106302933A

CN106302933A - Call voice information processing method and terminal

Info

Publication number: CN106302933A
Application number: CN201610797138.8A
Authority: CN
Inventors: 信哲鑫
Original assignee: Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Current assignee: Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Priority date: 2016-08-31
Filing date: 2016-08-31
Publication date: 2017-01-04
Anticipated expiration: 2036-08-31
Also published as: CN106302933B

Abstract

An embodiment of the present invention discloses a call voice information processing method, comprising: detecting whether a mobile terminal is in a call state; when it is in a call state, collecting outgoing call voice and the incoming call voice corresponding to that outgoing call voice; detecting, by speech recognition, whether the outgoing call voice matches a preset voice feature and/or a first keyword; and if so, extracting target text content from the incoming call voice and storing it. An embodiment of the present invention also discloses a corresponding call voice information processing terminal. With the present invention, important information in a call can be intelligently recognized, recorded, and stored, which reduces the user's workload and thereby improves the user experience.

Description

Call voice information processing method and terminal

Technical field

The present invention relates to the field of communication technology, and in particular to a call voice information processing method and terminal.

Background

Current mobile terminals generally provide a caller ID display function: when a call is dialed or answered, the screen shows the number and contact name of the other party. This function only applies to contacts whose information is already stored in the address book. In some cases, for example when answering a call from an unfamiliar number, the user may not know who the other party is and asks during the call, and the other party replies with information such as "I am Xiao Ming". When talking with an acquaintance, the user may also encounter information that needs to be recorded, such as "the meeting is in the third-floor meeting room" or "the meeting is at eight thirty". If it is inconvenient for the user to record this at the time, details may be forgotten when the user tries to recall them after hanging up, which can delay important matters. Recording the information manually is also cumbersome and inconvenient.

Summary of the invention

An object of the present invention is to provide a call voice information processing method that intelligently recognizes, records, and stores important information during a call, thereby improving the user experience.

To achieve the above object, one aspect of the present invention discloses a call voice information processing method, comprising the following steps:

detecting whether a mobile terminal is in a call state;

when it is in a call state, collecting outgoing call voice and the incoming call voice corresponding to the outgoing call voice, where the outgoing call voice is call voice information input through a microphone, and the incoming call voice is call voice information that the mobile terminal receives from a remote communication device over a communication link in response to the outgoing call voice;

detecting, by speech recognition, whether the outgoing call voice matches a preset voice feature and/or a first keyword;

if so, extracting target text content from the incoming call voice and storing it.

Optionally, detecting by speech recognition whether the outgoing call voice matches a preset voice feature and/or a first keyword includes:

detecting by speech recognition whether the outgoing call voice matches the preset voice feature, and/or, after converting the outgoing call voice into text content by speech recognition, detecting whether the converted text content matches the preset first keyword.

Optionally, extracting target text content from the incoming call voice includes:

according to an effective information length set by the user, converting the incoming call voice within the range of the effective information length into the target text content by speech recognition, where the effective information length is a time length or a number of sentences.

Converting the incoming call voice into raw text content by speech recognition, and extracting the part of the raw text content that matches a preset second keyword as the target text content.

Optionally, corresponding contact information is generated from the target text content and added to the address book.

A further object of the present invention is to provide a call voice information processing terminal that intelligently recognizes, records, and stores important information during a call, thereby improving the user experience.

To achieve the above object, another aspect of the present invention provides a mobile terminal, comprising:

a call state detection module, configured to detect whether the mobile terminal is in a call state;

a voice acquisition module, configured to collect the outgoing call voice of the mobile terminal and the incoming call voice corresponding to the outgoing call voice, where the outgoing call voice is call voice information input through a microphone, and the incoming call voice is call voice information that the mobile terminal receives from a remote communication device over a communication link in response to the outgoing call voice;

an information detection module, configured to detect whether the outgoing call voice matches a preset voice feature and/or a first keyword;

an information storage module, configured to extract target text content from the incoming call voice and store it.

Optionally, the information detection module is configured to detect by speech recognition whether the outgoing call voice matches the preset voice feature, and/or, after converting the outgoing call voice into text content by speech recognition, to detect whether the converted text content matches the preset first keyword.

Optionally, the information storage module is further configured to convert, according to an effective information length set by the user, the incoming call voice within the range of the effective information length into the target text content by speech recognition, where the effective information length is a time length or a number of sentences.

Optionally, the information storage module is further configured to convert the incoming call voice into raw text content by speech recognition and to extract the part of the raw text content that matches a preset second keyword as the target text content.

Optionally, the information storage module is further configured to generate corresponding contact information from the target text content and add the contact information to the address book.

Implementing the embodiments of the present invention has the following beneficial effects:

During a call, the mobile terminal collects the outgoing call voice and the corresponding incoming call voice, uses speech recognition to match the outgoing call voice against preset keywords, and extracts and saves the required information from the incoming call voice. Because the terminal automatically records and saves the key information in the user's call, information editing depends less on the user, so the above method and device make operation more convenient. In addition, intelligently recording important information during the call avoids omissions caused by the user being unable to take notes at the time or having difficulty recalling details after hanging up, which reduces the user's workload and improves the user experience.

Brief description of the drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of a call voice information processing method provided by an embodiment of the present invention;

Fig. 2 is a structural diagram of a call voice information processing terminal provided by an embodiment of the present invention;

Fig. 3 is a hardware architecture diagram of a computer system that runs the above call voice information processing method, provided by an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

To automatically recognize call voice and intelligently record important information, reduce the dependence of information editing on the user, and improve the user experience, the present invention proposes a call voice information processing method and terminal. The method is carried out by a computer program that can run on a computer system with a von Neumann architecture. The computer program can be integrated into the address book application or run as a standalone utility application. The computer system can be a terminal device such as a mobile phone or tablet computer.

To help those skilled in the art better understand the solution of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments.

Referring to Fig. 1, the workflow of the method embodiment of call voice information processing of the present invention comprises the following steps:

Step S102: detect whether the mobile terminal is in a call state.

When the user answers a call or places a call to another user, the user's phone enters the call state. In this embodiment, the mobile terminal can determine that it has entered the call state by detecting the user's operation of answering or placing a call, and determines that it has exited the call state when it detects that the user or the other party hangs up.
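The state tracking described in step S102 can be sketched as a small state machine. This is only an illustration: the event names (`user_answered`, `user_dialed`, `user_hung_up`, `remote_hung_up`) are hypothetical labels, and a real terminal would map its own telephony callbacks onto them.

```python
from enum import Enum


class CallState(Enum):
    IDLE = "idle"
    IN_CALL = "in_call"


# Hypothetical event names, assumed for illustration only.
ENTER_EVENTS = {"user_answered", "user_dialed"}
EXIT_EVENTS = {"user_hung_up", "remote_hung_up"}


def next_state(state: CallState, event: str) -> CallState:
    """Return the call state after one telephony event (step S102)."""
    if event in ENTER_EVENTS:
        return CallState.IN_CALL
    if event in EXIT_EVENTS:
        return CallState.IDLE
    return state  # unrelated events leave the state unchanged


state = CallState.IDLE
for event in ["user_dialed", "screen_on", "remote_hung_up"]:
    state = next_state(state, event)
    print(event, "->", state.value)
```

Voice collection (step S104) would only run while the state is `IN_CALL`.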

Step S104: collect the outgoing call voice and the incoming call voice corresponding to the outgoing call voice.

The outgoing call voice is the call voice information that the holder of the mobile terminal inputs through the microphone.

The incoming call voice is the call voice information that the mobile terminal receives during the call from a remote communication device over a communication link, in response to the outgoing call voice.

During a call, users generally converse in a dialogue. Therefore, when User A talks with User B, the incoming call voice corresponding to an outgoing call voice of User A is the voice of User B that the mobile terminal receives from the remote communication device after it detects one outgoing call voice input by User A through the microphone.

For example, in one scenario the outgoing call voice and incoming call voice take a question-and-answer form. User A talks with User B, and part of the dialogue is as follows:

A asks: "Who are you?"

B answers: "I am Xiao Fang."

A asks: "Could you tell me where your address is?"

B answers: "I am at No. 15 Garden Road."

In such a call, one question from A and one answer from B can be treated as one voice information group. Assuming A is the holder of the mobile terminal, then for that terminal what A says is the outgoing call voice, i.e. "Who are you?", and the incoming call voice corresponding to the outgoing call voice "Who are you?" is "I am Xiao Fang." Likewise, the incoming call voice corresponding to the outgoing call voice "Could you tell me where your address is?" is "I am at No. 15 Garden Road."
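The grouping of one question with its answer can be sketched as follows. The `VoiceGroup` type, the `pair_turns` helper, and the `"out"`/`"in"` direction labels are assumptions made for this illustration; the sketch works on already-transcribed turns rather than on audio.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class VoiceGroup:
    outgoing: str  # what the terminal holder said into the microphone
    incoming: str  # the remote party's reply to that utterance


def pair_turns(turns: Iterable[Tuple[str, str]]) -> List[VoiceGroup]:
    """Pair each outgoing utterance with the next incoming one.

    `turns` is a chronological sequence of (direction, text) tuples,
    where direction is "out" or "in" (hypothetical labels).
    """
    groups: List[VoiceGroup] = []
    pending: Optional[str] = None
    for direction, text in turns:
        if direction == "out":
            pending = text  # a newer question replaces an unanswered one
        elif direction == "in" and pending is not None:
            groups.append(VoiceGroup(pending, text))
            pending = None
    return groups


dialogue = [
    ("out", "Who are you?"),
    ("in", "I am Xiao Fang."),
    ("out", "Could you tell me where your address is?"),
    ("in", "I am at No. 15 Garden Road."),
]
for group in pair_turns(dialogue):
    print(group)
```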

Step S106: detect, by speech recognition, whether the outgoing call voice matches a preset voice feature and/or a first keyword. If it matches, perform step S108: extract target text content from the incoming call voice and store it. If it does not match, return to step S104 and continue collecting the next outgoing call voice.
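The control flow of steps S104 to S108 can be summarized in a short loop. This is a sketch under stated assumptions: `process_call` is a hypothetical helper, the voice groups are assumed to be already transcribed, and the `matches` and `extract` callables stand in for the matching and extraction variants of the embodiments.

```python
from typing import Callable, Iterable, List, Tuple


def process_call(
    groups: Iterable[Tuple[str, str]],
    matches: Callable[[str], bool],
    extract: Callable[[str], str],
) -> List[str]:
    """Run steps S104 to S108 over (outgoing, incoming) text pairs."""
    stored: List[str] = []
    for outgoing, incoming in groups:         # S104: next voice group
        if matches(outgoing):                 # S106: feature/keyword check
            stored.append(extract(incoming))  # S108: extract and store
        # no match: continue with the next outgoing utterance
    return stored


groups = [
    ("Nice weather today.", "Yes, it is."),
    ("Who are you?", "I am Xiao Fang."),
]
print(process_call(groups, lambda s: "who are you" in s.lower(), str.strip))
```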

Speech recognition takes speech as its object of study and, through speech signal processing and pattern recognition, lets machines automatically recognize and understand spoken human language. Speech recognition technology converts a speech signal into the corresponding text or command through a process of recognition and understanding. It is a broadly interdisciplinary field, closely related to acoustics, phonetics, linguistics, information theory, pattern recognition theory, and neurobiology, and it is gradually becoming a key technology in computer information processing.

Unknown speech is converted into an electrical signal by a microphone and fed to the input of the recognition system, where it is first preprocessed. A speech model is then built according to the characteristics of human speech, the input speech signal is analyzed, the required features are extracted, and on that basis the templates needed for speech recognition are built. During recognition, the computer compares the stored speech templates with the features of the input speech signal according to the speech recognition model and, following a certain search and matching strategy, finds a series of optimal templates that match the input speech. The recognition result is then obtained by table lookup according to the definitions of those templates.

In this embodiment, speech recognition of the outgoing call voice is not limited to Mandarin; multiple dialects and languages can also be recognized. To this end, different language types can be added to the speech templates of the mobile terminal, such as Mandarin, Cantonese, and Minnan, as well as languages of other countries such as English and German, and speech recognition is achieved by matching against the lexicon.

After the recognition result is obtained, the mobile terminal can match the recognized voice information directly against the preset voice features. In this embodiment, there are several ways to determine that the recognized voice information matches a preset voice feature. For example, the two can be judged to match when the voice information is identical to a preset voice feature: for A's outgoing call voice "Who are you?" in the dialogue above, if the preset voice features include "Who are you?", then when the mobile terminal retrieves this voice feature from the template library, the recognized voice information is considered to match the preset voice feature. Matching can also be judged on key voice fragments: for A's outgoing call voice "Could you tell me where your address is?", if the preset voice features include fragments such as "your address" or "where" as matching templates, then the mobile terminal can also judge that the outgoing call voice matches the preset voice features when it recognizes these fragments in it.
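The two matching modes, whole-utterance and key-fragment, can be illustrated with a simplified sketch. It is an assumption of this example that the voice features are represented by their transcripts; the patent describes matching against stored speech templates, which this text comparison only approximates. The `normalize` and `matches_feature` helpers are hypothetical.

```python
from typing import Iterable


def normalize(utterance: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in utterance.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def matches_feature(
    recognized: str, templates: Iterable[str], allow_fragment: bool = True
) -> bool:
    """Return True if the utterance equals a template or, when
    allow_fragment is set, contains a template as a fragment."""
    text = normalize(recognized)
    for template in templates:
        t = normalize(template)
        if not t:
            continue  # ignore empty templates
        if text == t:
            return True
        if allow_fragment and t in text:
            return True
    return False


print(matches_feature("Who are you?", ["who are you"], allow_fragment=False))
print(matches_feature("Could you tell me where your address is?", ["your address"]))
```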

In another embodiment, after obtaining the recognition result, the mobile terminal can also convert it into text content and then match it against the preset first keyword. Taking the dialogue above as an example again, the mobile terminal held by User A collects A's outgoing call voice "Could you tell me where your address is?", converts it into text, and saves it.

In this embodiment, several methods can be used to judge whether the converted text content matches the preset first keyword. For example, the two can be judged to match when the converted text content is identical to the preset first keyword; for instance, the word "address" matches the outgoing call voice above. The two can also be judged to match when the converted text content and the preset first keyword are synonyms or near-synonyms: semantically similar expressions such as "where do you live", "place", and "location" can also be considered successful matches. In this approach, a synonym or near-synonym library can be set up in advance and used to expand the preset first keywords.
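A minimal sketch of keyword matching with synonym expansion follows. The `SYNONYMS` table and both helper functions are hypothetical; the substring test is a simplification that can produce false positives (for example "place" inside "replace"), so a real implementation would likely tokenize the text first.

```python
from typing import Dict, Iterable, Set

# Hypothetical synonym library, assumed for illustration only.
SYNONYMS: Dict[str, Set[str]] = {
    "address": {"location", "place", "where do you live"},
}


def expand_keywords(keywords: Iterable[str], synonyms: Dict[str, Set[str]]) -> Set[str]:
    """Return the keywords plus all of their listed synonyms, lowercased."""
    expanded: Set[str] = set()
    for keyword in keywords:
        key = keyword.lower()
        expanded.add(key)
        expanded.update(s.lower() for s in synonyms.get(key, ()))
    return expanded


def matches_first_keyword(
    text: str, keywords: Iterable[str], synonyms: Dict[str, Set[str]] = SYNONYMS
) -> bool:
    """True if the text contains any keyword or any synonym of one."""
    lowered = text.lower()
    return any(term in lowered for term in expand_keywords(keywords, synonyms))


print(matches_first_keyword("Could you tell me where your address is?", ["address"]))
print(matches_first_keyword("What is your location?", ["address"]))
```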

In another embodiment, matching can also be judged by calculating the similarity between the converted text content and the preset first keyword and comparing the similarity with a threshold. For example, the similarity can be calculated as the proportion of characters that the converted text content shares with the preset first keyword: when 7 of the 10 characters of the text content are identical to characters of the preset first keyword, the similarity is 70%; if the threshold is set to 50%, the two are judged to match.
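The text does not fix an exact formula, so the sketch below takes one possible reading: the similarity is the number of characters of the text that also occur in the keyword (counted as a multiset), divided by the length of the text. The function names and the sample strings are assumptions for illustration.

```python
from collections import Counter


def char_similarity(text: str, keyword: str) -> float:
    """Fraction of the characters in `text` that also occur in `keyword`.

    Characters are counted as a multiset, so a repeated character only
    counts as shared as many times as it appears in both strings.
    """
    if not text:
        return 0.0
    shared = sum((Counter(text) & Counter(keyword)).values())
    return shared / len(text)


def is_match(text: str, keyword: str, threshold: float = 0.5) -> bool:
    """Judge a match by comparing the similarity with a threshold."""
    return char_similarity(text, keyword) >= threshold


# 7 of the 10 characters of the text occur in the keyword.
print(char_similarity("abcdefgxyz", "abcdefg"))
print(is_match("abcdefgxyz", "abcdefg", threshold=0.5))
```

With a 10-character text sharing 7 characters with the keyword, the similarity is 7/10, which exceeds a 50% threshold.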

When the outgoing call voice is detected to match the preset first keyword, the terminal starts to recognize the content of the incoming call voice. Likewise, the incoming call voice can be recognized by the speech recognition algorithm described above to obtain the target text content.

Furthermore, when the incoming call voice is long, only part of it may contain useful information while most of it is redundant. For example, by common calling habit, most people mention their name, company name, and similar information in the first few sentences of a call. Converting and processing all of the incoming call voice would increase the terminal's workload and take up more storage space. In this embodiment, to avoid consuming storage resources because the voice information is too long, an effective information threshold can be set to truncate the incoming call voice.

The effective information threshold can be a time length. The time length can be a fixed time T: after the incoming call voice is collected, only the portion within the fixed time T is intercepted for speech recognition, for example the first 15 seconds of the incoming call voice. Take the following passage: "Hello, I am an engineer at Dachuan Technology, my name is Li Lei. There are still some points in the proposal you submitted earlier that may need discussion. When would be convenient for you?" Assuming the useful information "Dachuan Technology" and "Li Lei" falls within the first 10 seconds, the terminal can be set to intercept the first 10 seconds of the incoming call voice for speech recognition. Alternatively, a time length equal to a certain percentage of the total duration of the incoming call voice can be intercepted for speech recognition: assuming the passage above lasts 100 seconds in total and the percentage is set to 15%, 15 seconds of the incoming call voice are intercepted for processing.
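Both time-based variants can be sketched on a buffer of audio samples. The `clip_head` helper, its parameter names, and the assumption that the kept portion is always the start of the recording are illustrative choices, not details given in the text.

```python
from typing import List, Optional, Sequence


def clip_head(
    samples: Sequence[int],
    sample_rate: int,
    fixed_seconds: Optional[float] = None,
    fraction: Optional[float] = None,
) -> List[int]:
    """Keep only the start of the incoming audio before recognition.

    Pass either a fixed time T in seconds or a fraction of the total
    duration; with neither, the whole recording is kept.
    """
    total_seconds = len(samples) / sample_rate
    if fixed_seconds is not None:
        keep_seconds = min(fixed_seconds, total_seconds)
    elif fraction is not None:
        keep_seconds = total_seconds * fraction
    else:
        keep_seconds = total_seconds
    return list(samples[: round(keep_seconds * sample_rate)])


# A 100-second recording at a toy rate of 10 samples per second.
audio = list(range(1000))
print(len(clip_head(audio, 10, fixed_seconds=10)) / 10, "seconds kept")
print(len(clip_head(audio, 10, fraction=0.15)) / 10, "seconds kept")
```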

Correspondingly, the effective information threshold can also be a number of sentences. When the incoming call voice is collected, a pause threshold t for periods without voice input is set. When the pause between two utterances is longer than t, the incoming call voice before and after the pause is considered to belong to different sentences, and the number of sentences in the collected incoming call voice is counted accordingly. Taking the passage above as an example and assuming a pause threshold of 0.5 seconds, the sentences are: first, "Hello"; second, "I am an engineer at Dachuan Technology"; third, "my name is Li Lei"; fourth, "There are still some points in the proposal you submitted earlier that may need discussion"; fifth, "When would be convenient for you?" The incoming call voice therefore contains five sentences.
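Pause-based sentence splitting can be sketched as follows, assuming the recognizer supplies word-level timestamps as `(text, start, end)` tuples. That input format and the `split_sentences` helper are assumptions of this example.

```python
from typing import Iterable, List, Optional, Tuple

Word = Tuple[str, float, float]  # (text, start_seconds, end_seconds)


def split_sentences(words: Iterable[Word], pause_threshold: float = 0.5) -> List[str]:
    """Start a new sentence whenever the silence between two consecutive
    words is longer than pause_threshold seconds."""
    sentences: List[List[str]] = []
    current: List[str] = []
    previous_end: Optional[float] = None
    for text, start, end in words:
        if current and previous_end is not None and start - previous_end > pause_threshold:
            sentences.append(current)
            current = []
        current.append(text)
        previous_end = end
    if current:
        sentences.append(current)
    return [" ".join(sentence) for sentence in sentences]


timed_words = [
    ("Hello", 0.0, 0.4),
    ("I", 1.2, 1.3), ("am", 1.4, 1.5), ("Li", 1.6, 1.7), ("Lei", 1.8, 2.0),
    ("When", 3.0, 3.3), ("is", 3.4, 3.5), ("convenient", 3.6, 4.1),
]
sentences = split_sentences(timed_words)
print(len(sentences), sentences)
```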

After the number of sentences in the incoming call voice is obtained, a fixed sentence count N can be set to truncate it, i.e. only the first N sentences of the incoming call voice are intercepted for processing. With N = 3, for the passage above the three intercepted sentences are "Hello, I am an engineer at Dachuan Technology, my name is Li Lei", and speech recognition is then performed on these three sentences. Alternatively, the incoming call voice can be intercepted according to a set percentage of its total number of sentences. In the same example the total is five sentences; setting the percentage to 60% means that speech recognition is performed only on the first three, which greatly reduces the workload.
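Sentence-count truncation can be sketched in a few lines. The `head_sentences` helper is hypothetical; the percentage is taken as an integer and rounded up, which is an assumption, since the text does not say how fractional sentence counts are handled.

```python
from typing import List, Optional, Sequence


def head_sentences(
    sentences: Sequence[str],
    count: Optional[int] = None,
    percent: Optional[int] = None,
) -> List[str]:
    """Keep the first N sentences, or the first `percent` percent of
    them (rounded up with integer arithmetic)."""
    total = len(sentences)
    if count is not None:
        keep = min(count, total)
    elif percent is not None:
        keep = min(total, (total * percent + 99) // 100)
    else:
        keep = total
    return list(sentences[:keep])


passage = [
    "Hello",
    "I am an engineer at Dachuan Technology",
    "my name is Li Lei",
    "There are still some points in the proposal that may need discussion",
    "When would be convenient for you?",
]
print(head_sentences(passage, count=3))
print(head_sentences(passage, percent=60))
```

For the five-sentence passage, both N = 3 and 60% select the same first three sentences.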

Furthermore, in another embodiment, redundant data in the target text content obtained by speech recognition of the incoming call voice can also be deleted by matching against a preset second keyword. Important information may also appear in the middle of a piece of voice information. Take this passage: "The weather is nice today, let's have a meal together this afternoon. Also, a client is visiting the company tomorrow, so let's first hold a meeting at nine in the morning in the third-floor meeting room to discuss the arrangements." It contains five sentences, and the important information "tomorrow", "nine in the morning", "third-floor meeting room", and "hold a meeting" is in the last two. In this case, the whole passage can first be recognized and saved, and keyword matching is then performed to identify and save "tomorrow", "nine in the morning", "third-floor meeting room", and "hold a meeting", which avoids omitting information.
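Extraction by second keyword can be sketched as a lookup of preset phrases in the recognized raw text. The `extract_by_keywords` helper and the keyword list are assumptions; a real keyword library would presumably cover classes of expressions such as times and places rather than literal phrases.

```python
from typing import Iterable, List


def extract_by_keywords(raw_text: str, second_keywords: Iterable[str]) -> List[str]:
    """Return the preset keywords found in the raw text, ordered by
    where they first appear."""
    lowered = raw_text.lower()
    hits = []
    for keyword in second_keywords:
        position = lowered.find(keyword.lower())
        if position >= 0:
            hits.append((position, keyword))
    return [keyword for _, keyword in sorted(hits)]


raw = (
    "The weather is nice today, let's have a meal together this afternoon. "
    "Also, a client is visiting the company tomorrow, so let's first hold a "
    "meeting at nine in the morning in the third-floor meeting room."
)
keywords = ["third-floor meeting room", "tomorrow", "nine in the morning", "next week"]
print(extract_by_keywords(raw, keywords))
```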

The mobile terminal can automatically process and save the target text content obtained by speech recognition. For example, the whole sentence "My address is No. 15 Garden Road" can be saved. Alternatively, by collecting statistics on the user's usual editing habits, some useless words can be deleted intelligently before saving. In the same example, the useful information is "No. 15 Garden Road": after the second keyword "my address" is recognized, the useless words "My address is" can be deleted so that only "No. 15 Garden Road" is saved.
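Removing a recognized lead-in phrase before saving can be sketched as below. The `strip_lead_in` helper and the list of lead-in phrases are hypothetical; learning such phrases from the user's editing habits, as the text suggests, is outside the scope of this sketch.

```python
from typing import Iterable


def strip_lead_in(sentence: str, lead_ins: Iterable[str]) -> str:
    """Remove the longest matching lead-in phrase from the start of the
    sentence, ignoring case; return the sentence unchanged otherwise."""
    stripped = sentence.strip()
    for lead in sorted(lead_ins, key=len, reverse=True):
        if lead and stripped.lower().startswith(lead.lower()):
            return stripped[len(lead):].lstrip(" ,:")
    return stripped


print(strip_lead_in("My address is No. 15 Garden Road", ["my address is", "i am"]))
```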

In addition, the mobile terminal can provide an interface for the user to edit the target text content, so that the content is saved after the user has edited it. For example, for the passage "Hello, I am an engineer at Dachuan Technology, my name is Li Lei. There are still some points in the proposal you submitted earlier that may need discussion. When would be convenient for you?", the target text content that the terminal extracts automatically may be "an engineer at Dachuan Technology" and "my name is Li Lei". The user can then delete irrelevant parts such as "an engineer at" and "my name is", and save "Dachuan Technology" and "Li Lei" as the target text content.

In this embodiment, the target text content can be stored in a memo. For example, important information obtained by speech recognition, such as "nine in the morning" and "third-floor meeting room", can be saved to a memo, and an alarm reminder can be set so that important matters are not missed.

In another embodiment, contact information can be generated from the target text content and added to the address book. For example, for "I am Xiao Fang" and "I am at No. 15 Garden Road", the name "Xiao Fang" and the address "No. 15 Garden Road" can be stored in the address book as remarks on the contact for later reference.

In addition, if the other party is a contact that already exists in the local address book, the contact information can be added to that contact's card.

If the other party is an unfamiliar number, the obtained name can be saved together with the caller's number when it is added to the address book, creating a new contact card.
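The two address book cases, an existing contact and an unfamiliar number, can be sketched with a dictionary keyed by phone number. The data layout, the `save_contact_info` helper, and the sample numbers are assumptions for illustration; a real terminal would use its platform's contacts API.

```python
from typing import Dict, Iterable, Optional

AddressBook = Dict[str, dict]  # phone number -> contact card


def save_contact_info(
    book: AddressBook,
    number: str,
    name: Optional[str] = None,
    remarks: Iterable[str] = (),
) -> dict:
    """Add extracted information to an existing card, or create a new
    card for an unfamiliar number."""
    card = book.get(number)
    if card is None:
        # Unfamiliar number: create a card from the number and the name.
        card = {"name": name or number, "remarks": []}
        book[number] = card
    elif name and not card.get("name"):
        card["name"] = name
    notes = card.setdefault("remarks", [])
    for remark in remarks:
        if remark not in notes:
            notes.append(remark)
    return card


book: AddressBook = {"13800000000": {"name": "Li Lei", "remarks": []}}
save_contact_info(book, "13800000000", remarks=["Dachuan Technology"])
save_contact_info(book, "13900000000", name="Xiao Fang", remarks=["No. 15 Garden Road"])
print(book)
```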

In one embodiment, the execution of the present invention is described below with a specific application scenario. In this scenario, the mobile terminal held by Ms. Alan initiates a call request to the mobile terminal held by Mr. Carmichael. After the call is connected, Ms. Alan's mobile terminal detects that it has entered the call state and detects the following outgoing and incoming call voice:

Ms. Alan: "Hello, Mr. Carmichael, this is Alan. I would like to schedule an interview with you. When are you free?"

Mr. Carmichael: "Hello, Ms. Alan. I may not have time to see you this week. How about 11:30 next Tuesday morning?"

In this scenario, Ms. Alan's mobile terminal detects that words in the outgoing call voice such as "time", "when", and "free" match the first keyword (looked up in a keyword library containing time-related words). The terminal then processes the corresponding incoming call voice, i.e. performs speech recognition on Mr. Carmichael's reply, and saves the detected "Ms. Alan", "week", "time", "11:30 next Tuesday morning", and so on to the memo. After the call ends, these words are in Ms. Alan's memo. After tidying up, useless items such as "Ms. Alan", "week", and "time" are deleted and only "11:30 next Tuesday morning" is kept; Ms. Alan can also manually add "interview time" as a remark.

To achieve the above object, another aspect of the present invention provides a mobile terminal. As shown in Fig. 2, the call voice information processing terminal includes:

A call state detection module 102, configured to detect whether the mobile terminal has entered a call state.

A voice acquisition module 104, configured to collect the outgoing call voice of the mobile terminal and the incoming call voice corresponding to the outgoing call voice, where the outgoing call voice is call voice information input through a microphone, and the incoming call voice is call voice information that the mobile terminal receives from a remote communication device over a communication link in response to the outgoing call voice.

An information detection module 106, configured to detect whether the outgoing call voice matches a preset voice feature and/or a first keyword.

An information storage module 108, configured to extract target text content from the incoming call voice and store it.

Optionally, the information detection module 106 is configured to detect by speech recognition whether the outgoing call voice matches the preset voice feature, and/or, after converting the outgoing call voice into text content by speech recognition, to detect whether the converted text content matches the preset first keyword.

Optionally, the information storage module 108 is further configured to convert, according to an effective information length set by the user, the incoming call voice within the range of the effective information length into the target text content by speech recognition, where the effective information length is a time length or a number of sentences.

Optionally, the information storage module 108 is further configured to convert the incoming call voice into raw text content by speech recognition and to extract the part of the raw text content that matches a preset second keyword as the target text content.

Optionally, the information storage module 108 is further configured to generate corresponding contact information from the target text content and add the contact information to the address book.

By implementing the present invention, during a call the mobile terminal collects the outgoing call voice and the corresponding incoming call voice, uses speech recognition to match the outgoing call voice against preset keywords, and extracts and saves the required information from the incoming call voice. Because the terminal automatically records and saves the key information in the user's call, information editing depends less on the user, the user's operating procedure is simplified, and operation becomes more convenient. In addition, intelligently recording important information during the call avoids omissions caused by the user being unable to take notes at the time or having difficulty recalling details after hanging up, which reduces the user's workload and improves the user experience.

In one embodiment, as shown in Fig. 3, Fig. 3 illustrates a terminal 10 of a computer system based on the von Neumann architecture that runs the above voice information processing method. The computer system may be a terminal device such as a smartphone or a tablet computer. Specifically, it may include an external input interface 1001, a processor 1002, a memory 1003, and an output interface 1004 connected by a system bus. The external input interface 1001 may optionally include at least a network interface 10012. The memory 1003 may include an external memory 10032 (such as a hard disk, optical disc, or floppy disk) and an internal memory 10034. The output interface 1004 may include at least a device such as a display screen 10042. The processor 1002 is further configured to perform the above voice information processing method, including:

Detecting whether the mobile terminal is in a call state;

When the mobile terminal is in the call state, collecting an outgoing call voice and an incoming call voice corresponding to the outgoing call voice, where the outgoing call voice is voice information input through the microphone, and the incoming call voice is voice information that the mobile terminal receives from a remote communication device over a communication link and that responds to the outgoing call voice;

In this embodiment, the method runs as a computer program. The program file of the computer program is stored in the external memory 10032 of the aforementioned von Neumann-based computer system 10, is loaded into the internal memory 10034 at run time, and is then compiled into machine code and passed to the processor 1002 for execution, so that a logical call state detection module 102, voice acquisition module 104, text recognition module 106, keyword search module 108, and information storage module 110 are formed in the von Neumann-based computer system 10. During execution of the above voice information processing method, all input parameters are received through the external input interface 1001 and passed to the memory 1003 for caching, then input to the processor 1002 for processing. The resulting data are either cached in the memory 1003 for subsequent processing or passed to the output interface 1004 for output.

Claims

1. A method for processing voice information, characterized in that the method includes:

Detecting whether the mobile terminal is in a call state;

When the mobile terminal is in the call state, collecting an outgoing call voice and an incoming call voice corresponding to the outgoing call voice, where the outgoing call voice is voice information input through the microphone, and the incoming call voice is voice information that the mobile terminal receives from a remote communication device over a communication link and that responds to the outgoing call voice;

Detecting, by speech recognition, whether the outgoing call voice matches a preset voice feature and/or a first keyword;

2. The method for processing voice information according to claim 1, characterized in that

The detecting, by speech recognition, whether the outgoing call voice matches a preset voice feature and/or a first keyword includes:

Detecting, by speech recognition, whether the outgoing call voice matches the preset voice feature, and/or, after converting the outgoing call voice into text content by speech recognition, detecting whether the converted text content matches the preset first keyword.

3. The method for processing voice information according to claim 1, characterized in that the extracting of target text content according to the incoming call voice includes:

Converting, by speech recognition and according to an effective information length set by the user, the incoming call voice within the effective information length range into the target text content, where the effective information length includes a time length or a number of sentences.

4. The method for processing voice information according to claim 1, characterized in that the extracting of target text content according to the incoming call voice includes:

Converting the incoming call voice into raw text content by speech recognition, and extracting the part of the raw text content that matches a preset second keyword as the target text content.

5. The method for processing voice information according to claim 1, characterized in that corresponding contact information is generated according to the target text content, and the contact information is added to the address book.

6. A terminal for processing voice information, characterized by including:

A voice acquisition module, configured to collect an outgoing call voice of the mobile terminal and an incoming call voice corresponding to the outgoing call voice, where the outgoing call voice is voice information input through the microphone, and the incoming call voice is voice information that the mobile terminal receives from a remote communication device over a communication link and that responds to the outgoing call voice;

An information detection module, configured to detect whether the outgoing call voice matches a preset voice feature and/or a first keyword;

7. The terminal for processing voice information according to claim 6, characterized in that the information detection module is configured to detect, by speech recognition, whether the outgoing call voice matches the preset voice feature, and/or, after converting the outgoing call voice into text content by speech recognition, to detect whether the converted text content matches the preset first keyword.

8. The terminal for processing voice information according to claim 6, characterized in that the information storage module is further configured to convert, by speech recognition and according to an effective information length set by the user, the incoming call voice within the effective information length range into the target text content, where the effective information length includes a time length or a number of sentences.

9. The terminal for processing voice information according to claim 6, characterized in that the information storage module is further configured to convert the incoming call voice into raw text content by speech recognition, and to extract the part of the raw text content that matches the preset second keyword as the target text content.

10. The terminal for processing voice information according to claim 6, characterized in that the information storage module is further configured to generate corresponding contact information according to the target text content and to add the contact information to the address book.