CN106356065A

CN106356065A - Mobile terminal and voice conversion method

Info

Publication number: CN106356065A
Application number: CN201610969015.8A
Authority: CN
Inventors: 陈小翔; 张腾
Original assignee: Nubia Technology Co Ltd
Current assignee: Nubia Technology Co Ltd
Priority date: 2016-10-31
Filing date: 2016-10-31
Publication date: 2017-01-25

Abstract

The invention provides a mobile terminal which comprises an analysis module and a conversion module, wherein the analysis module is used for obtaining the speaker voice and analyzing the language habits of the speaker according to the speaker voice and judging the language type of the speaker voice according to the language habits; and the conversion module is used for performing voice recognition of the speaker voice according to the judged language type and converting the speaker voice into text content. The invention also provides a voice conversion method. Through the mobile terminal and voice conversion method provided by the invention, the speaker language can be converted into corresponding text information, and complete text content corresponding to the speaker language is generated to facilitate user lookup.

Description

Mobile terminal and voice conversion method

Technical field

The present invention relates to the field of communications and, more particularly, to a mobile terminal and a voice conversion method.

Background

With the spread of social software, a growing number of users hold voice chats through it. The speech used in such chats may be in a dialect, Mandarin, or a foreign language. Besides making real-time mobile communication convenient, this voice data is also a rich source of information that can serve as research material. However, because speakers' language habits vary widely, existing technology cannot convert a speaker's voice into the corresponding text content.

Summary of the invention

The invention provides a mobile terminal that can convert a speaker's speech into corresponding text information and generate complete text content corresponding to that speech, which the user can conveniently consult. The mobile terminal includes an analysis module and a conversion module.

The analysis module is configured to obtain the speaker's voice, analyze the speaker's language habits from that voice, and judge the language type of the voice according to those language habits.

The conversion module is configured to perform speech recognition on the speaker's voice according to the judged language type and convert the voice into text content.

Further, the analysis module includes a dialect recognition module, which is configured to recognize the speaker's voice according to dialect habits and judge the dialect type of the voice; and

the conversion module is further configured to perform speech recognition according to the judged dialect type and convert the speaker's voice into text content.

Further, the analysis module also includes a foreign-language recognition module, which is configured to recognize the speaker's voice according to foreign-language habits and judge the foreign-language type of the voice; and

the conversion module is further configured to perform speech recognition according to the judged foreign-language type and convert the speaker's voice into text content.

Further, the mobile terminal also includes:

an integration module, configured to integrate all of the speaker's voice segments, in chronological order, into complete audio data, and to integrate the text content corresponding to all of those voice segments, in chronological order, into complete text material.

Further, the mobile terminal also includes:

a display module, configured to display the complete text material; and

a playback module, configured to play back the complete audio data.

The mobile terminal provided by the invention determines the language type of the speaker's voice by analyzing the speaker's language habits and then performs recognition targeted to that language type: dialect content is recognized according to the dialect type, and foreign-language content according to the foreign-language type. The recognized content is converted into text, producing complete text content that corresponds to the speaker's speech. This gives the user rich text information that is easy to consult and convenient to edit later.

The invention also provides a voice conversion method that can convert a speaker's speech into corresponding text information and generate complete text content corresponding to that speech, which the user can conveniently consult. The voice conversion method includes:

obtaining the speaker's voice, analyzing the speaker's language habits from that voice, and judging the language type of the voice according to those language habits; and

performing speech recognition on the speaker's voice according to the judged language type and converting the voice into text content.

Further, the voice conversion method also includes:

recognizing the speaker's voice according to dialect habits and judging the dialect type of the voice; and

performing speech recognition according to the judged dialect type and converting the speaker's voice into text content.

Further, the voice conversion method also includes:

recognizing the speaker's voice according to foreign-language habits and judging the foreign-language type of the voice; and

performing speech recognition according to the judged foreign-language type and converting the speaker's voice into text content.

Further, the voice conversion method also includes:

integrating all of the speaker's voice segments, in chronological order, into complete audio data; and

integrating the text content corresponding to all of the speaker's voice segments, in chronological order, into complete text material.

Further, the voice conversion method also includes:

displaying the complete text material; and

playing back the complete audio data.

The voice conversion method provided by the invention determines the language type of the speaker's voice by analyzing the speaker's language habits and then performs recognition targeted to that language type: dialect content is recognized according to the dialect type, and foreign-language content according to the foreign-language type. The recognized content is converted into text, producing complete text content that corresponds to the speaker's speech. This gives the user rich text information that is easy to consult and convenient to edit later.

Brief description of the drawings

Fig. 1 is a schematic diagram of the hardware structure of a mobile terminal implementing the embodiments of the invention;

Fig. 2 is a schematic diagram of a wireless communication system for the mobile terminal shown in Fig. 1;

Fig. 3 is a functional block diagram of the mobile terminal of embodiment one of the invention;

Fig. 4 is a functional block diagram of the mobile terminal of embodiment two of the invention;

Fig. 5 is a voice conversion diagram of the mobile terminal of embodiment three of the invention;

Fig. 6 is a flowchart of the voice conversion method of embodiment four of the invention;

Fig. 7 is a flowchart of the voice conversion method of embodiment five of the invention.

The realization of the objects, functional features, and advantages of the invention will be further described in connection with the embodiments and with reference to the drawings.

Detailed description

It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.

A mobile terminal implementing the embodiments of the invention will now be described with reference to the drawings. In the following description, suffixes such as "module", "component", or "unit" used to denote elements serve only to facilitate the description of the invention and have no specific meaning in themselves. Therefore, "module" and "component" may be used interchangeably.

A mobile terminal may be implemented in various forms. For example, the terminal described in the invention may include mobile terminals such as mobile phones, smartphones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable media players), and navigation devices, as well as fixed terminals such as digital TVs and desktop computers. In the following, the terminal is assumed to be a mobile terminal. However, those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the construction according to the embodiments of the invention can also be applied to fixed terminals.

Fig. 1 is a schematic diagram of the hardware structure of a mobile terminal implementing the embodiments of the invention.

The mobile terminal 10 may include, but is not limited to, a memory 20, a controller 30, a wireless communication unit 40, an output unit 50, an input unit 60, a camera 70, a microphone 71, an interface unit 80, and a power supply unit 90. Fig. 1 shows the mobile terminal 10 with various components, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead. The elements of the mobile terminal 10 are described in detail below.

The wireless communication unit 40 typically includes one or more components that allow radio communication between the mobile terminal 10 and a wireless communication system or network. For example, the wireless communication unit may include at least one of a broadcast receiving module, a mobile communication module, a wireless Internet module, a short-range communication module, and a location information module.

The broadcast receiving module receives broadcast signals and/or broadcast-related information from an external broadcast management server via a broadcast channel. The broadcast channel may include a satellite channel and/or a terrestrial channel. The broadcast management server may be a server that generates and transmits broadcast signals and/or broadcast-related information, or a server that receives previously generated broadcast signals and/or broadcast-related information and transmits them to the terminal. Broadcast signals may include TV broadcast signals, radio broadcast signals, data broadcast signals, and the like, and may further include broadcast signals combined with TV or radio broadcast signals. Broadcast-related information may also be provided via a mobile communication network, in which case it may be received by the mobile communication module. Broadcast signals may exist in various forms, for example in the form of an electronic program guide (EPG) of digital multimedia broadcasting (DMB) or an electronic service guide (ESG) of digital video broadcasting-handheld (DVB-H). The broadcast receiving module can receive broadcast signals using various types of broadcast systems. In particular, it can receive digital broadcasts using digital broadcast systems such as digital multimedia broadcasting-terrestrial (DMB-T), digital multimedia broadcasting-satellite (DMB-S), digital video broadcasting-handheld (DVB-H), the forward-link media (MediaFLO®) data broadcasting system, and integrated services digital broadcasting-terrestrial (ISDB-T). The broadcast receiving module may be constructed to suit various broadcast systems that provide broadcast signals, in addition to the digital broadcast systems above. Broadcast signals and/or broadcast-related information received via the broadcast receiving module may be stored in the memory 20 (or another type of storage medium).

The mobile communication module transmits radio signals to, and/or receives radio signals from, at least one of a base station (for example, an access point or Node B), an external terminal, and a server. Such radio signals may include voice call signals, video call signals, or various types of data sent and/or received as text and/or multimedia messages.

The wireless Internet module supports wireless Internet access for the mobile terminal. It may be internally or externally coupled to the terminal. The wireless Internet access technologies involved may include WLAN (wireless LAN, Wi-Fi), WiBro (wireless broadband), WiMAX (Worldwide Interoperability for Microwave Access), HSDPA (High-Speed Downlink Packet Access), and the like.

The short-range communication module supports short-range communication. Examples of short-range communication technologies include Bluetooth™, radio-frequency identification (RFID), Infrared Data Association (IrDA), ultra-wideband (UWB), ZigBee™, and the like.

The location information module is a module for checking or obtaining the location of the mobile terminal. A typical example is GPS (Global Positioning System). With current technology, a GPS module calculates distance information from three or more satellites together with accurate time information and applies triangulation to the calculated information, thereby accurately computing three-dimensional current location information in terms of longitude, latitude, and altitude. Currently, the method for calculating position and time information uses three satellites and corrects the error of the calculated position and time information using one additional satellite. In addition, the GPS module can calculate speed information by continuously computing the current location in real time.
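The last point, deriving speed from successive position fixes, can be illustrated with a short sketch. It is not taken from the patent: the `Fix` record, the `haversine_m` and `speed_mps` helpers, and the use of the haversine great-circle formula with a mean Earth radius of 6,371 km are assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumed constant


@dataclass(frozen=True)
class Fix:
    """One position fix: latitude/longitude in degrees, time in seconds."""
    lat: float
    lon: float
    t: float


def haversine_m(a: Fix, b: Fix) -> float:
    """Great-circle distance in metres between two fixes."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def speed_mps(prev: Fix, curr: Fix) -> float:
    """Average ground speed (m/s) between two consecutive fixes."""
    dt = curr.t - prev.t
    if dt <= 0:
        raise ValueError("fixes must be in increasing time order")
    return haversine_m(prev, curr) / dt
```

Calling `speed_mps` on each consecutive pair of fixes yields a running speed estimate; a real receiver would typically also smooth the result, which this sketch omits.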

The output unit 50 is configured to provide output signals (for example, audio signals, video signals, alarm signals, or vibration signals) in a visual, audible, and/or tactile manner. The output unit 50 may include a display unit 51, an audio output module 52, an alarm unit 53, and the like.

The display unit 51 can display information processed in the mobile terminal 10. For example, when the mobile terminal 10 is in a phone call mode, the display unit 51 can display a user interface (UI) or graphical user interface (GUI) related to the call or to other communication (for example, text messaging or multimedia file download). When the mobile terminal 10 is in a video call mode or an image capture mode, the display unit 51 can display captured and/or received images, or a UI or GUI showing the video or images and related functions.

Meanwhile, when the display unit 51 and a touch pad are superimposed on each other as layers to form a touch screen, the display unit 51 can serve as both an input device and an output device. The display unit 51 may include at least one of a liquid crystal display (LCD), a thin-film-transistor LCD (TFT-LCD), an organic light-emitting diode (OLED) display, a flexible display, a three-dimensional (3D) display, and the like. Some of these displays may be constructed to be transparent so that the user can see through them from outside; these are called transparent displays, a typical example being a TOLED (transparent organic light-emitting diode) display. Depending on the desired implementation, the mobile terminal 10 may include two or more display units (or other display devices); for example, it may include an external display unit (not shown) and an internal display unit (not shown). The touch screen can detect touch input pressure as well as touch input position and touch input area.

When the mobile terminal is in a call signal reception mode, call mode, recording mode, speech recognition mode, broadcast reception mode, or similar mode, the audio output module 52 can convert audio data received by the wireless communication unit 40 or stored in the memory 20 into an audio signal and output it as sound. The audio output module 52 can also provide audio output related to specific functions performed by the mobile terminal 10 (for example, a call signal reception sound or a message reception sound). The audio output module 52 may include a speaker, a buzzer, and the like.

The alarm unit 53 can provide output to notify the user that an event has occurred in the mobile terminal 10. Typical events include call reception, message reception, key signal input, touch input, and the like. In addition to audio or video output, the alarm unit 53 can provide output in other ways to signal an event. For example, the alarm unit 53 can provide output in the form of vibration: when a call, a message, or some other incoming communication is received, the alarm unit 53 can provide tactile output (that is, vibration) to notify the user. With such tactile output, the user can recognize that an event has occurred even when the phone is in the user's pocket. The alarm unit 53 can also provide event notification output via the display unit 51 or the audio output module 52.

The input unit 60 can generate key input data according to commands entered by the user in order to control the various operations of the mobile terminal. The input unit 60 allows the user to enter various types of information and may include a keypad, a dome switch, a touch pad (for example, a touch-sensitive component that detects changes in resistance, pressure, capacitance, and the like caused by contact), a jog wheel, a jog switch, and the like. In particular, when the touch pad is superimposed on the display unit 51 as a layer, a touch screen is formed. In the embodiments of the invention, the input unit 60 includes a touch screen and an e-ink screen. The camera 70 is used to capture image data, and the microphone 71 is used to record audio data.

The interface unit 80 serves as an interface through which at least one external device can connect to the mobile terminal 10. For example, external devices may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device that has an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The identification module may store various information for verifying the user's right to use the mobile terminal 10 and may include a user identity module (UIM), a subscriber identity module (SIM), a universal subscriber identity module (USIM), and the like. In addition, a device having an identification module (hereinafter "identification device") may take the form of a smart card, so the identification device can be connected to the mobile terminal 10 via a port or other connecting means. The interface unit 80 can receive input (for example, data or power) from an external device and pass that input to one or more elements within the mobile terminal 10, or it can be used to transfer data between the mobile terminal and the external device.

In addition, when the mobile terminal 10 is connected to an external cradle, the interface unit 80 can serve as a path through which power is supplied from the cradle to the mobile terminal 10, or as a path through which various command signals entered at the cradle are transferred to the mobile terminal. The command signals or power supplied from the cradle can serve as signals for recognizing whether the mobile terminal is correctly mounted on the cradle.

The memory 20 can store software programs for the processing and control operations performed by the controller 30, or can temporarily store data that has been output or is to be output (for example, a phone book, messages, still images, or video). The memory 20 can also store data about the various patterns of vibration and audio signals that are output when a touch is applied to the touch screen.

The memory 20 may include at least one type of storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and the like. The mobile terminal 10 can also cooperate with a network storage device that performs the storage function of the memory 20 over a network connection.

The controller 30 generally controls the overall operation of the mobile terminal. For example, the controller 30 performs the control and processing related to voice calls, data communication, video calls, and the like. In addition, the controller 30 may include a multimedia module for reproducing (or playing back) multimedia data; the multimedia module may be built into the controller 30 or constructed separately from it. The controller 30 can perform pattern recognition processing to recognize handwriting input or picture-drawing input performed on the touch screen as characters or images.

The power supply unit 90 receives external or internal power under the control of the controller 30 and supplies the appropriate power needed to operate each element and component.

The various embodiments described here can be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination of the two. For a hardware implementation, the embodiments described here can be implemented using at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor, or an electronic unit designed to perform the functions described here; in some cases, such embodiments can be implemented in the controller 30. For a software implementation, an embodiment such as a process or function can be implemented with a separate software module that performs at least one function or operation. The software code can be implemented by a software application (or program) written in any suitable programming language, stored in the memory 20, and executed by the controller 30.

So far, the mobile terminal has been described in terms of its functions. Below, for brevity, a slide-type mobile terminal is described as an example among the various types of mobile terminal, such as folder-type, bar-type, swing-type, and slide-type terminals. Accordingly, the invention can be applied to any type of mobile terminal and is not limited to slide-type mobile terminals.

The mobile terminal 10 shown in Fig. 1 may be constructed to operate with wired and wireless communication systems, as well as satellite-based communication systems, that transmit data via frames or packets.

A communication system in which the mobile terminal according to the invention can operate will now be described with reference to Fig. 2.

Such communication systems may use different air interfaces and/or physical layers. For example, the air interfaces used by communication systems include frequency division multiple access (FDMA), time division multiple access (TDMA), code division multiple access (CDMA), universal mobile telecommunications system (UMTS) (in particular, Long Term Evolution (LTE)), global system for mobile communications (GSM), and the like. As a non-limiting example, the description below relates to a CDMA communication system, but the teaching applies equally to other types of systems.

Referring to Fig. 2, a CDMA wireless communication system may include a plurality of mobile terminals 10, a plurality of base stations (BS) 270, base station controllers (BSC) 275, and a mobile switching center (MSC) 280. The MSC 280 is configured to interface with a public switched telephone network (PSTN) 290. The MSC 280 is also configured to interface with the BSCs 275, which may be coupled to the base stations 270 via backhaul lines. The backhaul lines may be constructed according to any of several known interfaces, including, for example, E1/T1, ATM, IP, PPP, frame relay, HDSL, ADSL, or xDSL. It will be understood that the system shown in Fig. 2 may include a plurality of BSCs 275.

Each BS 270 may serve one or more sectors (or regions), each sector being covered by an omnidirectional antenna or by an antenna pointing in a particular direction radially away from the BS 270. Alternatively, each sector may be covered by two or more antennas for diversity reception. Each BS 270 may be constructed to support multiple frequency assignments, each frequency assignment having a particular spectrum (for example, 1.25 MHz or 5 MHz).

The intersection of a sector and a frequency assignment may be referred to as a CDMA channel. The BS 270 may also be referred to as a base transceiver subsystem (BTS) or by other equivalent terms. In that case, the term "base station" may be used to refer collectively to a single BSC 275 and at least one BS 270. A base station may also be referred to as a "cell site". Alternatively, the individual sectors of a particular BS 270 may be referred to as multiple cell sites.

As shown in Fig. 2, a broadcast transmitter (BT) 295 transmits broadcast signals to the mobile terminals 10 operating within the system. The broadcast receiving module 111 shown in Fig. 1 is provided at the mobile terminal 10 to receive the broadcast signals transmitted by the BT 295. Fig. 2 also shows several global positioning system (GPS) satellites 300, which help locate at least one of the plurality of mobile terminals 10.

Although a plurality of satellites 300 are depicted in Fig. 2, it is understood that useful positioning information can be obtained with any number of satellites. The GPS module 115 shown in Fig. 1 is typically configured to cooperate with the satellites 300 to obtain the desired positioning information. Instead of, or in addition to, GPS tracking technology, other technologies capable of tracking the position of the mobile terminal may be used. In addition, at least one GPS satellite 300 may optionally or additionally handle satellite DMB transmission.

In a typical operation of the wireless communication system, the BSs 270 receive reverse-link signals from various mobile terminals 10. The mobile terminals 10 typically engage in calls, messaging, and other types of communication. Each reverse-link signal received by a particular base station 270 is processed within that BS 270, and the resulting data is forwarded to the associated BSC 275. The BSC provides call resource allocation and mobility management functions, including the coordination of soft handoff procedures between BSs 270. The BSC 275 also routes the received data to the MSC 280, which provides additional routing services for interfacing with the PSTN 290. Similarly, the PSTN 290 interfaces with the MSC 280, the MSC interfaces with the BSCs 275, and the BSCs 275 in turn control the BSs 270 to transmit forward-link signals to the mobile terminals 10.

The embodiments of the invention are proposed on the basis of the mobile terminal hardware structure and communication system described above.

Referring to Fig. 3, Fig. 3 is a functional block diagram of the mobile terminal of embodiment one of the invention. The mobile terminal 10 shown in Fig. 3 includes an analysis module 101 and a conversion module 103. Each functional module is described in detail below. The analysis module 101 obtains the speaker's voice, analyzes the speaker's language habits from that voice, and judges the language type of the voice according to those habits. The speaker's voice may be recorded through the microphone of the mobile terminal or obtained over a network. Language habits include the speaker's personal habits, dialect habits, and foreign-language habits; personal habits include the interjections and modal particles that the speaker commonly uses. The conversion module 103 performs speech recognition on the speaker's voice according to the judged language type and converts the voice into text content.

The mobile terminal provided by this embodiment determines the language type of the speaker's voice by analyzing the speaker's language habits, performs recognition targeted to that language type, and converts the recognized content into text. It can thus generate complete text content corresponding to the speaker's speech, giving the user rich text information that is easy to consult and convenient to edit later.

Referring to Fig. 4, Fig. 4 is a functional block diagram of the mobile terminal of embodiment two of the invention. The mobile terminal 10 shown in Fig. 4 includes an analysis module 101, a conversion module 103, an integration module 109, a display module 111, and a playback module 113. The analysis module 101 includes a dialect recognition module 105 and a foreign-language recognition module 107. Each functional module is described in detail below.

The analysis module 101 obtains the speaker's voice, analyzes the speaker's language habits from that voice, and judges the language type of the voice according to those habits. The speaker's voice may be recorded through the microphone of the mobile terminal or downloaded over a network. The conversion module 103 performs speech recognition on the speaker's voice according to the judged language type and converts the voice into text content.

More specifically, the speaker's voice can be of several types: a dialect, a foreign language, or Mandarin. Dialects include those spoken across China, such as Northeastern, Sichuan, Hunan, and Cantonese; foreign languages include English, French, German, Russian, and so on. After the mobile terminal 10 obtains the speaker's voice, the dialect recognition module 105 recognizes the voice according to dialect habits and judges its dialect type. The conversion module 103 then performs speech recognition according to the judged dialect type and converts the voice into text content. Likewise, after the mobile terminal 10 obtains the speaker's voice, the foreign-language recognition module 107 recognizes the voice according to foreign-language habits and judges its foreign-language type. The conversion module 103 then performs speech recognition according to the judged foreign-language type and converts the voice into text content.
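The two-stage flow described above (judge the dialect or foreign-language type, then recognize with a recognizer suited to that type) can be sketched as a dispatch table. This is only an illustration under stated assumptions: the patent does not specify how language types are detected or how recognition is performed, so `Utterance`, `judge_language_type`, `RECOGNIZERS`, and the stub recognizers are all hypothetical names, and the `hint` field merely stands in for the output of a real acoustic classifier.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Utterance:
    audio: bytes   # raw audio; opaque in this sketch
    hint: str      # hypothetical stand-in for an acoustic classifier's output


# Language types named in the text; the grouping is an assumption of this sketch.
DIALECTS = {"northeastern", "sichuan", "hunan", "cantonese"}
FOREIGN = {"english", "french", "german", "russian"}


def judge_language_type(u: Utterance) -> str:
    """Analysis step: return 'dialect:<name>', 'foreign:<name>' or 'mandarin'."""
    if u.hint in DIALECTS:
        return f"dialect:{u.hint}"
    if u.hint in FOREIGN:
        return f"foreign:{u.hint}"
    return "mandarin"


def make_stub_recognizer(label: str) -> Callable[[bytes], str]:
    """Build a placeholder recognizer; a real one would decode the audio."""
    def recognize(audio: bytes) -> str:
        return f"[{label} transcript, {len(audio)} bytes]"
    return recognize


# One recognizer per judged language type (all stubs here).
RECOGNIZERS: Dict[str, Callable[[bytes], str]] = {
    "mandarin": make_stub_recognizer("mandarin"),
    **{f"dialect:{name}": make_stub_recognizer(name) for name in DIALECTS},
    **{f"foreign:{name}": make_stub_recognizer(name) for name in FOREIGN},
}


def convert_to_text(u: Utterance) -> str:
    """Conversion step: recognize with the recognizer matching the judged type."""
    return RECOGNIZERS[judge_language_type(u)](u.audio)
```

Keeping the judgment and the recognition as separate steps mirrors the split between the analysis module and the conversion module; supporting another language type then only requires registering one more recognizer.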

When the speaker's speech consists of multiple short voice segments, the integration module 109 integrates all of the speaker's voice segments, in chronological order, into complete audio data, and integrates the text content corresponding to those segments, in chronological order, into complete text material. Once the complete audio data has been obtained, the display module 111 displays the complete text material for the user to consult. The display module 111 includes a display screen and a touch screen. The display screen may be a TFT LCD screen, a UFB LCD screen, an STN screen, or an AMOLED (active-matrix organic light-emitting diode) screen; the touch screen may be a capacitive touch screen, an infrared touch screen, a surface acoustic wave touch screen, or an MTK touch screen. The touch screen receives touch signals that control whether the complete text content is displayed. The playback module 113 plays back the complete audio data so that the user can listen to all of it without interruption. The playback module 113 also denoises the complete audio data, adjusts the fluency of the speaker's complete voice data, and sets a suitable playback volume for the current playback scenario. After receiving a control signal to play the speaker's complete voice, the playback module 113 plays the processed audio.
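The chronological integration step can be sketched as follows. The patent does not define any data formats, so the `Segment` record, its `start` timestamp, and the `integrate` function are assumptions. Audio is modeled as raw bytes that can simply be concatenated, which would hold for headerless PCM with identical parameters but not for container formats.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Segment:
    start: float   # capture time in seconds (assumed to be available)
    audio: bytes   # raw audio of one short voice segment
    text: str      # text already produced by the conversion step


def integrate(segments: Iterable[Segment]) -> Tuple[bytes, str]:
    """Merge segments in chronological order into one audio stream and one text."""
    ordered = sorted(segments, key=lambda s: s.start)
    audio = b"".join(s.audio for s in ordered)
    text = "\n".join(s.text for s in ordered)
    return audio, text
```

Because both outputs are built from the same sorted list, the text and the audio stay aligned segment by segment, which is what allows the displayed text to be followed during playback.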

By analyzing the speaker's language habits, the mobile terminal of this embodiment determines the language type of the speaker's voice and performs recognition targeted at that type: dialect content is recognized according to the dialect type, and foreign-language content according to the foreign-language type. The recognized content is converted into text, producing complete text content corresponding to the speaker's speech. This provides rich text information, makes the content easy for the user to consult, and facilitates subsequent editing.

Referring to Fig. 5, Fig. 5 is a voice conversion schematic diagram of the mobile terminal according to Embodiment 3 of the present invention. In this diagram, the analysis module 101 of the mobile terminal 10 shown on the left obtains the speakers' voices, which comprise the first voice of speaker A, the third voice of speaker A, the second voice of speaker B, and the fourth voice of speaker B. The analysis module analyzes the language habits of speakers A and B according to their voices and judges the language type of each speaker's voice according to those habits. The voices of speakers A and B may be captured through the microphone of the mobile terminal or downloaded over a network. The conversion module 103 performs voice recognition on the voices of speakers A and B according to the judged language types and converts them into text content.

To elaborate, the speaker's voice may be of several types: a dialect, a foreign language, Mandarin, and so on. Dialects include the regional dialects of China, such as Northeastern, Sichuan, and Hunan; foreign languages include English, French, German, Russian, and other languages. After the mobile terminal 10 obtains the speakers' voices, the dialect recognition module 105 identifies the voices of speakers A and B according to dialect habits and judges which dialect each is speaking. The conversion module 103 then performs voice recognition according to the judged dialect and converts the voices of speakers A and B into text content. Likewise, after the mobile terminal 10 obtains the voices of speakers A and B, the foreign language recognition module 107 identifies the voices according to foreign-language habits and judges which foreign language each is speaking. The conversion module 103 then performs voice recognition according to the judged foreign language and converts the voices of speakers A and B into text content.

The integration module 109 integrates the first voice and the third voice of speaker A in chronological order into a complete voice record, and integrates the text content corresponding to all of speaker A's voices in chronological order into a complete text record. This text record includes the first text content and the third text content, where the first text content corresponds to the first voice and the third text content corresponds to the third voice. Similarly, the integration module 109 integrates the second voice and the fourth voice of speaker B in chronological order into a complete voice record, and integrates the text content corresponding to all of speaker B's voices in chronological order into a complete text record. This text record includes the second text content and the fourth text content, where the second text content corresponds to the second voice and the fourth text content corresponds to the fourth voice. The result is the content shown on the right of Fig. 5.
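The per-speaker integration for speakers A and B can be sketched as follows. The tuple layout `(speaker, sequence number, text content)` and the function name `text_records_by_speaker` are hypothetical and chosen only for this illustration; the patent does not specify a data structure.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (speaker, sequence number, text content); a tuple layout assumed for this sketch
Message = Tuple[str, int, str]


def text_records_by_speaker(messages: List[Message]) -> Dict[str, List[str]]:
    """Group text content by speaker, keeping each speaker's messages in time order."""
    records: Dict[str, List[str]] = defaultdict(list)
    for speaker, _, text in sorted(messages, key=lambda m: m[1]):
        records[speaker].append(text)
    return dict(records)


# The four voices of the embodiment, deliberately listed out of order.
conversation: List[Message] = [
    ("B", 4, "fourth text content"),
    ("A", 1, "first text content"),
    ("B", 2, "second text content"),
    ("A", 3, "third text content"),
]
```

Calling `text_records_by_speaker(conversation)` yields one list for speaker A (first and third text content) and one for speaker B (second and fourth text content), mirroring the two text records of the embodiment.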

After the complete voice records are obtained, the display module 111 displays the complete text records of speaker A and speaker B for the user to consult. The display module 111 includes a display screen and a touch screen. The display screen may be a TFT liquid crystal display, a UFB liquid crystal display, an STN screen, or an active matrix organic light-emitting diode (AMOLED) panel; the touch screen may be a capacitive touch screen, an infrared touch screen, a surface acoustic wave touch screen, or an MTK touch screen. The touch screen receives touch signals and controls whether the complete text content is displayed. The playback module 113 plays back the complete voice record so that the user can listen to all of it without interruption. The playback module 113 also denoises the complete voice record, adjusts the fluency of the speaker's complete voice data, and sets a suitable playback volume according to the current playback scene. On receiving a control signal to play the speaker's complete voice, the playback module 113 plays the processed voice.

The present invention also provides a voice conversion method, which is applied to the mobile terminal 10 shown in Fig. 3 or Fig. 4. The voice conversion method of this embodiment is described in detail below.

Referring to Fig. 6, Fig. 6 is a flowchart of the voice conversion method according to Embodiment 4 of the present invention.

In step S601, the analysis module 101 obtains the speaker's voice, analyzes the speaker's language habits according to that voice, and judges the language type of the speaker's voice according to the language habits. The speaker's voice may be captured through the microphone of the mobile terminal or downloaded over a network.

In step S603, the conversion module 103 performs voice recognition on the speaker's voice according to the judged language type and converts the speaker's voice into text content.
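The two steps can be read as a small pipeline in which the judgment of the language type feeds the recognition. The Python sketch below is only one possible arrangement: `voice_to_text`, `ConversionResult`, and the toy stand-ins are hypothetical names, and the stand-ins do not perform any real language analysis.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ConversionResult:
    language_type: str
    text: str


def voice_to_text(voice: bytes,
                  judge_language_type: Callable[[bytes], str],
                  recognizers: Dict[str, Callable[[bytes], str]]) -> ConversionResult:
    # Step S601: judge the language type from the speaker's voice.
    language_type = judge_language_type(voice)
    # Step S603: recognize the voice with the recognizer for that language type.
    text = recognizers[language_type](voice)
    return ConversionResult(language_type, text)


# Toy stand-ins so the sketch runs; they do not perform real analysis.
def toy_judge(voice: bytes) -> str:
    return "Mandarin" if voice.startswith(b"zh") else "English"


toy_recognizers = {
    "Mandarin": lambda voice: "mandarin transcript",
    "English": lambda voice: "english transcript",
}
```

Passing the judge and the recognizers in as parameters keeps step S601 and step S603 separable, in the same way that the analysis module and the conversion module are separate in the terminal.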

To elaborate, the speaker's voice may be of several types: a dialect, a foreign language, Mandarin, and so on. Dialects include the regional dialects of China, such as Northeastern, Sichuan, and Hunan; foreign languages include English, French, German, Russian, and other languages. After the mobile terminal 10 obtains the speaker's voice, the dialect recognition module 105 identifies the voice according to dialect habits and judges which dialect is being spoken. The conversion module 103 then performs voice recognition according to the judged dialect and converts the speaker's voice into text content. Likewise, after the mobile terminal 10 obtains the speaker's voice, the foreign language recognition module 107 identifies the voice according to foreign-language habits and judges which foreign language is being spoken. The conversion module 103 then performs voice recognition according to the judged foreign language and converts the speaker's voice into text content.

Referring to Fig. 7, Fig. 7 is a flowchart of the voice conversion method according to Embodiment 5 of the present invention.

In step S701, when the speaker's speech consists of multiple short voice messages, the integration module 109 integrates all of the speaker's voice messages in chronological order into a complete voice record.

In step S703, the integration module 109 integrates the text content corresponding to all of the speaker's voice messages in chronological order into a complete text record.

In step S705, after the complete voice record is obtained, the display module 111 displays the complete text record for the user to consult. The display module 111 includes a display screen and a touch screen. The display screen may be a TFT liquid crystal display, a UFB liquid crystal display, an STN screen, or an active matrix organic light-emitting diode (AMOLED) panel; the touch screen may be a capacitive touch screen, an infrared touch screen, a surface acoustic wave touch screen, or an MTK touch screen. The touch screen receives touch signals and controls whether the complete text content is displayed.

In step S707, the playback module 113 plays back the complete voice record so that the user can listen to all of it without interruption. The playback module 113 also denoises the complete voice record, adjusts the fluency of the speaker's complete voice data, and sets a suitable playback volume according to the current playback scene. On receiving a control signal to play the speaker's complete voice, the playback module 113 plays the processed voice.
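The patent does not say which denoising or volume algorithm the playback module uses. Purely as an illustration, the sketch below substitutes two simple techniques: a noise gate that silences very quiet samples, and peak scaling toward a per-scene target level. The scene names, the values in `SCENE_PEAK`, the default threshold, and all function names are assumptions; the fluency adjustment is not modeled.

```python
from typing import List

# Hypothetical target peak levels per playback scene (1.0 = full scale).
SCENE_PEAK = {"quiet": 0.25, "normal": 0.5, "noisy": 1.0}


def noise_gate(samples: List[float], threshold: float) -> List[float]:
    """Silence samples whose magnitude is below the threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def adjust_volume(samples: List[float], scene: str) -> List[float]:
    """Scale samples so the loudest one reaches the scene's target peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # nothing audible, so there is nothing to scale
    gain = SCENE_PEAK[scene] / peak
    return [s * gain for s in samples]


def prepare_for_playback(samples: List[float], scene: str,
                         threshold: float = 0.05) -> List[float]:
    """Denoise first, then set the volume for the current playback scene."""
    return adjust_volume(noise_gate(samples, threshold), scene)
```

Running the gate before the volume adjustment avoids amplifying low-level noise when the gain is greater than one.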

As a supplementary note, the order of the steps in this embodiment may be changed. The processing performed after the analysis module 101 of the mobile terminal 10 obtains the first voice of speaker A, the third voice of speaker A, the second voice of speaker B, and the fourth voice of speaker B is described in detail below. The analysis module 101 of the mobile terminal 10 analyzes the language habits of speakers A and B according to their voices and judges the language type of each speaker's voice according to those habits. The voices of speakers A and B may be captured through the microphone of the mobile terminal or downloaded over a network. The conversion module 103 performs voice recognition on the voices of speakers A and B according to the judged language types and converts them into text content.

To elaborate, the speaker's voice may be of several types: a dialect, a foreign language, Mandarin, and so on. After the mobile terminal 10 obtains the speakers' voices, the dialect recognition module 105 identifies the voices of speakers A and B according to dialect habits and judges which dialect each is speaking. The conversion module 103 then performs voice recognition according to the judged dialect and converts the voices of speakers A and B into text content. Likewise, after the mobile terminal 10 obtains the voices of speakers A and B, the foreign language recognition module 107 identifies the voices according to foreign-language habits and judges which foreign language each is speaking. The conversion module 103 then performs voice recognition according to the judged foreign language and converts the voices of speakers A and B into text content.

The integration module 109 integrates the first voice and the third voice of speaker A in chronological order into a complete voice record, and integrates the text content corresponding to all of speaker A's voices in chronological order into a complete text record. This text record includes the first text content and the third text content, where the first text content corresponds to the first voice and the third text content corresponds to the third voice. Similarly, the integration module 109 integrates the second voice and the fourth voice of speaker B in chronological order into a complete voice record, and integrates the text content corresponding to all of speaker B's voices in chronological order into a complete text record. This text record includes the second text content and the fourth text content, where the second text content corresponds to the second voice and the fourth text content corresponds to the fourth voice.

After the complete voice records are obtained, the display module 111 displays the complete text records of speaker A and speaker B for the user to consult. The display module 111 includes a display screen and a touch screen. The display screen may be a TFT liquid crystal display, a UFB liquid crystal display, an STN screen, or an active matrix organic light-emitting diode (AMOLED) panel; the touch screen may be a capacitive touch screen, an infrared touch screen, a surface acoustic wave touch screen, or an MTK touch screen. The touch screen receives touch signals and controls whether the complete text content is displayed. The playback module 113 plays back the complete voice record so that the user can listen to all of it without interruption. The playback module 113 also denoises the complete voice record, adjusts the fluency of the speaker's complete voice data, and sets a suitable playback volume according to the current playback scene. On receiving a control signal to play the speaker's complete voice, the playback module 113 plays the processed voice.

The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A mobile terminal, characterized by comprising:

an analysis module, configured to obtain a speaker's voice, analyze the language habits of the speaker according to the speaker's voice, and judge the language type of the speaker's voice according to the language habits; and

a conversion module, configured to perform voice recognition on the speaker's voice according to the judged language type and convert the speaker's voice into text content.

2. The mobile terminal according to claim 1, characterized in that the analysis module comprises a dialect recognition module, the dialect recognition module being configured to identify the speaker's voice according to dialect habits and judge the dialect type of the speaker's voice; and

the conversion module is further configured to perform voice recognition according to the judged dialect type and convert the speaker's voice into text content.

3. The mobile terminal according to claim 1, characterized in that the analysis module further comprises a foreign language recognition module, the foreign language recognition module being configured to identify the speaker's voice according to foreign-language habits and judge the foreign language type of the speaker's voice; and

the conversion module is further configured to perform voice recognition according to the judged foreign language type and convert the speaker's voice into text content.

4. The mobile terminal according to any one of claims 1-3, characterized by further comprising:

an integration module, configured to integrate all voices of the speaker in chronological order into a complete voice record, and to integrate the text content corresponding to all voices of the speaker in chronological order into a complete text record.

5. The mobile terminal according to claim 4, characterized by further comprising:

a playback module, configured to play back the complete voice record.

6. A voice conversion method, characterized by comprising:

obtaining a speaker's voice, analyzing the language habits of the speaker according to the speaker's voice, and judging the language type of the speaker's voice according to the language habits; and

performing voice recognition on the speaker's voice according to the judged language type, and converting the speaker's voice into text content.

7. The voice conversion method according to claim 6, characterized by further comprising:

8. The voice conversion method according to claim 6, characterized by further comprising:

9. The voice conversion method according to any one of claims 6-8, characterized by further comprising:

10. The voice conversion method according to claim 9, characterized by further comprising:

displaying the complete text record; and

playing back the complete voice record.