CN105208194A

CN105208194A - Voice broadcast device and method

Info

Publication number: CN105208194A
Application number: CN201510504890.4A
Authority: CN
Inventors: 张圣杰; 孙丽
Original assignee: Nubia Technology Co Ltd
Current assignee: Nubia Technology Co Ltd
Priority date: 2015-08-17
Filing date: 2015-08-17
Publication date: 2015-12-30

Abstract

The invention discloses a voice broadcast device. The voice broadcast device comprises: an extraction module for collecting sound samples and extracting voiceprint features from the sound samples; an association module for establishing mapping relations between the voiceprint features and the corresponding contacts; and a broadcast module for adding the contacts to a selection list as voice roles, so that the user can select a voice role from the selection list and voice broadcast is carried out according to the voiceprint feature corresponding to the selected voice role. The invention further provides a voice broadcast method. The device and the method provide more voice roles for the user to choose from, give the user a wider choice of voice roles for voice broadcast, and improve user experience.

Description

Voice broadcast device and method

Technical field

The present invention relates to the field of speech synthesis, and in particular to a voice broadcast device and method.

Background technology

At present, with the rapid development of communication technology and terminal technology, more and more terminals provide a voice broadcast function. In conventional TTS (Text To Speech) voice broadcast, the voice roles are one or a few specific voice roles built into the mobile terminal before it leaves the factory, and the user cannot add more speakers. Because the voiceprint features corresponding to these voice roles are built in before shipment, the choice of voice roles the terminal offers is very limited and not user-friendly. In other words, the voice roles provided by the preset voiceprint features in existing voice broadcast are fixed and limited in number, and voice roles cannot be added for the user to choose from. This problem remains to be solved.

The above content is provided only to aid understanding of the technical solution of the present invention and is not an admission that it constitutes prior art.

Summary of the invention

The main purpose of the present invention is to solve the problem that the voice roles provided by the preset voiceprint features in existing voice broadcast are fixed and limited in number, and that voice roles cannot be added for the user to choose from.

To achieve the above object, the present invention provides a voice broadcast device, which comprises:

an extraction module, configured to collect a sound sample and extract a voiceprint feature from the sound sample;

an association module, configured to establish a mapping relation between the voiceprint feature and the corresponding contact;

a broadcast module, configured to add the contact to a selection list as a voice role, so that the user can select a voice role from the selection list, and to carry out voice broadcast according to the voiceprint feature corresponding to the selected voice role.

Preferably, the broadcast module comprises an adding unit, a selection unit, a determining unit, and a broadcast unit;

the adding unit is configured to add the contact to the selection list as a voice role;

the selection unit is configured to provide a selection interface that displays the selection list, so that the user can select a voice role from the selection list;

the determining unit is configured to determine the selected voice role and its corresponding voiceprint feature upon receiving a selection-complete instruction triggered by the user through the selection interface;

the broadcast unit is configured to carry out voice broadcast according to the voiceprint feature corresponding to the voice role.

Preferably, the broadcast unit is further configured to determine the text to be broadcast and to synthesize the standard pronunciation of the text to be broadcast;

the broadcast unit is further configured to modify the standard pronunciation according to the voiceprint feature, to obtain a sound waveform with the pronunciation characteristics of the contact;

the broadcast unit is further configured to output the sound waveform to carry out voice broadcast.

Preferably, the association module comprises a setting unit and an establishing unit;

the setting unit is configured to provide a setting interface, through which the user sets the contact corresponding to the voiceprint feature;

the establishing unit is configured to, upon receiving a setting-complete instruction triggered by the user through the setting interface, establish the mapping relation between the voiceprint feature and the corresponding contact according to the received user setting.

Preferably, the extraction module comprises a copy unit and an extraction unit;

the copy unit is configured to copy the received voice data of a contact when a call with that contact is detected;

the copy unit is further configured to use the copy of the voice data as the sound sample;

the extraction unit is configured to extract the voiceprint feature from the sound sample.

In addition, to achieve the above object, the present invention also provides a voice broadcast method, which comprises the following steps:

collecting a sound sample, and extracting a voiceprint feature from the sound sample;

establishing a mapping relation between the voiceprint feature and the corresponding contact;

adding the contact to a selection list as a voice role, so that the user can select a voice role from the selection list, and carrying out voice broadcast according to the voiceprint feature corresponding to the selected voice role.
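
The three method steps above can be sketched roughly as follows. This is only an illustrative toy, not the patented implementation: the class `VoiceBroadcaster`, its method names, and the use of mean absolute amplitude as a stand-in "voiceprint feature" are all assumptions made for the example, and actual speech synthesis is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceBroadcaster:
    """Toy model of the three steps: extract, associate, broadcast."""
    # contact name -> voiceprint feature (hypothetical representation)
    voiceprints: dict = field(default_factory=dict)
    # contacts offered to the user as voice roles
    selection_list: list = field(default_factory=list)

    def extract(self, samples):
        # Placeholder feature: mean absolute amplitude of the sound sample.
        # A real voiceprint would be far richer; this is an assumption.
        return sum(abs(s) for s in samples) / len(samples)

    def associate(self, contact, feature):
        # Map the feature to the contact and expose the contact as a voice role.
        self.voiceprints[contact] = feature
        if contact not in self.selection_list:
            self.selection_list.append(contact)

    def broadcast(self, role, text):
        # Look up the voiceprint for the chosen role; real synthesis is omitted.
        feature = self.voiceprints[role]
        return f"[{role} voice, feature={feature:.2f}] {text}"


vb = VoiceBroadcaster()
feature = vb.extract([0.5, -0.5, 0.25, -0.25])
vb.associate("Alice", feature)
message = vb.broadcast("Alice", "You have a new message")
```

The point of the sketch is only the data flow: a feature is derived from a sample, bound to a contact, and the contact then becomes selectable as a voice role.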

Preferably, the step of adding the contact to the selection list as a voice role, so that the user can select a voice role from the selection list, and carrying out voice broadcast according to the voiceprint feature corresponding to the selected voice role comprises:

adding the contact to the selection list as a voice role;

providing a selection interface that displays the selection list, so that the user can select a voice role from the selection list;

upon receiving a selection-complete instruction triggered by the user through the selection interface, determining the selected voice role and its corresponding voiceprint feature;

carrying out voice broadcast according to the voiceprint feature corresponding to the voice role.
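
As a hedged illustration of the selection flow just described, the sketch below keeps the highlighted role separate from the confirmed role, so that the voiceprint feature is only resolved when the selection-complete instruction arrives. `SelectionInterface`, `select`, and `confirm` are hypothetical names invented for this example.

```python
class SelectionInterface:
    """Hypothetical selection interface holding the list of voice roles."""

    def __init__(self, voiceprints):
        self.voiceprints = voiceprints            # contact -> voiceprint feature
        self.selection_list = list(voiceprints)   # contacts shown as voice roles
        self.highlighted = None                   # role the user has picked
        self.active_role = None                   # role confirmed for broadcast

    def select(self, role):
        # The user picks a role from the displayed selection list.
        if role not in self.selection_list:
            raise ValueError(f"unknown voice role: {role}")
        self.highlighted = role

    def confirm(self):
        """Handle the selection-complete instruction."""
        if self.highlighted is None:
            raise RuntimeError("no voice role selected")
        self.active_role = self.highlighted
        # Return the selected role together with its voiceprint feature.
        return self.active_role, self.voiceprints[self.active_role]


ui = SelectionInterface({"Alice": 0.375, "Bob": 0.52})
ui.select("Bob")
role, feature = ui.confirm()
```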

Preferably, the step of carrying out voice broadcast according to the voiceprint feature corresponding to the voice role comprises:

determining the text to be broadcast, and synthesizing the standard pronunciation of the text to be broadcast;

modifying the standard pronunciation according to the voiceprint feature, to obtain a sound waveform with the pronunciation characteristics of the contact;

outputting the sound waveform to carry out voice broadcast.
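
The synthesize, modify, and output steps can be pictured with the toy sketch below. It rests on strong assumptions that are not in the source: the "standard pronunciation" is replaced by a sine burst per character, the voiceprint feature is reduced to a pitch factor and a gain, and the modification is plain linear-interpolation resampling (which shifts pitch but also changes duration). A real system would use a TTS engine and a proper voice conversion technique.

```python
import math
import struct

SAMPLE_RATE = 8000  # Hz, assumed for illustration


def synthesize_standard(text, base_hz=200.0, dur=0.05):
    """Stand-in for a TTS engine: one short sine burst per character."""
    n = int(SAMPLE_RATE * dur)
    wave = []
    for _ in text:
        wave.extend(math.sin(2 * math.pi * base_hz * i / SAMPLE_RATE)
                    for i in range(n))
    return wave


def modify(wave, pitch_factor, gain):
    """Resample by linear interpolation and scale the amplitude."""
    out_len = int(len(wave) / pitch_factor)
    out = []
    for j in range(out_len):
        pos = j * pitch_factor          # fractional read position in the input
        i = int(pos)
        frac = pos - i
        nxt = wave[min(i + 1, len(wave) - 1)]
        out.append(gain * ((1 - frac) * wave[i] + frac * nxt))
    return out


def to_pcm16(wave):
    """Clip to [-1, 1] and pack as little-endian 16-bit PCM for output."""
    clipped = (max(-1.0, min(1.0, s)) for s in wave)
    return b"".join(struct.pack("<h", int(s * 32767)) for s in clipped)


standard = synthesize_standard("hi")                    # step 1: standard pronunciation
voiced = modify(standard, pitch_factor=1.25, gain=0.5)  # step 2: apply the "voiceprint"
pcm = to_pcm16(voiced)                                  # step 3: waveform ready for output
```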

Preferably, the step of establishing the mapping relation between the voiceprint feature and the corresponding contact comprises:

providing a setting interface, through which the user sets the contact corresponding to the voiceprint feature;

upon receiving a setting-complete instruction triggered by the user through the setting interface, establishing the mapping relation between the voiceprint feature and the corresponding contact according to the received user setting.
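
One way to read these two steps is that the user's choices stay pending until the setting-complete instruction is received, and only then become the mapping. The sketch below illustrates that reading; `SettingInterface`, the `vp-001` identifier, and the dictionary-based storage are assumptions made for the example.

```python
class SettingInterface:
    """Hypothetical setting interface; choices are committed on 'setting complete'."""

    def __init__(self):
        self.pending = {}   # voiceprint id -> contact chosen but not yet confirmed
        self.mapping = {}   # voiceprint id -> contact, the established mapping

    def choose_contact(self, voiceprint_id, contact):
        # The user sets the contact for a voiceprint feature in the interface.
        self.pending[voiceprint_id] = contact

    def on_setting_complete(self):
        # Setting-complete instruction: establish the mapping from the user setting.
        self.mapping.update(self.pending)
        self.pending.clear()
        return dict(self.mapping)


ui = SettingInterface()
ui.choose_contact("vp-001", "Alice")
before = dict(ui.mapping)            # nothing is established yet
mapping = ui.on_setting_complete()   # mapping now contains vp-001 -> Alice
```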

Preferably, the step of collecting a sound sample and extracting a voiceprint feature from the sound sample comprises:

when a call with a contact is detected, copying the received voice data of the contact;

using the copy of the voice data as the sound sample;

extracting the voiceprint feature from the sound sample.

The present invention automatically extracts a voiceprint feature from the collected sound sample, establishes a mapping relation between the voiceprint feature and a contact, and adds the corresponding contact to a selection list as a voice role, so that the user can select a voice role from the selection list and voice broadcast is carried out according to the voiceprint feature corresponding to the selected voice role. This provides more voice roles for the user to choose from, gives the user a wider choice of voice roles for voice broadcast, and improves user experience.
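
The copy step above can be sketched as a tap on the received audio path: frames are duplicated into a sample buffer only while a call is active, and the original frame is passed on unchanged. `CallAudioTap` and its callbacks are hypothetical names; how a real terminal signals call state and delivers audio frames is not specified in the source.

```python
class CallAudioTap:
    """Copies received call audio so the original stream is left untouched."""

    def __init__(self):
        self.in_call = False
        self.sample = bytearray()   # accumulated copy used as the sound sample

    def on_call_state(self, active):
        # Called when a call with a contact starts or ends.
        self.in_call = active

    def on_received_frame(self, frame: bytes) -> bytes:
        if self.in_call:
            self.sample.extend(frame)   # extend() copies the bytes
        return frame                    # the playback path gets the original frame


tap = CallAudioTap()
tap.on_received_frame(b"\x00\x01")      # no call in progress, so nothing is copied
tap.on_call_state(True)
played = tap.on_received_frame(b"\x10\x20\x30\x40")
tap.on_call_state(False)
sample = bytes(tap.sample)
```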

Brief description of the drawings

Fig. 1 is a schematic diagram of the hardware structure of a mobile terminal implementing the embodiments of the present invention;

Fig. 2 is a schematic diagram of a wireless communication system for the mobile terminal shown in Fig. 1;

Fig. 3 is a functional block diagram of the first embodiment of the voice broadcast device of the present invention;

Fig. 4 is a functional block diagram of the second embodiment of the voice broadcast device of the present invention;

Fig. 5 is a schematic diagram of a preferred embodiment of the selection interface in the present invention;

Fig. 6 is a functional block diagram of the third embodiment of the voice broadcast device of the present invention;

Fig. 7 is a schematic diagram of a preferred embodiment of the setting interface in the present invention;

Fig. 8 is a functional block diagram of the fourth embodiment of the voice broadcast device of the present invention;

Fig. 9 is a flowchart of the first embodiment of the voice broadcast method of the present invention;

Fig. 10 is a flowchart of the second embodiment of the voice broadcast method of the present invention;

Fig. 11 is a flowchart of a preferred embodiment of the step of carrying out voice broadcast according to the voiceprint feature corresponding to the voice role in the present invention;

Fig. 12 is a flowchart of the third embodiment of the voice broadcast method of the present invention;

Fig. 13 is a flowchart of the fourth embodiment of the voice broadcast method of the present invention.

The realization of the object, the functional features, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.

Detailed description of the embodiments

It should be understood that the specific embodiments described herein are intended only to explain the present invention and not to limit it.

A mobile terminal implementing the embodiments of the present invention will now be described with reference to the accompanying drawings. In the following description, suffixes such as "module", "part", or "unit" used to denote elements are adopted only to facilitate the description of the present invention and have no specific meaning in themselves. Therefore, "module" and "part" can be used interchangeably.

A mobile terminal can be implemented in various forms. For example, the terminal described in the present invention can include mobile terminals such as mobile phones, smart phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable media players), and navigation devices, as well as fixed terminals such as digital TVs and desktop computers. In the following, the terminal is assumed to be a mobile terminal. However, those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the structure according to the embodiments of the present invention can also be applied to fixed terminals.

Fig. 1 is a schematic diagram of the hardware structure of a mobile terminal implementing the embodiments of the present invention.

The mobile terminal 100 can include a wireless communication unit 110, an A/V (audio/video) input unit 120, a user input unit 130, a sensing unit 140, an output unit 150, a memory 160, an interface unit 170, a controller 180, a power supply unit 190, and so on. Fig. 1 shows a mobile terminal with various components, but it should be understood that not all of the illustrated components are required. More or fewer components can be implemented instead. The elements of the mobile terminal are described in detail below.

The wireless communication unit 110 generally includes one or more components that allow radio communication between the mobile terminal 100 and a wireless communication system or network. For example, the wireless communication unit can include at least one of a broadcast reception module 111, a mobile communication module 112, a wireless Internet module 113, a short-range communication module 114, and a location information module 115.

The broadcast reception module 111 receives broadcast signals and/or broadcast-related information from an external broadcast management server via a broadcast channel. The broadcast channel can include a satellite channel and/or a terrestrial channel. The broadcast management server can be a server that generates and sends broadcast signals and/or broadcast-related information, or a server that receives previously generated broadcast signals and/or broadcast-related information and sends them to the terminal. Broadcast signals can include TV broadcast signals, radio broadcast signals, data broadcast signals, and so on. A broadcast signal may further include a broadcast signal combined with a TV or radio broadcast signal. Broadcast-related information can also be provided via a mobile communication network, in which case it can be received by the mobile communication module 112. The broadcast signal can exist in various forms; for example, it can exist in the form of an electronic program guide (EPG) of digital multimedia broadcasting (DMB), an electronic service guide (ESG) of digital video broadcast-handheld (DVB-H), and so on. The broadcast reception module 111 can receive signal broadcasts by using various types of broadcast systems. In particular, the broadcast reception module 111 can receive digital broadcasts by using digital broadcast systems such as digital multimedia broadcasting-terrestrial (DMB-T), digital multimedia broadcasting-satellite (DMB-S), digital video broadcast-handheld (DVB-H), the data broadcast system of media forward link only (MediaFLO), integrated services digital broadcasting-terrestrial (ISDB-T), and so on. The broadcast reception module 111 can be configured to suit various broadcast systems that provide broadcast signals, in addition to the digital broadcast systems mentioned above. Broadcast signals and/or broadcast-related information received via the broadcast reception module 111 can be stored in the memory 160 (or another type of storage medium).

The mobile communication module 112 sends radio signals to and/or receives radio signals from at least one of a base station (for example, an access point, a Node B, and so on), an external terminal, and a server. Such radio signals can include voice call signals, video call signals, or various types of data sent and/or received in connection with text and/or multimedia messages.

The wireless Internet module 113 supports wireless Internet access for the mobile terminal. This module can be internally or externally coupled to the terminal. The wireless Internet access technologies involved can include WLAN (wireless LAN) (Wi-Fi), WiBro (wireless broadband), WiMAX (Worldwide Interoperability for Microwave Access), HSDPA (High Speed Downlink Packet Access), and so on.

The short-range communication module 114 is a module for supporting short-range communication. Some examples of short-range communication technology include Bluetooth™, radio frequency identification (RFID), Infrared Data Association (IrDA), ultra-wideband (UWB), ZigBee™, and so on.

The location information module 115 is a module for checking or obtaining the location information of the mobile terminal. A typical example of the location information module is GPS (Global Positioning System). According to current technology, the GPS module 115 calculates distance information from three or more satellites together with accurate time information, and applies triangulation to the calculated information, thereby accurately calculating three-dimensional current location information in terms of longitude, latitude, and altitude. Currently, the method for calculating location and time information uses three satellites and corrects the error of the calculated location and time information by using one additional satellite. In addition, the GPS module 115 can calculate speed information by continuously calculating the current location information in real time.

The A/V input unit 120 is used to receive audio or video signals. The A/V input unit 120 can include a camera 121 and a microphone 122. The camera 121 processes the image data of still pictures or video obtained by an image capture device in video capture mode or image capture mode. The processed image frames can be displayed on the display unit 151. The image frames processed by the camera 121 can be stored in the memory 160 (or another storage medium) or sent via the wireless communication unit 110, and two or more cameras 121 can be provided depending on the structure of the mobile terminal. The microphone 122 can receive sound (audio data) in operating modes such as phone call mode, recording mode, and speech recognition mode, and can process such sound into audio data. In phone call mode, the processed audio (voice) data can be converted into a format that can be sent to a mobile communication base station via the mobile communication module 112 and output. The microphone 122 can implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated in the course of receiving and sending audio signals.

The user input unit 130 can generate key input data according to commands entered by the user, to control various operations of the mobile terminal. The user input unit 130 allows the user to enter various types of information and can include a keyboard, a dome switch, a touch pad (for example, a touch-sensitive component that detects changes in resistance, pressure, capacitance, and so on caused by being touched), a jog wheel, a jog switch, and so on. In particular, when the touch pad is superimposed on the display unit 151 as a layer, a touch screen can be formed.

The sensing unit 140 detects the current state of the mobile terminal 100 (for example, the open or closed state of the mobile terminal 100), the position of the mobile terminal 100, the presence or absence of user contact with the mobile terminal 100 (that is, touch input), the orientation of the mobile terminal 100, the acceleration or deceleration movement and direction of the mobile terminal 100, and so on, and generates commands or signals for controlling the operation of the mobile terminal 100. For example, when the mobile terminal 100 is implemented as a slide-type mobile phone, the sensing unit 140 can sense whether the slide-type phone is open or closed. In addition, the sensing unit 140 can detect whether the power supply unit 190 is supplying power or whether the interface unit 170 is coupled to an external device. The sensing unit 140 can include a proximity sensor 141, which will be described below in connection with the touch screen.

The interface unit 170 serves as an interface through which at least one external device can connect to the mobile terminal 100. For example, the external device can include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and so on. The identification module can store various information for authenticating a user of the mobile terminal 100 and can include a user identity module (UIM), a subscriber identity module (SIM), a universal subscriber identity module (USIM), and so on. In addition, a device having an identification module (hereinafter referred to as an "identification device") can take the form of a smart card; therefore, the identification device can be connected to the mobile terminal 100 via a port or another connection means. The interface unit 170 can be used to receive input (for example, data, power, and so on) from an external device and transfer the received input to one or more elements within the mobile terminal 100, or can be used to transfer data between the mobile terminal and an external device.

In addition, when the mobile terminal 100 is connected to an external cradle, the interface unit 170 can serve as a path through which power is supplied from the cradle to the mobile terminal 100, or as a path through which various command signals entered from the cradle are transferred to the mobile terminal. The various command signals or power input from the cradle can serve as signals for recognizing whether the mobile terminal is accurately mounted on the cradle.

The output unit 150 is configured to provide output signals (for example, audio signals, video signals, alarm signals, vibration signals, and so on) in a visual, audio, and/or tactile manner. The output unit 150 can include a display unit 151, an audio output module 152, an alarm unit 153, and so on.

The display unit 151 can display information processed in the mobile terminal 100. For example, when the mobile terminal 100 is in phone call mode, the display unit 151 can display a user interface (UI) or graphical user interface (GUI) related to the call or other communication (for example, text messaging, multimedia file download, and so on). When the mobile terminal 100 is in video call mode or image capture mode, the display unit 151 can display captured and/or received images, a UI or GUI showing the video or images and related functions, and so on.

Meanwhile, when the display unit 151 and the touch pad are superimposed on each other as layers to form a touch screen, the display unit 151 can be used as both an input device and an output device. The display unit 151 can include at least one of a liquid crystal display (LCD), a thin-film transistor LCD (TFT-LCD), an organic light-emitting diode (OLED) display, a flexible display, a three-dimensional (3D) display, and so on. Some of these displays can be constructed to be transparent so that the user can see through them from the outside; these can be called transparent displays, a typical example being a TOLED (transparent organic light-emitting diode) display. Depending on the particular desired implementation, the mobile terminal 100 can include two or more display units (or other display devices); for example, the mobile terminal can include an external display unit (not shown) and an internal display unit (not shown). The touch screen can be used to detect touch input pressure as well as touch input position and touch input area.

The audio output module 152 can, when the mobile terminal is in a mode such as call signal reception mode, call mode, recording mode, speech recognition mode, or broadcast reception mode, convert audio data received by the wireless communication unit 110 or stored in the memory 160 into an audio signal and output it as sound. In addition, the audio output module 152 can provide audio output related to a specific function performed by the mobile terminal 100 (for example, a call signal reception sound, a message reception sound, and so on). The audio output module 152 can include a speaker, a buzzer, and so on.

The alarm unit 153 can provide output to notify the mobile terminal 100 of the occurrence of an event. Typical events can include call reception, message reception, key signal input, touch input, and so on. In addition to audio or video output, the alarm unit 153 can provide output in other ways to notify the occurrence of an event. For example, the alarm unit 153 can provide output in the form of vibration: when a call, a message, or some other incoming communication is received, the alarm unit 153 can provide tactile output (that is, vibration) to notify the user. With such tactile output, the user can recognize the occurrence of various events even when the mobile phone is in the user's pocket. The alarm unit 153 can also provide output notifying the occurrence of an event via the display unit 151 or the audio output module 152.

The memory 160 can store software programs for the processing and control operations performed by the controller 180, and can temporarily store data that has been output or is to be output (for example, a phone book, messages, still images, video, and so on). In addition, the memory 160 can store data about the various patterns of vibration and audio signals that are output when a touch is applied to the touch screen.

The memory 160 can include at least one type of storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disk, and so on. In addition, the mobile terminal 100 can cooperate with a network storage device that performs the storage function of the memory 160 over a network connection.

The controller 180 generally controls the overall operation of the mobile terminal. For example, the controller 180 performs the control and processing related to voice calls, data communication, video calls, and so on. In addition, the controller 180 can include a multimedia module 181 for reproducing (or playing back) multimedia data; the multimedia module 181 can be built into the controller 180 or can be configured separately from the controller 180. The controller 180 can perform pattern recognition processing to recognize handwriting input or picture drawing input performed on the touch screen as characters or images.

The power supply unit 190 receives external or internal power under the control of the controller 180 and supplies the appropriate power required to operate each element and component.

The various embodiments described herein can be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof. For a hardware implementation, the embodiments described herein can be implemented by using at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor, or an electronic unit designed to perform the functions described herein; in some cases, such embodiments can be implemented in the controller 180. For a software implementation, embodiments such as procedures or functions can be implemented with separate software modules, each of which performs at least one function or operation. The software code can be implemented by a software application (or program) written in any suitable programming language, and can be stored in the memory 160 and executed by the controller 180.

So far, the mobile terminal has been described in terms of its functions. Below, for the sake of brevity, a slide-type mobile terminal among the various types of mobile terminals, such as folder-type, bar-type, swing-type, and slide-type mobile terminals, will be described as an example. The present invention can nevertheless be applied to any type of mobile terminal and is not limited to slide-type mobile terminals.

The mobile terminal 100 shown in Fig. 1 can be configured to operate with wired and wireless communication systems that send data via frames or packets, as well as with satellite-based communication systems.

A communication system in which the mobile terminal according to the present invention can operate will now be described with reference to Fig. 2.

Such communication systems can use different air interfaces and/or physical layers. For example, the air interfaces used by communication systems include frequency division multiple access (FDMA), time division multiple access (TDMA), code division multiple access (CDMA), universal mobile telecommunications system (UMTS) (in particular, Long Term Evolution (LTE)), global system for mobile communications (GSM), and so on. As a non-limiting example, the description below relates to a CDMA communication system, but the teachings apply equally to other types of systems.

Referring to Fig. 2, a CDMA wireless communication system can include a plurality of mobile terminals 100, a plurality of base stations (BS) 270, a base station controller (BSC) 275, and a mobile switching center (MSC) 280. The MSC 280 is configured to interface with a public switched telephone network (PSTN) 290. The MSC 280 is also configured to interface with the BSC 275, which can be coupled to the base stations 270 via backhaul links. The backhaul links can be constructed according to any of several known interfaces, including, for example, E1/T1, ATM, IP, PPP, Frame Relay, HDSL, ADSL, or xDSL. It will be appreciated that the system shown in Fig. 2 can include a plurality of BSCs 275.

Each BS 270 can serve one or more sectors (or regions), each sector being covered by an omnidirectional antenna or by an antenna pointed in a particular direction radially away from the BS 270. Alternatively, each sector can be covered by two or more antennas for diversity reception. Each BS 270 can be configured to support a plurality of frequency assignments, each frequency assignment having a particular spectrum (for example, 1.25 MHz, 5 MHz, and so on).

The intersection of a sector and a frequency assignment can be called a CDMA channel. A BS 270 can also be called a base transceiver subsystem (BTS) or another equivalent term. In such a case, the term "base station" can be used to refer collectively to a single BSC 275 and at least one BS 270. A base station can also be called a "cell site". Alternatively, the individual sectors of a particular BS 270 can be called a plurality of cell sites.

As shown in Fig. 2, a broadcast transmitter (BT) 295 sends broadcast signals to the mobile terminals 100 operating within the system. The broadcast reception module 111 shown in Fig. 1 is provided in the mobile terminal 100 to receive the broadcast signals sent by the BT 295. Fig. 2 also shows several global positioning system (GPS) satellites 300. The satellites 300 help locate at least one of the plurality of mobile terminals 100.

Although a plurality of satellites 300 is depicted in Fig. 2, it is understood that useful positioning information can be obtained with any number of satellites. The GPS module 115 shown in Fig. 1 is typically configured to cooperate with the satellites 300 to obtain the desired positioning information. Instead of or in addition to GPS tracking technology, other technologies capable of tracking the position of the mobile terminal can be used. In addition, at least one GPS satellite 300 can optionally or additionally handle satellite DMB transmission.

In a typical operation of the wireless communication system, the BS 270 receives reverse link signals from various mobile terminals 100. The mobile terminals 100 typically engage in calls, messaging, and other types of communication. Each reverse link signal received by a particular base station 270 is processed within that BS 270. The resulting data is forwarded to the associated BSC 275. The BSC provides call resource allocation and mobility management functions, including the coordination of soft handoff procedures between BSs 270. The BSC 275 also routes the received data to the MSC 280, which provides additional routing services for interfacing with the PSTN 290. Similarly, the PSTN 290 interfaces with the MSC 280, the MSC interfaces with the BSC 275, and the BSC 275 in turn controls the BSs 270 to send forward link signals to the mobile terminals 100.

The embodiments of the voice broadcast device of the present invention are proposed on the basis of the mobile terminal hardware structure and communication system described above.

Referring to Fig. 3, Fig. 3 is a functional block diagram of the first embodiment of the voice broadcast device of the present invention.

In this embodiment, the voice broadcast device comprises an extraction module 10, an association module 20, and a broadcasting module 30.

The extraction module 10 is configured to collect a sound sample and extract a voiceprint feature from the sound sample.

A shortcut icon for voiceprint extraction may be provided, so that the user can trigger a voiceprint extraction instruction through the shortcut icon; the voiceprint extraction function is then enabled, collection of a sound sample begins, and a voiceprint feature is extracted from the sound sample. Alternatively, a physical key for voiceprint extraction may be provided, so that the user can trigger the voiceprint extraction instruction through the physical key, enabling the voiceprint extraction function and starting collection of a sound sample. Alternatively, a touch key for voiceprint extraction may be provided; when a touch operation by the user on the touch key is detected, the voiceprint extraction instruction is triggered, the voiceprint extraction function is enabled, and collection of a sound sample begins.

When a voice call is detected, the audio data transmitted by the remote end may be obtained through the audio input interface and used as the sound sample. Alternatively, when the recording function is detected to be on, the audio data transmitted by the microphone may be obtained through the audio input interface and used as the sound sample. For example, at the PCM (Pulse Code Modulation) audio interface, the sound sample is collected by copying: a copy of the audio data is made, and the voiceprint feature is extracted from that copy as the sound sample.
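The copying approach at the PCM interface can be illustrated with a minimal sketch. It assumes that the audio arrives as a sequence of byte frames; the function name `tee_pcm_frames` and the list used as the sample buffer are hypothetical and are not taken from the specification. Each frame is passed on unchanged while an independent copy is kept for voiceprint extraction.

```python
def tee_pcm_frames(frames, sample_buffer):
    """Pass PCM frames through unchanged while keeping a copy of each one.

    `frames` is any iterable of bytes-like PCM frames (an assumption made
    for this sketch). A copy of every frame is appended to `sample_buffer`,
    so the copy, not the live audio, serves as the sound sample.
    """
    for frame in frames:
        # bytes(...) makes an independent copy, so later changes to the
        # live frame do not alter the collected sound sample.
        sample_buffer.append(bytes(frame))
        yield frame
```

Because the generator only reads each frame and forwards it, the call or recording path itself is left untouched, which is the point of working on a copy.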

Preferably, when a call with a contact is detected, the received audio data of the contact may be copied; the resulting copy of the audio data is used as the sound sample; and the voiceprint feature is extracted from the sound sample.

The extracted voiceprint feature may be saved to memory for subsequent voice broadcast according to that voiceprint feature. Further, when the audio data obtained during a voice call is used as the sound sample for voiceprint extraction, the contact information of the voice call may be obtained, and the contact information and the voiceprint feature may be saved to memory in association with each other.

The association module 20 is configured to establish a mapping relationship between the voiceprint feature and the corresponding contact.

The contact serves as a label for the voiceprint feature, so that individual voiceprint features can be told apart. Preferably, the contact may be a known contact that is already stored, or an unknown contact added through a settings interface.

A settings interface may be provided so that the user can set, through the settings interface, the contact corresponding to the voiceprint feature. When a setting-complete instruction triggered by the user through the settings interface is received, the mapping relationship between the voiceprint feature and the corresponding contact is established according to the received user setting.

Alternatively, when a voice call is detected, a sound sample may be collected from the audio data transmitted by the remote end of the voice call, a voiceprint feature may be extracted from the sound sample, the contact information of the voice call may then be obtained, and the mapping relationship between the voiceprint feature and the contact may be established. Further, if obtaining the contact information of the voice call fails, a settings interface is provided so that the user can set, through the settings interface, the contact corresponding to the voiceprint feature. When a setting-complete instruction triggered by the user through the settings interface is received, the mapping relationship between the voiceprint feature and the corresponding contact is established according to the received user setting.
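The association logic described above, including the fallback to the settings interface, might be sketched as follows. The class `VoiceprintRegistry`, its method names, and the `ask_user` callback standing in for the settings interface are all hypothetical names introduced for illustration; the specification does not prescribe a data structure.

```python
class VoiceprintRegistry:
    """Keeps the mapping between contacts and voiceprint features."""

    def __init__(self):
        self._by_contact = {}

    def associate(self, voiceprint, contact_from_call=None, ask_user=None):
        """Map a voiceprint to a contact and return the contact used.

        `contact_from_call` is the contact obtained from the voice call,
        or None if that lookup failed. `ask_user` stands in for the
        settings interface: a callable that returns the contact the
        user entered for this voiceprint.
        """
        contact = contact_from_call
        if contact is None and ask_user is not None:
            # Lookup failed: fall back to asking the user.
            contact = ask_user(voiceprint)
        if not contact:
            raise ValueError("no contact available for this voiceprint")
        self._by_contact[contact] = voiceprint
        return contact

    def voiceprint_for(self, contact):
        """Return the voiceprint feature saved for `contact`."""
        return self._by_contact[contact]
```

In this sketch a later voiceprint for the same contact simply replaces the earlier one; whether to replace, merge, or keep several is a design choice the text leaves open.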

The broadcasting module 30 is configured to add the contact to a selection list as a voice role, so that the user can select a voice role from the selection list, and to perform voice broadcast according to the voiceprint feature corresponding to the selected voice role.

The contact is added to the selection list as a voice role; the user selects a voice role from the selection list; speech is synthesized according to the voiceprint feature corresponding to the voice role selected by the user; and a voice broadcast having the sound characteristics of the contact is performed.

In this embodiment, a voiceprint feature is automatically extracted from the collected sound sample, a mapping relationship between the voiceprint feature and a contact is established, and the corresponding contact is added to the selection list as a voice role, so that the user can select a voice role from the selection list and voice broadcast is performed according to the voiceprint feature corresponding to that voice role. More voice roles are thus made available for the user to choose from, the voice roles for voice broadcast become more selectable, and the user experience is improved.

Referring to Fig. 4, Fig. 4 is a functional block diagram of the second embodiment of the voice broadcast device of the present invention. Based on the first embodiment of the voice broadcast device described above, the broadcasting module 30 comprises an adding unit 31, a selection unit 32, a determining unit 33, and a broadcasting unit 34.

The adding unit 31 is configured to add the contact to the selection list as a voice role.

The selection unit 32 is configured to provide a selection interface that displays the selection list, so that the user can select a voice role from the selection list.

The determining unit 33 is configured to determine the selected voice role and its corresponding voiceprint feature when a selection-complete instruction triggered by the user through the selection interface is received.

The broadcasting unit 34 is configured to perform voice broadcast according to the voiceprint feature corresponding to the voice role.

The terminal adds the contact to the selection list as a voice role; provides a selection interface that displays the selection list, so that the user can select a voice role from the selection list; determines the selected voice role and its corresponding voiceprint feature when a selection-complete instruction triggered by the user through the selection interface is received; and performs voice broadcast according to the voiceprint feature corresponding to the voice role.

Referring to Fig. 5, Fig. 5 is a schematic diagram of a preferred embodiment of the selection interface in the present invention. Specifically, for example: the terminal adds role c and role d to the voice role list; provides a selection interface that displays the voice role list, so that the user can select a voice role from the voice role list; determines the selected role and its corresponding voiceprint feature when a confirmation instruction triggered by the user through the selection interface is received; and performs voice broadcast according to the voiceprint feature corresponding to the role.

The selection list is the list of voice roles supported by the terminal, from which the user selects the voice role for voice broadcast. The voice role list may contain preset voice roles (for example, default voice roles whose voiceprint features were loaded before the terminal left the factory) as well as automatically added voice roles.
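A selection list that combines preset and automatically added voice roles could be assembled as in the sketch below. The function name `build_selection_list` and the rule of listing preset roles first and skipping duplicate names are assumptions made for illustration only.

```python
def build_selection_list(preset_roles, contact_roles):
    """Merge preset and contact-based voice roles into one selection list.

    Preset roles come first, followed by roles added from contacts.
    A role name that appears more than once is listed only once.
    """
    seen = set()
    selection = []
    for role in list(preset_roles) + list(contact_roles):
        if role not in seen:
            seen.add(role)
            selection.append(role)
    return selection
```

With the roles from the Fig. 5 example, `build_selection_list(["default"], ["role c", "role d"])` yields a list containing the default role followed by role c and role d.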

Further, the broadcasting unit 34 is also configured to determine the text to be broadcast and to synthesize the standard pronunciation of the text to be broadcast.

The broadcasting unit 34 is also configured to modify the standard pronunciation according to the voiceprint feature, obtaining a sound waveform that has the pronunciation characteristics of the contact.

The broadcasting unit 34 is also configured to output the sound waveform to perform the voice broadcast.

In the voice broadcast process, the text to be broadcast is determined and its standard pronunciation is synthesized; the standard pronunciation is modified according to the voiceprint feature, obtaining a sound waveform that has the pronunciation characteristics of the contact; and the sound waveform is output to perform the voice broadcast. Preferably, the sound waveform may be sent to an audio output interface (such as a MIC, an earphone, HDMI, or a built-in loudspeaker) for broadcast.
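The three stages of the broadcast process can be laid out as a toy sketch. Everything here is an assumption made for illustration: the synthesizer is a placeholder that emits a sine tone instead of real speech, the voiceprint feature is reduced to a single `pitch_hz` value, and the modification is plain pitch scaling by linear-interpolation resampling, which also changes the duration. An actual implementation would use a real TTS engine and a proper voice conversion method.

```python
import math

SAMPLE_RATE = 8000          # assumed sample rate in Hz
STANDARD_PITCH_HZ = 200.0   # assumed pitch of the placeholder "standard pronunciation"

def synthesize_standard(text, samples_per_char=400):
    """Placeholder for TTS: a sine tone whose length grows with the text."""
    n = len(text) * samples_per_char
    step = 2 * math.pi * STANDARD_PITCH_HZ / SAMPLE_RATE
    return [math.sin(step * i) for i in range(n)]

def modify_with_voiceprint(samples, target_pitch_hz):
    """Scale the pitch toward the contact's pitch by resampling."""
    if not samples:
        return []
    ratio = target_pitch_hz / STANDARD_PITCH_HZ
    last = len(samples) - 1
    out = []
    for i in range(int(len(samples) / ratio)):
        pos = min(i * ratio, last)          # read position in the input
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, last)]
        out.append(samples[j] * (1 - frac) + nxt * frac)
    return out

def broadcast(text, voiceprint, output):
    """Synthesize, modify according to the voiceprint, then output."""
    standard = synthesize_standard(text)
    waveform = modify_with_voiceprint(standard, voiceprint["pitch_hz"])
    output(waveform)   # stands in for the audio output interface
    return waveform
```

Keeping synthesis, modification, and output as separate functions mirrors the order of operations in the text, so each stage could be replaced independently.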

In this embodiment, a voiceprint feature is automatically extracted from the collected sound sample, a mapping relationship between the voiceprint feature and a contact is established, and the corresponding contact is added to the selection list as a voice role, so that the user can select a voice role from the selection list and a voice broadcast having the pronunciation characteristics of the contact is performed according to the voiceprint feature corresponding to that voice role. More voice roles are thus made available for the user to choose from, the voice roles for voice broadcast become more selectable, and the user experience is improved.

Referring to Fig. 6, Fig. 6 is a functional block diagram of the third embodiment of the voice broadcast device of the present invention. Based on the first embodiment of the voice broadcast device described above, the association module 20 comprises a setting unit 21 and an establishing unit 22.

The setting unit 21 is configured to provide a settings interface, so that the user can set, through the settings interface, the contact corresponding to the voiceprint feature.

The establishing unit 22 is configured to establish the mapping relationship between the voiceprint feature and the corresponding contact according to the received user setting when a setting-complete instruction triggered by the user through the settings interface is received.

After the voiceprint feature has been extracted from the sound sample, a settings interface is provided so that the user can set, through the settings interface, the contact corresponding to the voiceprint feature. When a setting-complete instruction triggered by the user through the settings interface is received, the mapping relationship between the voiceprint feature and the corresponding contact is established according to the received user setting, and the voiceprint feature and the corresponding contact are saved to memory.

Referring to Fig. 7, Fig. 7 is a schematic diagram of a preferred embodiment of the settings interface in the present invention. Specifically, for example: voiceprint feature 4 is extracted from a sound sample; a settings interface pops up with an input box in which the user enters the contact corresponding to voiceprint feature 4; when a confirmation instruction triggered by the user through the settings interface is received, the mapping relationship between voiceprint feature 4 and the corresponding contact is established according to the entered contact, and voiceprint feature 4 and the corresponding contact are saved to memory.

In this embodiment, a settings interface is provided through which the user sets the contact corresponding to the voiceprint feature, a mapping relationship between the voiceprint feature and the contact is established, and the corresponding contact is added to the selection list as a voice role, so that the user can select a voice role from the selection list and voice broadcast is performed according to the voiceprint feature corresponding to that voice role. More voice roles are thus made available for the user to choose from, the voice roles for voice broadcast become more selectable, and the user experience is improved.

Referring to Fig. 8, Fig. 8 is a functional block diagram of the fourth embodiment of the voice broadcast device of the present invention. Based on the first embodiment of the voice broadcast device described above, the extraction module 10 comprises a copying unit 11 and an extraction unit 12.

The copying unit 11 is configured to copy the received audio data of a contact when a call with the contact is detected.

The copying unit 11 is also configured to use the resulting copy of the audio data as the sound sample.

The extraction unit 12 is configured to extract the voiceprint feature from the sound sample.

The call state of the terminal may be monitored; when a call with a contact is detected, the received audio data of the contact is copied, the resulting copy of the audio data is used as the sound sample, and the voiceprint feature is extracted from the sound sample.

Techniques for extracting the voiceprint feature from the sound sample include, but are not limited to, the following: the pitch spectrum and its contour, the energy of pitch frames, and the occurrence frequency and trajectory of pitch formants; linear prediction cepstral coefficients, line spectral pairs, autocorrelation and log area ratios, MFCC (Mel Frequency Cepstral Coefficients), and perceptual linear prediction; wavelet transform techniques; and so on.
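As one illustration of the listed techniques, the sketch below estimates the pitch (fundamental frequency) of a sound sample from its autocorrelation. It is only a minimal example of a single feature, not a complete voiceprint extractor; the function name `estimate_pitch_autocorr` and the 60 to 400 Hz search range are assumptions chosen for illustration.

```python
import math

def estimate_pitch_autocorr(samples, sample_rate, min_hz=60.0, max_hz=400.0):
    """Estimate pitch in Hz as the lag with the largest autocorrelation.

    Only lags corresponding to frequencies between `min_hz` and `max_hz`
    are searched. The autocorrelation is left unnormalized, which mildly
    favors shorter lags over their multiples.
    """
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    if len(samples) <= max_lag:
        raise ValueError("sound sample too short for the requested pitch range")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag
```

A quick check is to feed the function a pure tone, for example 800 samples of a 200 Hz sine at an 8000 Hz sample rate, and confirm that the estimate is close to 200 Hz. Real speech would normally be processed frame by frame, with voiced and unvoiced frames handled separately.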

The present invention further provides a voice broadcast method.

Referring to Fig. 9, Fig. 9 is a flowchart of the first embodiment of the voice broadcast method of the present invention.

In this embodiment, the voice broadcast method comprises the following steps:

Step S10: collect a sound sample and extract a voiceprint feature from the sound sample;

Step S20: establish a mapping relationship between the voiceprint feature and the corresponding contact;

Step S30: add the contact to a selection list as a voice role, so that the user can select a voice role from the selection list, and perform voice broadcast according to the voiceprint feature corresponding to the voice role.

Referring to Fig. 10, Fig. 10 is a flowchart of the second embodiment of the voice broadcast method of the present invention. Based on the first embodiment of the voice broadcast method described above, step S30 comprises:

Step S31: add the contact to the selection list as a voice role;

Step S32: provide a selection interface that displays the selection list, so that the user can select a voice role from the selection list;

Step S33: when a selection-complete instruction triggered by the user through the selection interface is received, determine the selected voice role and its corresponding voiceprint feature;

Step S34: perform voice broadcast according to the voiceprint feature corresponding to the voice role.

Further, referring to Fig. 11, Fig. 11 is a flowchart of a preferred embodiment of the step of performing voice broadcast according to the voiceprint feature corresponding to the voice role:

Step S340: determine the text to be broadcast and synthesize the standard pronunciation of the text to be broadcast;

Step S341: modify the standard pronunciation according to the voiceprint feature, obtaining a sound waveform that has the pronunciation characteristics of the contact;

Step S342: output the sound waveform to perform the voice broadcast.

Referring to Fig. 12, Fig. 12 is a flowchart of the third embodiment of the voice broadcast method of the present invention. Based on the first embodiment of the voice broadcast method described above, step S20 comprises:

Step S21: provide a settings interface, so that the user can set, through the settings interface, the contact corresponding to the voiceprint feature;

Step S22: when a setting-complete instruction triggered by the user through the settings interface is received, establish the mapping relationship between the voiceprint feature and the corresponding contact according to the received user setting.

Referring to Fig. 13, Fig. 13 is a flowchart of the fourth embodiment of the voice broadcast method of the present invention. Based on the first embodiment of the voice broadcast method described above, step S10 comprises:

Step S11: when a call with a contact is detected, copy the received audio data of the contact;

Step S12: use the resulting copy of the audio data as the sound sample;

Step S13: extract the voiceprint feature from the sound sample.

It should be noted that, in this document, the terms "comprise", "include", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, magnetic disk, or optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.

The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims

1. A voice broadcast device, characterized in that the voice broadcast device comprises:

2. The voice broadcast device according to claim 1, characterized in that the broadcasting module comprises an adding unit, a selection unit, a determining unit, and a broadcasting unit;

3. The voice broadcast device according to claim 2, characterized in that the broadcasting unit is further configured to determine the text to be broadcast and to synthesize the standard pronunciation of the text to be broadcast;

4. The voice broadcast device according to claim 1, characterized in that the association module comprises a setting unit and an establishing unit;

5. The voice broadcast device according to any one of claims 1 to 4, characterized in that the extraction module comprises a copying unit and an extraction unit;

6. A voice broadcast method, characterized in that the voice broadcast method comprises the following steps:

7. The voice broadcast method according to claim 6, characterized in that the step of adding the contact to a selection list as a voice role, so that the user can select a voice role from the selection list, and performing voice broadcast according to the voiceprint feature corresponding to the voice role comprises:

8. The voice broadcast method according to claim 7, characterized in that the step of performing voice broadcast according to the voiceprint feature corresponding to the voice role comprises:

outputting the sound waveform to perform the voice broadcast.

9. The voice broadcast method according to claim 6, characterized in that the step of establishing the mapping relationship between the voiceprint feature and the corresponding contact comprises:

10. The voice broadcast method according to any one of claims 6 to 9, characterized in that the step of collecting a sound sample and extracting a voiceprint feature from the sound sample comprises:

using the resulting copy of the audio data as the sound sample;

extracting the voiceprint feature from the sound sample.