CN104835491A

CN104835491A - Multiple-transmission-mode text-to-speech (TTS) system and method

Info

Publication number: CN104835491A
Application number: CN201510151399.8A
Authority: CN
Inventors: 陈红波; 邱德才
Original assignee: Chengdu Hui Nong Information Technology Co Ltd
Current assignee: Chengdu Hui Nong Information Technology Co Ltd
Priority date: 2015-04-01
Filing date: 2015-04-01
Publication date: 2015-08-12

Abstract

The invention discloses a text-to-speech (TTS) system and method supporting multiple transmission modes. The system comprises a text message processing module, a communication module, a TTS module, an audio power amplification module, and a loudspeaker module. The text message processing module is modularly separated from the TTS module, and the communication module carries the text to be converted into speech between them. The TTS module converts the text into speech and outputs an audio signal, which the audio power amplification module amplifies and the loudspeaker module plays. Because the main processor is separated from the speech synthesizer, the synthesizer forms an independent broadcasting system that supports multiple transmission modes. Text can be sent to it directly over a wired or wireless link, so the system can be applied flexibly in a variety of settings.

Description

A TTS speech synthesis system and method with multiple transmission modes

Technical field

The invention belongs to the technical field of acoustics and digital signal processing, and specifically relates to the design of a TTS speech synthesis system and method with multiple transmission modes.

Background art

Speech synthesis, also known as text-to-speech (TTS), draws on multiple disciplines, including acoustics, linguistics, digital signal processing, and multimedia technology, and is a cutting-edge technology in the field of Chinese information processing.

Speech synthesis is the process of converting text into speech output. The process mainly decomposes the input text into phonemes, character by character or word by word, analyzes items that need special handling such as numbers, currency units, word inflections, and punctuation, and generates digital audio from the phonemes. The audio is then either played through a loudspeaker or saved as an audio file and played with multimedia software.

Compared with applications that produce speech from prerecorded audio files, a TTS speech engine is only a few megabytes in size and does not need a large collection of audio files, so it saves considerable storage space and can read aloud any sentence not known in advance. TTS speech synthesis is already widely used in the market, for example in voice announcements on in-vehicle information terminals, bus stop announcements, attendance machines, and talking e-books.

With the development of modern technology, communication methods have become diverse, each with its own characteristics. They fall broadly into two classes: wired and wireless. Widely used wired methods include RS232, RS485, USB, and Ethernet; wireless methods include GPRS/GSM, Bluetooth, WIFI, ZigBee, and 433 MHz radio. Their transfer rates range from a few kbps to tens of Mbps. Even a few kbps is more than sufficient for transmitting text, so a communication method can be chosen to suit each application.
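The claim that a few kbps suffices for text can be checked with simple arithmetic. The following Python sketch is illustrative only: the message length, the 2 bytes per character, and the link rates are assumed values, and protocol overhead is ignored.

```python
def transfer_time_seconds(num_chars: int, bytes_per_char: int, link_bps: float) -> float:
    """Time to move a text message over a link, ignoring protocol overhead."""
    total_bits = num_chars * bytes_per_char * 8
    return total_bits / link_bps


# Assumed example: a 100-character Chinese message at 2 bytes per character.
for label, bps in [("2.4 kbps", 2_400), ("9.6 kbps", 9_600), ("1 Mbps", 1_000_000)]:
    print(f"{label}: {transfer_time_seconds(100, 2, bps):.4f} s")
```

Under these assumptions, even the slowest link delivers the message in well under a second, which is short compared with the time needed to speak it.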

In the prior art, devices with voice broadcast integrate the speech synthesizer inside the system device and rely on the internal processor to control TTS synthesis; the synthesizer cannot operate as a separate unit that receives data remotely and then synthesizes and broadcasts speech. In the era of the rapidly developing agricultural Internet of Things, consider the temperature, humidity, and light collectors in a greenhouse: the collected data must be announced by voice to the administrators in the greenhouse. If every collector integrated its own speech synthesizer, the system would be troublesome to install and operate, and the cost would rise considerably.

Summary of the invention

The object of the invention is to solve the problem that prior-art voice broadcast devices cannot operate as a separate unit that receives data remotely and then synthesizes and broadcasts speech, by providing a TTS speech synthesis system and method with multiple transmission modes.

The technical solution of the invention is a TTS speech synthesis system with multiple transmission modes, comprising a text message processing module, a communication module, a TTS speech synthesis module, an audio power amplification module, and a loudspeaker module. The text message processing module is modularly separated from the TTS speech synthesis module, and the text message to be synthesized into speech is transmitted by the communication module. The TTS speech synthesis module converts the text into speech and outputs an audio signal, which is amplified by the audio power amplification module and then output to the loudspeaker module for playback.

Preferably, the communication module comprises a wired communication module and a wireless communication module.

Preferably, the wired communication module comprises an RS232 communication module, an RS485 communication module, a USB communication module, and an Ethernet communication module, and the wireless communication module comprises a GPRS/GSM wireless module, a WIFI wireless module, a Bluetooth wireless module, a 2.4G wireless module, a 433M wireless module, and a ZigBee wireless module.

Preferably, the TTS speech synthesis module comprises a universal wired communication module interface and a universal wireless communication module interface, into which various wired-to-serial and wireless-to-serial communication modules can be plugged.

The invention also provides a TTS speech synthesis method with multiple transmission modes, comprising the following steps:

S1. The text message processing module generates the text message to be synthesized into speech;

S2. The communication module transmits the text message to the TTS speech synthesis module;

S3. The TTS speech synthesis module synthesizes the text message into a speech signal;

S4. The audio power amplification module amplifies the power of the speech signal;

S5. The loudspeaker module plays the speech.
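Steps S1 to S5 can be pictured as a chain of small functions. The Python sketch below is illustrative only: the function names, the loopback transport, and the stand-in synthesizer (one sample per byte) are hypothetical and do not model a real TTS chip or amplifier.

```python
from typing import Callable, List

Transport = Callable[[bytes], bytes]


def loopback_transport(payload: bytes) -> bytes:
    """Hypothetical link (S2): delivers the bytes unchanged."""
    return payload


def fake_synthesize(payload: bytes) -> List[float]:
    """Stand-in for the TTS module (S3): one 'audio sample' per received byte."""
    return [b / 255.0 for b in payload]


def amplify(samples: List[float], gain: float) -> List[float]:
    """Stand-in for the audio power amplifier (S4)."""
    return [s * gain for s in samples]


def run_pipeline(text: str, transport: Transport, gain: float = 2.0) -> List[float]:
    payload = text.encode("utf-8")     # S1: generate the text message
    received = transport(payload)      # S2: transmit it to the synthesizer
    audio = fake_synthesize(received)  # S3: synthesize a speech signal
    amplified = amplify(audio, gain)   # S4: amplify the signal
    return amplified                   # S5: samples that would drive the loudspeaker


print(len(run_pipeline("hi", loopback_transport)))
```

Because the transport is just a parameter, swapping one link for another leaves the remaining steps untouched, which is the point of separating the text source from the synthesizer.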

The beneficial effects of the invention are as follows: the invention separates the main processor in the TTS speech synthesis system from the speech synthesizer module, so that the synthesizer independently forms a TTS broadcast system that suits many settings and supports multiple transmission modes. Text is transmitted directly between the two over a wired or wireless link, so the TTS speech synthesis system can be applied flexibly in various scenarios.

Brief description of the drawings

Fig. 1 is a block diagram of the TTS speech synthesis system with multiple transmission modes provided by the invention.

Fig. 2 is a flowchart of the TTS speech synthesis method of Embodiment 1 of the invention.

Fig. 3 is a flowchart of the TTS speech synthesis method of Embodiment 2 of the invention.

Fig. 4 is a flowchart of the TTS speech synthesis method of Embodiment 3 of the invention.

Detailed description of the embodiments

The embodiments of the invention are further described below with reference to the accompanying drawings.

The invention provides a TTS speech synthesis system with multiple transmission modes which, as shown in Fig. 1, comprises a text message processing module, a communication module, a TTS speech synthesis module, an audio power amplification module, and a loudspeaker module. The text message processing module is modularly separated from the TTS speech synthesis module, and the text message to be synthesized into speech is transmitted by the communication module. The TTS speech synthesis module converts the text into speech and outputs an audio signal, which is amplified by the audio power amplification module and then output to the loudspeaker module for playback.

The text message processing module may be any device capable of sending text, such as a mobile phone, a PC, or one of various data collection terminals.

The communication module transmits the text message to be synthesized from the text message processing module to the TTS speech synthesis module, and may be a wired communication module or a wireless communication module. Wired communication modules include an RS232 communication module, an RS485 communication module, a USB communication module, and an Ethernet communication module; wireless communication modules include a GPRS/GSM wireless module, a WIFI wireless module, a Bluetooth wireless module, a 2.4G wireless module, a 433M wireless module, and a ZigBee wireless module.

So that the text message to be synthesized can be obtained from any of these communication modules, the invention provides the TTS speech synthesis module with a universal wired communication module interface and a universal wireless communication module interface, into which various wired-to-serial and wireless-to-serial communication modules can be plugged.
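Since every plug-in module converts its link to a serial port, the synthesizer side only ever has to deal with a byte stream. The Python sketch below illustrates that idea under stated assumptions: `SerialLike`, `QueueBackedPort`, and `collect_messages` are hypothetical names, and the in-memory queue merely stands in for real hardware.

```python
from typing import List, Protocol


class SerialLike(Protocol):
    """What the synthesizer side expects from any plugged-in module."""

    def read(self) -> bytes: ...


class QueueBackedPort:
    """Stand-in for any wired-to-serial or wireless-to-serial module."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._pending: List[bytes] = []

    def inject(self, data: bytes) -> None:
        """Simulate bytes arriving from the remote text source."""
        self._pending.append(data)

    def read(self) -> bytes:
        """Return the oldest pending message, or empty bytes if none."""
        return self._pending.pop(0) if self._pending else b""


def collect_messages(ports: List[SerialLike]) -> List[bytes]:
    """Poll each interface once and gather whatever text has arrived."""
    messages = []
    for port in ports:
        data = port.read()
        if data:
            messages.append(data)
    return messages


wired = QueueBackedPort("RS485-to-serial")
wireless = QueueBackedPort("ZigBee-to-serial")
wireless.inject(b"greenhouse 3: 24 C")
print(collect_messages([wired, wireless]))
```

The polling code never inspects which link a port wraps, so a different module can be plugged in without changing the synthesizer side.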

With the development of TTS speech synthesis technology, TTS speech synthesizers have become increasingly integrated, and in the prior art a TTS speech synthesizer is usually integrated into a very small chip. In the embodiments of the invention, the TTS speech synthesis module uses the SYN6288 speech synthesizer, and this chip has the following features:

(1) It supports text in the GB2312, GBK, BIG5, and UNICODE internal code formats;

(2) It produces clear, natural, and accurate Chinese speech, can synthesize arbitrary Chinese text, and supports the synthesis of English letters;

(3) It has an intelligent text analysis algorithm that correctly recognizes numerical values, numbers, dates and times, and common units of measurement;

(4) It has strong handling of polyphonic characters and Chinese surnames;

(5) It supports multiple text control marks to improve the accuracy of text processing;

(6) Each synthesis can take up to 200 bytes of text;

(7) It supports multiple control commands, including synthesize, stop, pause synthesis, resume synthesis, and change baud rate;

(8) It supports a sleep mode that reduces power consumption, and supports several ways of querying the working state of the chip;

(9) It supports a serial data communication interface with three baud rates: 9600 bps, 19200 bps, and 38400 bps;

(10) It supports 16 levels of volume adjustment;

(11) The foreground volume for text playback and the background volume for background music can be controlled separately;

(12) The speech rate can be adjusted by sending control marks, with 6 speech-rate levels supported;

(13) The chip has built-in chord music, prompt sound effects, and common voice prompts for certain industries, including 19 sound prompts, 23 chord prompts, and 15 background music tracks;

(14) The chip uses an SSOP surface-mount package and is small in size;

(15) All chip specifications meet the requirements for use in harsh outdoor environments.
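Features (6) and (9) constrain how a sender prepares text: each synthesis takes at most 200 bytes, and only three baud rates are available. The Python sketch below shows one way a sender might split a long message. It is an illustration under assumptions: the function names are hypothetical, the 200-byte limit is assumed to apply to the encoded text alone, and the chip's command framing is not modeled and should be taken from its datasheet.

```python
from typing import List

SUPPORTED_BAUD_RATES = (9600, 19200, 38400)  # from feature (9)
MAX_BYTES_PER_SYNTHESIS = 200                # from feature (6)


def is_supported_baud(baud: int) -> bool:
    return baud in SUPPORTED_BAUD_RATES


def split_for_synthesis(text: str, encoding: str = "gbk",
                        limit: int = MAX_BYTES_PER_SYNTHESIS) -> List[bytes]:
    """Split text into encoded chunks of at most `limit` bytes each,
    never cutting through a multi-byte character."""
    chunks: List[bytes] = []
    current = b""
    for ch in text:
        encoded = ch.encode(encoding)
        # Start a new chunk when this character would exceed the limit.
        if current and len(current) + len(encoded) > limit:
            chunks.append(current)
            current = b""
        current += encoded
    if current:
        chunks.append(current)
    return chunks


chunks = split_for_synthesis("温度" * 150)  # 300 characters, 2 bytes each in GBK
print([len(c) for c in chunks])  # prints [200, 200, 200]
```

Splitting on character boundaries keeps each chunk decodable on its own; a real sender would probably also prefer to break at punctuation so that the pauses between chunks fall in natural places.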

The audio power amplification module amplifies the audio signal output by the TTS speech synthesis module and then drives the loudspeaker module. Its input voltage is 220 V AC, and its output power range is 15 to 30 W.

The invention also provides a TTS speech synthesis method with multiple transmission modes, which is further described below through Embodiments 1 to 3:

Embodiment 1:

In this embodiment, the text message processing module is a mobile phone. As shown in Fig. 2, the steps are as follows:

S1. The mobile phone writes the text to be synthesized into speech as a short message (SMS);

S2. The text message is transmitted to the TTS speech synthesis module by GPRS/GSM wireless transmission or Bluetooth wireless transmission;

S3. The TTS speech synthesis module synthesizes the text message into a speech signal;

S4. The audio power amplification module amplifies the power of the speech signal;

S5. The loudspeaker module plays the speech.

Embodiment 2:

In this embodiment, the text message processing module is a PC. As shown in Fig. 3, the steps are as follows:

S1. Serial port debugging software on the PC converts the text to be synthesized into speech into binary;

S2. The text message is transmitted to the TTS speech synthesis module by an RS232 serial communication module;

S3. The TTS speech synthesis module synthesizes the text message into a speech signal;

S4. The audio power amplification module amplifies the power of the speech signal;

S5. The loudspeaker module plays the speech.
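Step S1 of this embodiment amounts to encoding the text into bytes before it is written to the serial port. The Python sketch below shows that conversion as a hex dump of the kind a serial debugging tool displays. It is illustrative only: `to_hex_bytes` is a hypothetical helper, GB2312 is chosen because the chip is stated to support it, and opening the RS232 port itself is left out.

```python
def to_hex_bytes(text: str, encoding: str = "gb2312") -> str:
    """Encode text and show the bytes as space-separated hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


print(to_hex_bytes("TTS"))   # prints "54 54 53"
print(to_hex_bytes("温度"))  # two Chinese characters, two bytes each in GB2312
```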

Embodiment 3:

In this embodiment, the text message processing module is an agricultural Internet of Things collector. As shown in Fig. 4, the steps are as follows:

S1. The collector gathers environmental information such as temperature, humidity, and illumination, and its internal processor converts this information into text;

S2. The text message is transmitted to the TTS speech synthesis module by any one of the wired or wireless communication modules described above;

S3. The TTS speech synthesis module synthesizes the text message into a speech signal;

S4. The audio power amplification module amplifies the power of the speech signal;

S5. The loudspeaker module plays the speech.
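Step S1 of this embodiment, turning sensor readings into a sentence that can be spoken, might look like the Python sketch below. It is illustrative only: the `Reading` type, the field names, the site name, and the sentence wording are assumptions. The sentence is written in Chinese because the synthesizer in the embodiments targets Chinese text.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    temperature_c: float
    humidity_percent: float
    illuminance_lux: float


def reading_to_text(site: str, reading: Reading) -> str:
    """Format one set of readings as a Chinese sentence.

    English gloss: "<site>, temperature X degrees Celsius,
    humidity Y percent, illuminance Z lux."
    """
    return (
        f"{site}，温度{reading.temperature_c:.1f}摄氏度，"
        f"湿度{reading.humidity_percent:.0f}%，"
        f"光照{reading.illuminance_lux:.0f}勒克斯。"
    )


# "一号温室" means "Greenhouse No. 1" (hypothetical site name).
message = reading_to_text("一号温室", Reading(24.5, 61.0, 12000.0))
print(message)
print(len(message.encode("gbk")), "bytes in GBK")
```

A message of this length fits comfortably within the 200 bytes the chip accepts per synthesis, so one reading can be announced in a single request.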

Those of ordinary skill in the art will appreciate that the embodiments described here are intended to help the reader understand the principles of the invention, and that the scope of protection of the invention is not limited to these specific statements and embodiments. Based on the technical teachings disclosed herein, those of ordinary skill in the art can make various other specific variations and combinations that do not depart from the essence of the invention, and such variations and combinations remain within the scope of protection of the invention.

Claims

1. A TTS speech synthesis system with multiple transmission modes, characterized in that it comprises a text message processing module, a communication module, a TTS speech synthesis module, an audio power amplification module, and a loudspeaker module; the text message processing module is modularly separated from the TTS speech synthesis module, and the text message to be synthesized into speech is transmitted by the communication module; the TTS speech synthesis module converts the text into speech and outputs an audio signal, and the audio signal is amplified by the audio power amplification module and then output to the loudspeaker module for playback.

2. The TTS speech synthesis system according to claim 1, characterized in that the communication module comprises a wired communication module and a wireless communication module.

3. The TTS speech synthesis system according to claim 2, characterized in that the wired communication module comprises an RS232 communication module, an RS485 communication module, a USB communication module, and an Ethernet communication module; and the wireless communication module comprises a GPRS/GSM wireless module, a WIFI wireless module, a Bluetooth wireless module, a 2.4G wireless module, a 433M wireless module, and a ZigBee wireless module.

4. The TTS speech synthesis system according to claim 1, characterized in that the TTS speech synthesis module comprises a universal wired communication module interface and a universal wireless communication module interface, into which various wired-to-serial and wireless-to-serial communication modules can be plugged.

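5. A TTS speech synthesis method with multiple transmission modes, characterized in that it comprises the following steps:

S1. The text message processing module generates the text message to be synthesized into speech;

S2. The communication module transmits the text message to the TTS speech synthesis module;

S3. The TTS speech synthesis module synthesizes the text message into a speech signal;

S4. The audio power amplification module amplifies the power of the speech signal;

S5. The loudspeaker module plays the speech.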