CN103531224A - Simple voice playing method applied in embedded system platform

Info

Publication number: CN103531224A
Application number: CN201310460892.9A
Authority: CN
Inventors: 周宇
Original assignee: Elefirst Science & Tech Co Ltd
Current assignee: Elefirst Science & Tech Co Ltd
Priority date: 2013-09-30
Filing date: 2013-09-30
Publication date: 2014-01-22
Anticipated expiration: 2033-09-30
Also published as: CN103531224B

Abstract

The invention relates to a simple voice playing method applied to an embedded system platform. The method includes the following steps: a target text is converted into a WAV format file; a processor sends the sampled data of the WAV format file to a DA converter; the DA converter performs digital-to-analog conversion on the sampled data and sends the result to an audio amplifier for processing, after which it is played through a loudspeaker. The method effectively lowers processor utilization, reduces the requirements on the processor, and saves hardware cost in the voice playing system of an embedded system platform.

Description

Simple voice playing method applied to an embedded system platform

Technical field

The present invention relates to a voice playing method, and in particular to a simple voice playing method applied to an embedded system platform.

Background art

In recent years, with the rapid development of electronic technology, embedded system platforms have been applied in an ever wider range of fields. Their functions have grown more powerful and their complexity has increased, and voice playing technology for embedded system platforms has likewise passed through several stages of development.

To the applicant's knowledge, there are currently three common voice playing methods for embedded system platforms:

(1) Using a voice recording and playback IC (such as the ISD4003 series): the required voice data is converted into digital information in advance and stored inside the chip; during playback, the digital signal is restored to voice. The advantage of this method is that recording at a high sampling rate yields high-quality, natural-sounding voice. Its shortcomings are: a. recording requires professional recording equipment and a suitable recording environment, otherwise intolerable ambient noise is introduced; b. a high sampling rate requires more storage space, and the corresponding chips are expensive.

(2) Using a dedicated speech synthesis chip (such as iFLYTEK's XFS5051CE): based on advanced speech synthesis technology, the chip receives the text to be synthesized (including Chinese) over a UART interface and synthesizes it directly into voice output. The advantages of this method are that it is simple to use, offers a choice of pronunciation styles (including dialects), and can be optimized for particular contexts so that the voice sounds warmer and more natural. Its shortcoming is that few such chips are available and they are expensive.

(3) Software decoding and playback (as in common GPS voice navigation devices, MP3 players and the like): recorded audio files are stored in the device's internal memory in advance, decoded in software when needed, and output as voice with the help of hardware. The advantages of this method are low cost and, in theory, the ability to play audio files of any format. Its shortcoming is that software decoding consumes a large amount of processor (CPU or MCU) resources and places high demands on the processor.

In actual use, most applications do not demand high voice quality. There is therefore an urgent need for a voice playing method for embedded system platforms that has lower hardware requirements, a simpler process, high versatility and low cost.

Summary of the invention

The technical problem to be solved by the present invention is as follows: in view of the problems in the prior art, a simple voice playing method applied to an embedded system platform is proposed, which can reduce processor occupancy and lower the requirements on the processor.

The technical solution by which the present invention solves this technical problem is as follows:

A simple voice playing method applied to an embedded system platform, characterized in that it is applied to an embedded system platform voice playing system comprising a processor, a FLASH memory, a RAM memory, a DA converter, an audio amplifier and a loudspeaker; the processor is in data connection with the FLASH memory and the RAM memory respectively, the processor is in data connection with the DA converter, and the DA converter is connected to the loudspeaker through the audio amplifier;

The voice playing method comprises the following steps:

First step, the target text is converted into a standard-format file with an 8 kHz sampling rate, 16 bits per sample, a single channel and uncompressed PCM encoding, that is, a WAV format file containing a file header and sampled data, where the file header contains the start position and length of the sampled data; this WAV format file is then placed in the FLASH memory to await being called by the processor;

Second step, the processor locates the corresponding WAV format file in the FLASH memory according to an instruction and loads the file header of this WAV format file into the RAM memory; the processor obtains the start position and length of the sampled data by reading the file header, and then sends the sampled data of this WAV format file directly from the FLASH memory to the DA converter by DMA transfer;

Third step, the DA converter performs digital-to-analog conversion on the sampled data of the second step and sends the result to the audio amplifier for processing, after which it is played through the loudspeaker.
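By way of illustration only, the header lookup of the second step can be sketched on a host computer in Python. The sketch builds a small 8 kHz, 16-bit, mono WAV image in memory and then walks its RIFF chunks to find where the sampled data starts and how long it is. The helper names `make_wav` and `locate_samples` are hypothetical and are not part of the invention; the sketch also assumes the conventional RIFF/WAVE layout, in which the data length is stored in the `data` chunk header and the start position follows from the chunk layout. On an MCU the same lookup would be done in firmware over the header copied into RAM.

```python
import io
import struct
import wave


def make_wav(samples, rate=8000):
    """Build an in-memory WAV image: mono, 16-bit, uncompressed PCM."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # single channel
        w.setsampwidth(2)      # 16 bits per sample
        w.setframerate(rate)   # 8 kHz by default
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def locate_samples(image):
    """Return (offset, length) of the payload of the 'data' chunk."""
    if image[0:4] != b"RIFF" or image[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(image):
        chunk_id, size = struct.unpack_from("<4sI", image, pos)
        if chunk_id == b"data":
            return pos + 8, size
        pos += 8 + size + (size & 1)   # chunks are padded to an even length
    raise ValueError("no data chunk found")


wav_image = make_wav([0, 1000, -1000, 32767])
offset, length = locate_samples(wav_image)
samples = wav_image[offset:offset + length]   # bytes that would go to the DA converter
```

Because the payload is plain PCM, the bytes in `samples` need no decoding before they are handed to the DA converter, which is the property the method relies on.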

The main technical idea behind the above technical solution is as follows:

The applicant considers that, if processor occupancy is to be reduced, adopting a dedicated voice recording and playback IC or speech synthesis chip would inevitably lead to high cost, so a different approach must be sought on the software side. After in-depth practical research, the applicant found that exploiting the fact that WAV format files can be played without decoding should effectively reduce processor occupancy; at the same time, WAV format files with an 8 kHz sampling rate, 16 bits per sample and a single channel fully satisfy the voice quality needs of most applications, and their file size is within an acceptable range. In addition, using DMA transfer can further reduce processor occupancy. Based on this finding, and after further practical research, the applicant arrived at the above optimized and organically combined technical solution, which can effectively reduce processor occupancy and lower the hardware cost of the embedded system platform.
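The statement that the file size remains acceptable can be checked with simple arithmetic. The following Python sketch computes the amount of sampled data produced by the stated format; the 3-second prompt is a hypothetical example, not a figure taken from the invention, and the WAV file header (typically a few tens of bytes) is not counted.

```python
SAMPLE_RATE_HZ = 8000     # 8 kHz sampling rate
BYTES_PER_SAMPLE = 2      # 16 bits per sample
CHANNELS = 1              # single channel


def pcm_bytes(seconds):
    """Number of bytes of uncompressed sampled data for a given duration."""
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)


bytes_per_second = pcm_bytes(1)      # 16000 bytes for each second of voice
three_second_prompt = pcm_bytes(3)   # 48000 bytes for a hypothetical 3 s prompt
```

At 16000 bytes per second, a set of short prompts fits in a modest amount of FLASH memory, which is consistent with the trade-off described above.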

Preferably, the method further comprises a process of merging multiple voice segments and playing them separately:

In the first step, if there are a plurality of WAV format files whose playing duration is less than a predetermined value, one of these WAV format files is taken as a base file, and the sampled data of the remaining WAV format files is appended in turn to the end of this base file to form a new file. At the same time, the name, start position offset and length of each appended block of sampled data are assembled into a data block, and this data block is appended to the end of the new file to form a merged WAV format file. The file header of the merged WAV format file contains a merged-file identifier. The merged WAV format file is placed in the FLASH memory to await being called by the processor;

In the second step, the processor learns that the file is a merged WAV format file by reading the merged-file identifier in its file header. The processor then reads the data block at the end of the merged WAV format file into the RAM memory, looks up the start position offset and length of the target sampled data by name, and sends the target sampled data to the DA converter by DMA transfer.

This further reduces the total size of the audio files and saves storage space.
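The merge-and-look-up idea can be sketched as follows. The text above does not specify the byte layout of the merged-file identifier or of the trailing data block, so everything about the layout here is an assumption made for illustration: a hypothetical 12-byte header (`MRGW` identifier, index offset, entry count) stands in for the marked WAV header, each index entry is a 16-byte name plus a 32-bit offset and length, and every segment, including the base one, is indexed. The function names `build_merged` and `find_segment` are likewise hypothetical.

```python
import struct

MAGIC = b"MRGW"                    # hypothetical merged-file identifier
HEADER = struct.Struct("<4sII")    # identifier, index offset, entry count
ENTRY = struct.Struct("<16sII")    # name (NUL padded), start offset, length


def build_merged(segments):
    """segments: dict mapping name -> raw PCM bytes. Returns the merged image."""
    body = bytearray()
    index = bytearray()
    for name, pcm in segments.items():
        offset = HEADER.size + len(body)             # where this segment will start
        index += ENTRY.pack(name.encode("ascii"), offset, len(pcm))
        body += pcm                                  # append sampled data in turn
    header = HEADER.pack(MAGIC, HEADER.size + len(body), len(segments))
    return bytes(header + body + index)              # index block goes at the end


def find_segment(image, name):
    """Look up (offset, length) of a named segment via the trailing index."""
    magic, index_offset, count = HEADER.unpack_from(image, 0)
    if magic != MAGIC:
        raise ValueError("not a merged file")
    for i in range(count):
        raw, offset, length = ENTRY.unpack_from(image, index_offset + i * ENTRY.size)
        if raw.rstrip(b"\0").decode("ascii") == name:
            return offset, length
    raise KeyError(name)


image = build_merged({"yes": b"\x01\x00" * 4, "no": b"\x02\x00" * 6})
offset, length = find_segment(image, "no")
```

Only the small index needs to be read into RAM; the sampled data itself can stay in FLASH and be addressed by the returned offset and length.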

Preferably, in the first step, the target text is first converted into an MP3 format audio file, and this MP3 format audio file is then converted into a WAV format file. In this way the WAV format file corresponding to the target text can be obtained entirely with existing conversion software, and no new conversion software needs to be developed.
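Since the conversion is delegated to existing software, which is not named here, it can be useful to confirm on the host that the resulting file really has the required format before it is written to FLASH. The following Python sketch uses the standard `wave` module for that check; the function names `is_target_format` and `make_test_wav` are hypothetical, and the test files are generated in memory rather than produced by any particular converter.

```python
import io
import wave


def is_target_format(wav_bytes):
    """True if the file is single-channel, 16-bit, 8 kHz uncompressed PCM."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return (w.getnchannels() == 1
                and w.getsampwidth() == 2
                and w.getframerate() == 8000
                and w.getcomptype() == "NONE")


def make_test_wav(channels, rate):
    """Generate a short silent 16-bit WAV image for the check above."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * channels * 8)
    return buf.getvalue()


ok = is_target_format(make_test_wav(1, 8000))     # matches the required format
bad = is_target_format(make_test_wav(2, 44100))   # stereo, 44.1 kHz: rejected
```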

Preferably, in the second step, before sending the sampled data to the DA converter, the processor first determines whether the length of the sampled data is greater than the maximum single-transfer length of the DMA transfer; if so, the processor sends the sampled data to the DA converter in batches using interrupts; if not, the processor sends the sampled data to the DA converter directly.

More preferably, in the second step, the processor has an I2S module containing registers; the processor first writes the start position and length of the sampled data into the I2S module registers and then starts the DMA transfer. If the length of the sampled data is greater than the maximum single-transfer length of the DMA transfer, the processor transfers it in batches using interrupts; for each batch, the processor first writes the start position and length of that batch of sampled data into the I2S module registers and then starts that DMA transfer.

This makes the fullest use of DMA transfer and reduces the processor's workload as far as possible.
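The batching decision can be sketched as a small planning function. This is a host-side Python model, not firmware: the function name `plan_dma_transfers` is hypothetical, and the limit of 65535 used in the example is only an assumed value for the maximum single-transfer length, which in practice depends on the DMA controller. Each returned pair corresponds to the start position and length that would be written to the registers before one DMA transfer is started.

```python
def plan_dma_transfers(start, length, max_single):
    """Split [start, start + length) into batches no longer than max_single."""
    if max_single <= 0:
        raise ValueError("max_single must be positive")
    batches = []
    remaining = length
    while remaining > 0:
        n = min(remaining, max_single)   # a short clip fits in a single batch
        batches.append((start, n))       # (start position, length) for one transfer
        start += n
        remaining -= n
    return batches


MAX_SINGLE = 65535                       # assumed controller limit, for illustration

short_clip = plan_dma_transfers(44, 1000, MAX_SINGLE)
long_clip = plan_dma_transfers(44, 100000, MAX_SINGLE)
```

In firmware, the second and later batches would typically be started from the transfer-complete interrupt of the previous one, so that the processor is only involved briefly between batches.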

Preferably, the processor is an MCU.

With the method of the present invention, processor occupancy can be effectively reduced, the requirements on the processor are lowered, and the hardware cost of the embedded system platform voice playing system is saved.

Brief description of the drawings

Fig. 1 is a schematic diagram of the hardware configuration of an embodiment of the present invention.

Fig. 2 is a schematic diagram of the main process of the embodiment of Fig. 1.

Detailed description

The present invention is described in further detail below with reference to the accompanying drawings and in conjunction with an embodiment. The invention is not, however, limited to the example given.

Embodiment

The simple voice playing method of this embodiment is applied to an embedded system platform voice playing system (as shown in Fig. 1) comprising a processor (such as an MCU), a FLASH memory, a RAM memory, a DA converter, an audio amplifier and a loudspeaker. The processor is in data connection with the FLASH memory and the RAM memory respectively, the processor is in data connection with the DA converter, and the DA converter is connected to the loudspeaker through the audio amplifier.

As shown in Fig. 2, the voice playing method comprises the following steps:

First step, the target text is first converted into an MP3 format audio file, and this MP3 format audio file is then converted into a standard-format file with an 8 kHz sampling rate, 16 bits per sample, a single channel and uncompressed PCM encoding, that is, a WAV format file containing a file header and sampled data, where the file header contains the start position and length of the sampled data; this WAV format file is then placed in the FLASH memory to await being called by the processor;

Second step, the processor locates the corresponding WAV format file in the FLASH memory according to an instruction and loads the file header of this WAV format file into the RAM memory; the processor obtains the start position and length of the sampled data by reading the file header, and then sends the sampled data of this WAV format file directly from the FLASH memory to the DA converter by DMA transfer;

Specifically, before sending the sampled data to the DA converter, the processor first determines whether the length of the sampled data is greater than the maximum single-transfer length of the DMA transfer; if so, the processor sends the sampled data to the DA converter in batches using interrupts; if not, the processor sends the sampled data to the DA converter directly.

More specifically, the processor has an I2S module containing registers; the processor first writes the start position and length of the sampled data into the I2S module registers and then starts the DMA transfer. If the length of the sampled data is greater than the maximum single-transfer length of the DMA transfer, the processor transfers it in batches using interrupts; for each batch, the processor first writes the start position and length of that batch of sampled data into the I2S module registers and then starts that DMA transfer.

The above method also comprises the process of merging multiple voice segments and playing them separately, as set out in the summary of the invention above.

In addition to the above embodiment, the present invention may have other embodiments. All technical solutions formed by equivalent replacement or equivalent transformation fall within the scope of protection claimed by the present invention.

Claims

1. A simple voice playing method applied to an embedded system platform, characterized in that it is applied to an embedded system platform voice playing system comprising a processor, a FLASH memory, a RAM memory, a DA converter, an audio amplifier and a loudspeaker; the processor is in data connection with the FLASH memory and the RAM memory respectively, the processor is in data connection with the DA converter, and the DA converter is connected to the loudspeaker through the audio amplifier;

The voice playing method comprises the following steps:

Third step, the DA converter performs digital-to-analog conversion on the sampled data of the second step and sends the result to the audio amplifier for processing, after which it is finally played through the loudspeaker.

2. The simple voice playing method applied to an embedded system platform according to claim 1, characterized in that it further comprises a process of merging multiple voice segments and playing them separately:

3. The simple voice playing method applied to an embedded system platform according to claim 1 or 2, characterized in that, in the first step, the target text is first converted into an MP3 format audio file, and this MP3 format audio file is then converted into a WAV format file.

4. The simple voice playing method applied to an embedded system platform according to claim 3, characterized in that, in the second step, before sending the sampled data to the DA converter, the processor first determines whether the length of the sampled data is greater than the maximum single-transfer length of the DMA transfer; if so, the processor sends the sampled data to the DA converter in batches using interrupts; if not, the processor sends the sampled data to the DA converter directly.

5. The simple voice playing method applied to an embedded system platform according to claim 4, characterized in that, in the second step, the processor has an I2S module containing registers; the processor first writes the start position and length of the sampled data into the I2S module registers and then starts the DMA transfer; if the length of the sampled data is greater than the maximum single-transfer length of the DMA transfer, the processor transfers it in batches using interrupts, and for each batch the processor first writes the start position and length of that batch of sampled data into the I2S module registers and then starts that DMA transfer.

6. The simple voice playing method applied to an embedded system platform according to claim 5, characterized in that the processor is an MCU.