CN113380220B - Speech synthesis coding method and device - Google Patents

Info

Publication number
CN113380220B
Authority
CN
China
Prior art keywords
buffer
tblock
playing
stream data
continuous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110647984.2A
Other languages
Chinese (zh)
Other versions
CN113380220A (en)
Inventor
皮碧虹
杨德文
龙丁奋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tongxingzhe Technology Co ltd
Original Assignee
Shenzhen Tongxingzhe Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tongxingzhe Technology Co ltd
Priority to CN202110647984.2A
Publication of CN113380220A
Application granted
Publication of CN113380220B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

In the speech synthesis coding method and device provided by one or more embodiments of the present disclosure, text data is synthesized into PCM stream data and stored in a buffer. A start buffer threshold Tstart required to start playback is calculated dynamically according to the current system load, and once the buffered duration exceeds Tstart, the PCM stream data is read from the buffer and played. A continuous buffer threshold Tblock required for continuous playback is likewise calculated dynamically according to the current system load. During playback, the relation between the buffered duration and Tblock determines whether synthesis of the text data into PCM stream data continues or is paused, which keeps playback stable and smooth and keeps CPU and memory usage steady.

Description

Speech synthesis coding method and device
Technical Field
The present invention relates to the field of speech synthesis methods, and in particular, to a speech synthesis coding method and apparatus.
Background
The current encoding and playback schemes for speech synthesis (text-to-speech) are:
1. One-time synthesis: the text is input to the speech synthesis engine, the encoded PCM data is obtained all at once, and the PCM data is passed to the player in a single transfer. This mode needs a large amount of memory to store the PCM data, and the synthesis wait is long because playback starts only after all of the data has been synthesized.
2. Streaming synthesis with sleep: PCM data is synthesized and handed to the player, the synthesis process sleeps for a fixed time, and then synthesis and playback continue. The size of the data block synthesized each time is fixed, so CPU usage fluctuates; if the sleep time is too short, the CPU may be fully occupied, and if it is too long, playback may break off or produce noise.
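The trade-off described for scheme 2 can be illustrated with a simplified steady-state model. This sketch is not taken from the patent: the function name `fixed_sleep_cycle`, its parameters, and the assumption that one cycle costs exactly the synthesis time plus the sleep time are all illustrative.

```python
def fixed_sleep_cycle(chunk_audio_s, synth_s, sleep_s):
    """Steady-state view of one synthesize-then-sleep cycle (illustrative model).

    chunk_audio_s: seconds of audio contained in one fixed-size chunk
    synth_s:       wall-clock seconds spent synthesizing that chunk
    sleep_s:       fixed sleep inserted after each chunk
    """
    cycle_s = synth_s + sleep_s
    if cycle_s <= 0:
        raise ValueError("cycle time must be positive")
    return {
        # Fraction of wall-clock time the CPU spends synthesizing.
        "cpu_duty": synth_s / cycle_s,
        # If a cycle takes longer than the audio it yields, the player
        # eventually runs out of data.
        "underrun": cycle_s > chunk_audio_s,
    }


for sleep_s in (0.0, 0.2, 0.6):
    print(sleep_s, fixed_sleep_cycle(chunk_audio_s=0.5, synth_s=0.1, sleep_s=sleep_s))
```

In this model a sleep of zero keeps the CPU busy all the time, while a sleep longer than the chunk's audio duration starves the player, which matches the two failure modes named above.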
Disclosure of Invention
In view of the foregoing, one or more embodiments of the present disclosure aim to provide a speech synthesis coding method and device that effectively solve the above technical problems in the prior art.
To this end, one or more embodiments of the present specification provide a speech synthesis coding method, including:
starting to synthesize text data into PCM stream data, and storing the PCM stream data in a buffer;
dynamically calculating, according to the current system load, the start buffer threshold Tstart required to start playback;
if the buffered duration of the buffer is greater than the start buffer threshold Tstart, reading the PCM stream data from the buffer for playback;
dynamically calculating, according to the current system load, the continuous buffer threshold Tblock required for continuous playback;
if the buffered duration of the buffer is greater than the continuous buffer threshold Tblock, pausing the synthesis of the text data into PCM stream data and, after waiting for a preset time, returning to the step of calculating the continuous buffer threshold Tblock; otherwise, continuing to synthesize the text data into PCM stream data and, after waiting for the preset time, returning to the step of calculating the continuous buffer threshold Tblock, until all of the text data has been synthesized into PCM stream data.
As an optional implementation, dynamically calculating the start buffer threshold Tstart required to start playback according to the current system load includes:
Tstart = Tmin if T2 - T1 < Tmin; otherwise Tstart = T2 - T1;
where T1 is the estimated synthesis duration;
T2 is the playback duration;
Tmin is the minimum buffer duration.
As an optional implementation, T1 = L × U / C and T2 = L × T;
where C is the maximum idle computing power of a single CPU core; U is the computing power consumed to synthesize one word; T is the estimated duration of one word; and L is the number of words in the whole sentence.
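The Tstart rule above can be written out directly. The following is a minimal sketch, not the patent's implementation: the function names are hypothetical, and the units (U in compute units per word, C in compute units per second, T in seconds per word) are assumptions, since the text does not state them.

```python
def synthesis_estimate(words, cost_per_word, idle_capacity):
    """T1 = L * U / C: estimated time to synthesize the whole sentence."""
    if idle_capacity <= 0:
        raise ValueError("idle capacity C must be positive")
    return words * cost_per_word / idle_capacity


def playback_estimate(words, seconds_per_word):
    """T2 = L * T: estimated playback duration of the whole sentence."""
    return words * seconds_per_word


def start_threshold(t1, t2, t_min):
    """Tstart = Tmin if T2 - T1 < Tmin, otherwise T2 - T1."""
    diff = t2 - t1
    return t_min if diff < t_min else diff


# Sample numbers chosen only for illustration.
t1 = synthesis_estimate(words=100, cost_per_word=0.5, idle_capacity=10)
t2 = playback_estimate(words=100, seconds_per_word=0.25)
print(start_threshold(t1, t2, t_min=0.4))  # 20.0
```

With these sample values T1 is 5 s and T2 is 25 s, so the rule yields Tstart = 20 s; when the idle capacity drops to 2, T1 equals T2 and the rule falls back to Tmin.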
As an optional implementation, dynamically calculating the continuous buffer threshold Tblock required for continuous playback according to the current system load includes:
if T4 ≤ T3, Tblock = T3; otherwise Tblock = x × (T2 - T1) + Tbuf;
if Tblock < Tmin, Tblock = Tmin;
where T4 is the estimated remaining playback duration, T3 is the estimated remaining synthesis duration, Tbuf is the remaining playback duration of the current buffer, and x is the buffer unit.
As an optional implementation, T3 = R × U / C, T4 = R × T + Tbuf, and Tmin = F × Tplayer;
where C is the maximum idle computing power of a single CPU core; U is the computing power consumed to synthesize one word; R is the number of remaining words; F is the minimum playback buffer coefficient; Tplayer is the minimum buffer duration of the player; and T is the estimated duration of one word.
As an optional implementation, the buffer unit x = 1% and the minimum playback buffer coefficient F = 2.
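The Tblock rule, including the Tmin floor and the optional values x = 1% and F = 2, can be sketched as a single function. The function and parameter names are hypothetical, and the units are the same assumed ones (compute units per word for U, compute units per second for C, seconds for all durations).

```python
def block_threshold(remaining_words, cost_per_word, idle_capacity,
                    sec_per_word, t_buf, t1, t2, t_player,
                    x=0.01, f=2.0):
    """Continuous buffer threshold Tblock as stated in the text.

    T3 = R * U / C         estimated remaining synthesis duration
    T4 = R * T + Tbuf      estimated remaining playback duration
    Tmin = F * Tplayer     minimum buffer duration
    """
    t3 = remaining_words * cost_per_word / idle_capacity
    t4 = remaining_words * sec_per_word + t_buf
    t_min = f * t_player
    if t4 <= t3:
        t_block = t3
    else:
        t_block = x * (t2 - t1) + t_buf
    # Tblock is never allowed to fall below Tmin.
    return t_min if t_block < t_min else t_block


# Sample numbers chosen only for illustration: T3 = 2 s, T4 = 11 s.
print(block_threshold(remaining_words=40, cost_per_word=0.5, idle_capacity=10,
                      sec_per_word=0.25, t_buf=1.0, t1=5.0, t2=25.0,
                      t_player=0.2))
```

In the sample call T4 exceeds T3, so the second branch applies and the result is about 1.2 s (1% of T2 - T1 plus the 1 s already buffered).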
As an optional implementation, the method further includes a step of pausing playback of the PCM stream data.
Corresponding to the speech synthesis coding method, an embodiment of the invention also provides a speech synthesis coding device, including:
a buffer module, configured to start synthesizing text data into PCM stream data and store the PCM stream data in a buffer;
a first calculation module, configured to dynamically calculate, according to the current system load, the start buffer threshold Tstart required to start playback;
a playing module, configured to read the PCM stream data from the buffer for playback when the buffered duration of the buffer is greater than the start buffer threshold Tstart;
a second calculation module, configured to dynamically calculate, according to the current system load, the continuous buffer threshold Tblock required for continuous playback;
a judging module, configured to pause the synthesis of the text data into PCM stream data if the buffered duration of the buffer is greater than the continuous buffer threshold Tblock and, after waiting for a preset time, return to the step of calculating the continuous buffer threshold Tblock; and otherwise to continue synthesizing the text data into PCM stream data and, after waiting for the preset time, return to the step of calculating the continuous buffer threshold Tblock, until all of the text data has been synthesized into PCM stream data.
As an optional implementation, the first calculation module is configured to calculate:
Tstart = Tmin if T2 - T1 < Tmin; otherwise Tstart = T2 - T1;
where T1 is the estimated synthesis duration;
T2 is the playback duration;
Tmin is the minimum buffer duration.
As an optional implementation, the second calculation module is configured to calculate:
if T4 ≤ T3, Tblock = T3; otherwise Tblock = x × (T2 - T1) + Tbuf;
if Tblock < Tmin, Tblock = Tmin;
where T4 is the estimated remaining playback duration, T3 is the estimated remaining synthesis duration, Tbuf is the remaining playback duration of the current buffer, and x is the buffer unit.
As can be seen from the foregoing, in the speech synthesis coding method and device provided by one or more embodiments of the present disclosure, text data is synthesized into PCM stream data and stored in a buffer. A start buffer threshold Tstart required to start playback is calculated dynamically according to the current system load, and once the buffered duration exceeds Tstart, the PCM stream data is read from the buffer and played. A continuous buffer threshold Tblock required for continuous playback is likewise calculated dynamically according to the current system load. During playback, the relation between the buffered duration and Tblock determines whether synthesis of the text data into PCM stream data continues or is paused, which keeps playback stable and smooth and keeps CPU and memory usage steady.
Drawings
To describe one or more embodiments of the present specification or the prior-art solutions more clearly, the drawings needed for that description are briefly introduced below. The drawings described below show only one or more embodiments of the present specification; a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a speech synthesis coding method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a speech synthesis coding device according to an embodiment of the present invention.
Detailed Description
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same.
To achieve the above object, an embodiment of the present invention provides a speech synthesis coding method, including:
starting to synthesize text data into PCM stream data, and storing the PCM stream data in a buffer;
dynamically calculating, according to the current system load, the start buffer threshold Tstart required to start playback;
if the buffered duration of the buffer is greater than the start buffer threshold Tstart, reading the PCM stream data from the buffer for playback;
dynamically calculating, according to the current system load, the continuous buffer threshold Tblock required for continuous playback;
if the buffered duration of the buffer is greater than the continuous buffer threshold Tblock, pausing the synthesis of the text data into PCM stream data and, after waiting for a preset time, returning to the step of calculating the continuous buffer threshold Tblock; otherwise, continuing to synthesize the text data into PCM stream data and, after waiting for the preset time, returning to the step of calculating the continuous buffer threshold Tblock, until all of the text data has been synthesized into PCM stream data.
In this embodiment of the invention, text data is synthesized into PCM stream data and stored in a buffer. A start buffer threshold Tstart required to start playback is calculated dynamically according to the current system load, and once the buffered duration of the buffer exceeds Tstart, the PCM stream data is read from the buffer and played. A continuous buffer threshold Tblock required for continuous playback is likewise calculated dynamically according to the current system load. During playback, the relation between the buffered duration and Tblock determines whether synthesis of the text data into PCM stream data continues or is paused, which keeps playback stable and smooth and keeps CPU and memory usage steady.
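Both thresholds are compared against the buffered duration, so an implementation needs to convert the amount of PCM data in the buffer into seconds. The helper below is a sketch of that conversion for uncompressed PCM; the function name and the default format (16 kHz, 16-bit, mono) are assumptions, since the text does not specify the PCM format.

```python
def buffered_seconds(num_bytes, sample_rate=16000, bytes_per_sample=2, channels=1):
    """Convert a count of raw PCM bytes into a playback duration in seconds.

    The defaults (16 kHz, 16-bit, mono) are assumed values for illustration.
    """
    bytes_per_second = sample_rate * bytes_per_sample * channels
    if bytes_per_second <= 0:
        raise ValueError("PCM format parameters must be positive")
    return num_bytes / bytes_per_second


print(buffered_seconds(32000))                     # 1.0
print(buffered_seconds(48000, sample_rate=24000))  # 1.0
```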
As shown in FIG. 1, an embodiment of the present invention provides a speech synthesis coding method, including:
S100, starting to synthesize text data into PCM stream data, and storing the PCM stream data in a buffer.
S200, dynamically calculating, according to the current system load, the start buffer threshold Tstart required to start playback.
Optionally, dynamically calculating the start buffer threshold Tstart required to start playback according to the current system load includes:
Tstart = Tmin if T2 - T1 < Tmin; otherwise Tstart = T2 - T1;
where T1 is the estimated synthesis duration, T1 = L × U / C; T2 is the playback duration, T2 = L × T; Tmin is the minimum buffer duration; C is the maximum idle computing power of a single CPU core; U is the computing power consumed to synthesize one word; T is the estimated duration of one word; and L is the number of words in the whole sentence.
S300, if the buffered duration of the buffer is greater than the start buffer threshold Tstart, reading the PCM stream data from the buffer for playback.
S400, dynamically calculating, according to the current system load, the continuous buffer threshold Tblock required for continuous playback.
Optionally, dynamically calculating the continuous buffer threshold Tblock required for continuous playback according to the current system load includes:
if T4 ≤ T3, Tblock = T3; otherwise Tblock = x × (T2 - T1) + Tbuf;
if Tblock < Tmin, Tblock = Tmin;
where T4 is the estimated remaining playback duration, T4 = R × T + Tbuf; T3 is the estimated remaining synthesis duration, T3 = R × U / C; Tbuf is the remaining playback duration of the current buffer; x is the buffer unit, usually 1%; Tmin is the minimum synthesis buffer duration, Tmin = F × Tplayer; C is the maximum idle computing power of a single CPU core; U is the computing power consumed to synthesize one word; R is the number of remaining words; F is the minimum playback buffer coefficient, usually F = 2; Tplayer is the minimum buffer duration of the player; and T is the estimated duration of one word.
S500, if the buffered duration of the buffer is greater than the continuous buffer threshold Tblock, pausing the synthesis of the text data into PCM stream data and, after waiting for a preset time, returning to the step of calculating the continuous buffer threshold Tblock; otherwise, continuing to synthesize the text data into PCM stream data and, after waiting for the preset time, returning to the step of calculating the continuous buffer threshold Tblock, until all of the text data has been synthesized into PCM stream data.
As an optional implementation, the method further includes a step of pausing playback of the PCM stream data.
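Steps S100 to S500 can be exercised end to end with a small discrete-time simulation. This is a sketch under stated assumptions, not the patent's implementation: the names `Params`, `decide` and `simulate` are hypothetical, the preset waiting time is modeled as a fixed tick of 0.05 s, synthesis is assumed to run at the full idle capacity C, the buffered duration is taken to equal Tbuf at the moment Tblock is computed, and the units for U and C are assumed as compute units per word and per second.

```python
from dataclasses import dataclass


@dataclass
class Params:
    words: int             # L, words in the whole sentence
    cost_per_word: float   # U (assumed unit: compute units per word)
    idle_capacity: float   # C (assumed unit: compute units per second)
    sec_per_word: float    # T, estimated audio seconds per word
    t_min: float           # Tmin = F * Tplayer
    x: float = 0.01        # buffer unit
    tick: float = 0.05     # preset waiting time (assumed value)


def decide(buffered, t_block):
    """S500: pause synthesis only when the buffer exceeds Tblock."""
    return "pause" if buffered > t_block else "synthesize"


def simulate(p, max_ticks=100_000):
    t1 = p.words * p.cost_per_word / p.idle_capacity      # T1
    t2 = p.words * p.sec_per_word                         # T2
    t_start = p.t_min if t2 - t1 < p.t_min else t2 - t1   # S200
    # Assumption: the synthesizer uses the full idle capacity C.
    words_per_tick = p.idle_capacity / p.cost_per_word * p.tick

    remaining, buffered, playing, log = float(p.words), 0.0, False, []
    while remaining > 0:
        if len(log) >= max_ticks:
            raise RuntimeError("simulation did not finish")
        # S400: recompute Tblock from the current state.
        t3 = remaining * p.cost_per_word / p.idle_capacity
        t4 = remaining * p.sec_per_word + buffered
        t_block = t3 if t4 <= t3 else p.x * (t2 - t1) + buffered
        t_block = max(t_block, p.t_min)
        # S500: pause or synthesize one tick's worth of words.
        action = decide(buffered, t_block)
        if action == "synthesize":
            done = min(words_per_tick, remaining)
            remaining -= done
            buffered += done * p.sec_per_word
        # S300: start playback once the buffer exceeds Tstart.
        if not playing and buffered > t_start:
            playing = True
        if playing:
            buffered = max(0.0, buffered - p.tick)
        log.append((action, round(buffered, 6)))
    return log


log = simulate(Params(words=20, cost_per_word=0.5, idle_capacity=10,
                      sec_per_word=0.25, t_min=0.4))
print(len(log), log[-1])
```

With these sample values the run takes 20 ticks and every tick synthesizes, because T2 exceeds T1 and the second Tblock branch then always sits at or above the current buffer level. Other parameter choices, such as a lower idle capacity, exercise the T3 branch and the Tmin floor.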
Corresponding to the speech synthesis coding method, as shown in FIG. 2, an embodiment of the present invention further provides a speech synthesis coding device, including:
a buffer module 10, configured to start synthesizing text data into PCM stream data and store the PCM stream data in a buffer;
a first calculation module 20, configured to dynamically calculate, according to the current system load, the start buffer threshold Tstart required to start playback;
a playing module 30, configured to read the PCM stream data from the buffer for playback when the buffered duration of the buffer is greater than the start buffer threshold Tstart;
a second calculation module 40, configured to dynamically calculate, according to the current system load, the continuous buffer threshold Tblock required for continuous playback;
a judging module 50, configured to pause the synthesis of the text data into PCM stream data if the buffered duration of the buffer is greater than the continuous buffer threshold Tblock and, after waiting for a preset time, return to the step of calculating the continuous buffer threshold Tblock; and otherwise to continue synthesizing the text data into PCM stream data and, after waiting for the preset time, return to the step of calculating the continuous buffer threshold Tblock, until all of the text data has been synthesized into PCM stream data.
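One way to picture how modules 10 to 50 fit together is a small class that receives the other modules as callables. This is only a structural sketch: the class name `SpeechSynthesisCoder`, its method names, and the choice to model each module as a plain callable are assumptions, and playback draining is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechSynthesisCoder:
    synthesize_chunk: Callable[[], float]  # buffer module 10: returns seconds of PCM added
    start_threshold: Callable[[], float]   # first calculation module 20: current Tstart
    play: Callable[[], None]               # playing module 30: starts reading the buffer
    block_threshold: Callable[[], float]   # second calculation module 40: current Tblock
    buffered: float = 0.0                  # buffered duration in seconds
    playing: bool = False

    def maybe_start_playback(self) -> bool:
        """Start playback once the buffer exceeds Tstart; return the playing state."""
        if not self.playing and self.buffered > self.start_threshold():
            self.playing = True
            self.play()
        return self.playing

    def judge(self) -> str:
        """Judging module 50: pause if the buffer exceeds Tblock, else synthesize."""
        if self.buffered > self.block_threshold():
            return "pause"
        self.buffered += self.synthesize_chunk()
        return "synthesize"


events = []
coder = SpeechSynthesisCoder(
    synthesize_chunk=lambda: 0.5,           # each chunk adds 0.5 s (sample value)
    start_threshold=lambda: 0.4,            # fixed Tstart for the demo
    play=lambda: events.append("play"),
    block_threshold=lambda: 0.8,            # fixed Tblock for the demo
)
print(coder.judge(), coder.maybe_start_playback(), coder.judge(), coder.judge())
```

A caller would invoke `judge()` and `maybe_start_playback()` once per preset waiting interval; in a real device the two threshold callables would recompute Tstart and Tblock from the current load each time.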
Optionally, the first calculation module 20 is configured to calculate:
Tstart = Tmin if T2 - T1 < Tmin; otherwise Tstart = T2 - T1;
where T1 is the estimated synthesis duration;
T2 is the playback duration;
Tmin is the minimum buffer duration.
Optionally, the second calculation module 40 is configured to calculate:
if T4 ≤ T3, Tblock = T3; otherwise Tblock = x × (T2 - T1) + Tbuf;
if Tblock < Tmin, Tblock = Tmin;
where T4 is the estimated remaining playback duration, T3 is the estimated remaining synthesis duration, Tbuf is the remaining playback duration of the current buffer, and x is the buffer unit.
In this embodiment of the invention, text data is synthesized into PCM stream data and stored in a buffer. A start buffer threshold Tstart required to start playback is calculated dynamically according to the current system load, and once the buffered duration of the buffer exceeds Tstart, the PCM stream data is read from the buffer and played. A continuous buffer threshold Tblock required for continuous playback is likewise calculated dynamically according to the current system load. During playback, the relation between the buffered duration and Tblock determines whether synthesis of the text data into PCM stream data continues or is paused, which keeps playback stable and smooth and keeps CPU and memory usage steady.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The present disclosure is intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Any omissions, modifications, equivalents, improvements, and the like, which are within the spirit and principles of the one or more embodiments of the disclosure, are therefore intended to be included within the scope of the disclosure.

Claims (6)

1. A speech synthesis coding method, comprising:
starting to synthesize text data into PCM stream data, and storing the PCM stream data in a buffer;
dynamically calculating, according to the current system load, the start buffer threshold Tstart required to start playback;
if the buffered duration of the buffer is greater than the start buffer threshold Tstart, reading the PCM stream data from the buffer for playback;
dynamically calculating, according to the current system load, the continuous buffer threshold Tblock required for continuous playback;
if the buffered duration of the buffer is greater than the continuous buffer threshold Tblock, pausing the synthesis of the text data into PCM stream data and, after waiting for a preset time, returning to the step of calculating the continuous buffer threshold Tblock; otherwise, continuing to synthesize the text data into PCM stream data and, after waiting for the preset time, returning to the step of calculating the continuous buffer threshold Tblock, until all of the text data has been synthesized into PCM stream data;
wherein dynamically calculating the start buffer threshold Tstart required to start playback according to the current system load comprises:
Tstart = Tmin if T2 - T1 < Tmin; otherwise Tstart = T2 - T1;
where T1 is the estimated synthesis duration;
T2 is the playback duration;
Tmin is the minimum buffer duration;
and dynamically calculating the continuous buffer threshold Tblock required for continuous playback according to the current system load comprises:
if T4 ≤ T3, Tblock = T3; otherwise Tblock = x × (T2 - T1) + Tbuf;
if Tblock < Tmin, Tblock = Tmin;
where T4 is the estimated remaining playback duration, T3 is the estimated remaining synthesis duration, Tbuf is the remaining playback duration of the current buffer, and x is the buffer unit.
2. The speech synthesis coding method according to claim 1, wherein T1 = L × U / C and T2 = L × T;
where C is the maximum idle computing power of a single CPU core; U is the computing power consumed to synthesize one word; T is the estimated duration of one word; and L is the number of words in the whole sentence.
3. The speech synthesis coding method according to claim 1, wherein T3 = R × U / C, T4 = R × T + Tbuf, and Tmin = F × Tplayer;
where C is the maximum idle computing power of a single CPU core; U is the computing power consumed to synthesize one word; R is the number of remaining words; F is the minimum playback buffer coefficient; Tplayer is the minimum buffer duration of the player; and T is the estimated duration of one word.
4. The speech synthesis coding method according to claim 3, wherein the buffer unit x = 1% and the minimum playback buffer coefficient F = 2.
5. The speech synthesis coding method according to claim 1, further comprising a step of pausing playback of the PCM stream data.
6. A speech synthesis coding device, comprising:
a buffer module, configured to start synthesizing text data into PCM stream data and store the PCM stream data in a buffer;
a first calculation module, configured to dynamically calculate, according to the current system load, the start buffer threshold Tstart required to start playback;
a playing module, configured to read the PCM stream data from the buffer for playback when the buffered duration of the buffer is greater than the start buffer threshold Tstart;
a second calculation module, configured to dynamically calculate, according to the current system load, the continuous buffer threshold Tblock required for continuous playback;
a judging module, configured to pause the synthesis of the text data into PCM stream data if the buffered duration of the buffer is greater than the continuous buffer threshold Tblock and, after waiting for a preset time, return to the step of calculating the continuous buffer threshold Tblock; and otherwise to continue synthesizing the text data into PCM stream data and, after waiting for the preset time, return to the step of calculating the continuous buffer threshold Tblock, until all of the text data has been synthesized into PCM stream data;
wherein the first calculation module is configured to calculate:
Tstart = Tmin if T2 - T1 < Tmin; otherwise Tstart = T2 - T1;
where T1 is the estimated synthesis duration;
T2 is the playback duration;
Tmin is the minimum buffer duration;
and the second calculation module is configured to calculate:
if T4 ≤ T3, Tblock = T3; otherwise Tblock = x × (T2 - T1) + Tbuf;
if Tblock < Tmin, Tblock = Tmin;
where T4 is the estimated remaining playback duration, T3 is the estimated remaining synthesis duration, Tbuf is the remaining playback duration of the current buffer, and x is the buffer unit.
CN202110647984.2A 2021-06-10 2021-06-10 Speech synthesis coding method and device Active CN113380220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110647984.2A CN113380220B (en) 2021-06-10 2021-06-10 Speech synthesis coding method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110647984.2A CN113380220B (en) 2021-06-10 2021-06-10 Speech synthesis coding method and device

Publications (2)

Publication Number Publication Date
CN113380220A (en) 2021-09-10
CN113380220B (en) 2024-05-14

Family

ID=77573540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110647984.2A Active CN113380220B (en) 2021-06-10 2021-06-10 Speech synthesis coding method and device

Country Status (1)

Country Link
CN (1) CN113380220B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101582832A (en) * 2008-05-17 2009-11-18 红杉树(杭州)信息技术有限公司 Method for dynamically processing VoIP jitter buffer area
CN103475934A (en) * 2013-09-13 2013-12-25 北京世纪鼎点软件有限公司 Video coding stream control method facing network live broadcast
CN107959659A (en) * 2016-10-17 2018-04-24 杭州海康威视数字技术股份有限公司 A kind of flow medium play control method, device and electronic equipment
CN109819312A (en) * 2019-03-19 2019-05-28 四川长虹电器股份有限公司 Player system and its control method based on dynamic buffer
CN110351445A (en) * 2019-06-19 2019-10-18 成都康胜思科技有限公司 A kind of high concurrent VOIP recording service system based on intelligent sound identification
CN111105779A (en) * 2020-01-02 2020-05-05 标贝(北京)科技有限公司 Text playing method and device for mobile client
CN111179973A (en) * 2020-01-06 2020-05-19 苏州思必驰信息科技有限公司 Speech synthesis quality evaluation method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7590047B2 (en) * 2005-02-14 2009-09-15 Texas Instruments Incorporated Memory optimization packet loss concealment in a voice over packet network

Also Published As

Publication number Publication date
CN113380220A (en) 2021-09-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant