CN1267888C - Terminal equipment for executing voice synthesising using phonic recording language - Google Patents

Publication number
CN1267888C
Authority
CN
China
Prior art keywords
email
text
pronunciation
accompanying document
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2004100029561A
Other languages
Chinese (zh)
Other versions
CN1517978A (en)
Inventor
山木清志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yamaha Corp
Original Assignee
Yamaha Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yamaha Corp
Publication of CN1517978A
Application granted
Publication of CN1267888C

Classifications

    • G06Q50/50
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L13/10 Prosody rules derived from text; Stress or intonation

Abstract

To provide a terminal device that synthesizes speech, including designated word accents, on the receiving side of an e-mail on the basis of a pronunciation description language written as text, and that produces an easily readable e-mail causing no discomfort to the recipient. The terminal device is provided with a voice synthesizing means that interprets a special character string written as text in the pronunciation description language, which defines accents and the like for pronouncing a designated word, and synthesizes the voice defined by the special character string. A terminal device on the receiving side interprets the special character string included in the received e-mail to synthesize voice. (C) 2004, JPO & NCIPI.

Description

Terminal device for performing speech synthesis using a pronunciation description language
Technical field
The present invention relates to a terminal device that sends and receives data such as e-mail, and in particular to a terminal device that performs speech synthesis using a pronunciation description language.
Background art
Techniques for synthesizing speech from text have long been developed and used. More recently, techniques have been developed that add intonation and the like to the synthesized speech so that it more closely imitates natural human pronunciation.
In addition, mobile phones, personal computers and the like can announce the arrival of an e-mail with a given synthesized voice, or read an e-mail written as text aloud by speech synthesis. For example, Japanese Unexamined Patent Publication No. 2002-73507 (hereinafter "the patent document") discloses a technique in which a code or character string for referring to an attached file is inserted into the body of an e-mail. After the body text has been read aloud, the attached file, which consists of music data or image data corresponding to the inserted code or character string, is referred to and its data is played back and/or displayed, thereby conveying emotion (such as the degree of happiness) that is hard to convey with ordinary text. In this patent document, the music data or image data corresponding to the given code or character string is automatically attached, as an attached file, to the e-mail to be sent.
However, in the technique described in the patent document, an attached file consisting of music data or image data prepared in advance is created in correspondence with the given code or character string inserted into the body of the e-mail, and this attached file is sent together with the e-mail; the receiving side then plays back the music data or image data in the attached file according to the code or character string inserted in the e-mail body. As far as speech synthesis is concerned, therefore, the technique merely synthesizes speech from the body text and adds a playback function for music data or image data. It does not disclose a technique in which the sending user specifies arbitrary words and phrases to be synthesized, or their intonation and the like, and the receiving side reproduces them.
Furthermore, because the music data or image data is carried in the attached file, a given code or character string has to be inserted into the e-mail. Leaving aside the case where the inserted code or character string is meaningful as language, a control code that is meaningless to the user may be mistaken for garbled characters, and it also impairs readability on the receiving side. Readability suffers especially when the receiving side is a small portable terminal such as a mobile phone, because its display screen is small. Moreover, even when the inserted character string is meaningful as language and poses no readability problem, what is played back in response to it is music data or image data prepared in advance, which may confuse the recipient.
Summary of the invention
The present invention has been made in view of the above problems. Its object is to provide a text data receiving device that receives text data representing a pronunciation character string for speech synthesis, written in text form and giving intonation and the like to designated words and phrases, and that synthesizes speech in which the designated words and phrases carry that intonation.
Another object of the present invention is to provide a text data transmitting device that transmits text data representing a pronunciation character string for speech synthesis, written in text form and giving intonation and the like to designated words and phrases.
To solve the above problems, a text data receiving device of the present invention comprises: a receiving device that receives, over a communication line, an e-mail with an attached file containing text data representing a pronunciation character string, the pronunciation character string consisting of characters and marks that specify the intonation with which the words are to be pronounced; a storage device that stores the received e-mail with the attached file; a read-out control device that reads the attached file from the storage device; and a speech synthesizing device that synthesizes intonated speech according to the pronunciation character string represented by the text data contained in the attached file that has been read out.
To solve the above problems, a text data transmitting device of the present invention comprises: an input device for inputting, in the body of an e-mail, text data representing a pronunciation character string, the pronunciation character string consisting of characters and marks that specify the intonation with which the words are to be pronounced; a file creating device that makes the text data input in the body of the e-mail into an attached file of the e-mail; a storage device that stores the e-mail with the attached file; and a transmitting device that transmits, over a communication line, the e-mail with the attached file stored in the storage device.
As described above, the invention can provide a terminal device that performs speech synthesis using a pronunciation description language, so that the receiving side of an e-mail can synthesize speech that reflects not only the designated words and phrases but also their intonation. Moreover, the body of the e-mail delivered to the receiving side is easy to read and causes the reader no discomfort.
The terminal device of the present invention is characterized in that, when the body of an e-mail is read aloud, the desired speech is synthesized from a pronunciation character string that has been input in accordance with a given pronunciation description language. That is, the terminal device includes a speech synthesizing device that automatically interprets the pronunciation character string in the text, according to a pronunciation description language that specifies intonation (pitch) and the like for the designated words and phrases (for each character or each phoneme; in Japanese, for each kana character, for example), and synthesizes the speech specified by that pronunciation character string. Therefore, when the sender operates a terminal device to create and send an e-mail containing a pronunciation character string, the recipient's terminal device automatically detects and interprets the pronunciation character string contained in the e-mail and synthesizes the speech.
In this case, the pronunciation character string is written in the body of the e-mail and is then automatically written into an attached file. The range of the pronunciation character string is specified by dedicated control characters; when the attached file is created automatically, the pronunciation character string is automatically deleted from the body of the e-mail and the dedicated control characters are automatically replaced with a desired pictograph. Because the pronunciation character string is thus removed from view in the body of the e-mail, the recipient experiences no visual confusion or (psychological) discomfort when the received e-mail is displayed. In addition, because the attached file containing the pronunciation character string is automatically interpreted and synthesized in real time when the body of the e-mail is read aloud, the recipient hears speech with the intonation and the like that the sender intended.
Furthermore, the pronunciation character string has a text structure in which a given mark (alphanumeric character, special character, or the like) is placed before or after the character (or phoneme, etc.) to which intonation or the like is to be given. The means for detecting and interpreting these marks therefore needs only a simple lookup table or the like to control the speech, so the functions of the present invention can be implemented comparatively easily in existing portable terminals and the like.
Brief description of the drawings
Fig. 1 is a block diagram showing the configuration of a mobile phone according to a preferred embodiment of the present invention.
Fig. 2 shows an example of the body of an e-mail containing a pronunciation character string in the present embodiment.
Fig. 3A is a list of the pronunciation control commands for representative examples of the marks used in the pronunciation description language of the present embodiment.
Fig. 3B shows how the pitch changes at the beginning of each character under the pronunciation control commands of the above marks.
Fig. 3C shows how the pitch changes during the pronunciation of each character under the pronunciation control commands of the above marks.
Fig. 4A shows an example of creating an e-mail containing a pronunciation character string written in HV-Script.
Fig. 4B shows the HV-Script enclosed by dedicated control characters in the above e-mail.
Fig. 4C shows the state in which the HV-Script has been separated from the above e-mail into an attached file and the dedicated control characters are displayed as pictographs.
Fig. 5 is a flowchart showing the operation of the sending-side mobile phone in the present embodiment.
Fig. 6 is a flowchart showing the operation of the receiving-side mobile phone in the present embodiment.
Embodiment
The present invention will now be described in detail by way of an embodiment, with reference to the accompanying drawings.
Fig. 1 is a block diagram showing the configuration of a mobile phone according to a preferred embodiment of the present invention. The present invention is not limited to mobile phones (cellular phones); it is also applicable to PHS (Personal Handy-phone System, a registered trademark in Japan) terminals, personal digital assistants (PDAs), personal computers and the like.
In Fig. 1, reference numeral 11 denotes a CPU (central processing unit), which controls the operation of each part of the mobile phone 1 by executing various programs. Reference numeral 12 denotes a communication unit, which demodulates the signal received by its antenna 12a, and modulates the transmission signal and supplies it to the antenna 12a; it serves as a transmitting/receiving device or communication device.
In accordance with a given protocol, the CPU 11 decodes signals that have been received via a network such as the Internet and demodulated by the communication unit 12, and displays the information contained in the decoded signal (for example, the text of an e-mail) on the screen of the display unit 21. When the e-mail or its attached file contains a pronunciation character string, the CPU 11 causes the sound source unit 16, which has a speech synthesis function, to synthesize the speech specified by that pronunciation character string. Outgoing data such as e-mail is encoded by the CPU 11 according to the given protocol, modulated by the communication unit 12, and transmitted from the antenna 12a to the base station, addressed to the destination server (a mail server when sending e-mail).
Reference numeral 13 denotes a voice processing unit. A voice signal received by the communication unit 12 over a telephone line or the like is demodulated, decoded by the voice processing unit 13, and output as sound from the speaker 14. Conversely, a voice signal picked up by the microphone 15 is digitized and compression-coded by the voice processing unit 13, then modulated by the communication unit 12 and transmitted via the antenna 12a to a base station connected to the mobile telephone network. The voice processing unit 13 compresses and decompresses voice data efficiently using, for example, the CELP (Code Excited Linear Predictive Coding) method or the ADPCM (Adaptive Differential Pulse-Code Modulation) method.
Reference numeral 16 denotes a sound source unit with a speech synthesis function, which plays back music data selected in advance as the ringtone and outputs it from the speaker 17. In addition, on receiving given voice data corresponding to each phoneme of each character of the words to be pronounced (including, for example, parameters that affect voice quality and pitch), the sound source unit 16 synthesizes speech from this voice data under the control of the CPU 11 and outputs it from the speaker 17. The speech synthesis method used in the sound source unit 16 can be chosen freely; for example, the CSM (Composite Sinusoidal Modeling) speech synthesis method disclosed in Japanese Examined Patent Publication No. S58-53351 can be implemented by applying it to an FM (Frequency Modulation) sound source.
Reference numeral 18 denotes an operation unit, an input device that accepts input from various buttons, including alphanumeric keys (not shown), provided on the body (for example, the casing or housing) of the mobile phone 1, and from other input means. For example, it detects the user's operation of the alphanumeric keys and the like, allowing text to be entered in a desired language such as Japanese, English, Chinese or Korean. Desired characters may also be entered using other input devices such as a jog dial, a touch screen, or an external keyboard connected to the mobile phone.
Reference numeral 19 denotes a RAM (Random Access Memory), in which are set a work area for the CPU 11, a storage area for music data or accompaniment data downloaded from a server device over a communication line (used to play the ringtone melody), and a mail data storage area that stores the data of sent e-mails, e-mails being composed, received e-mails and so on.
Reference numeral 20 denotes a ROM (Read Only Memory), which stores the various telephone function programs executed by the CPU 11 to control call transmission and reception, a program supporting melody playback, a mail transmission/reception program that controls the sending and receiving of e-mail, a program supporting speech synthesis processing and other programs, as well as various data such as the voice data for each phoneme and music data.
Reference numeral 21 denotes a display unit consisting of, for example, an LCD (Liquid Crystal Display); under the control of the CPU 11 it displays menu screens, the contents of e-mails, and the results of operations on the operation unit 18.
Reference numeral 22 denotes a vibrator, which notifies the user of an incoming call by vibrating the body of the mobile phone 1 instead of sounding the ringtone.
The functional blocks described above exchange data and instructions via a bus 30.
Next, the pronunciation character string is described. It is written in a pronunciation description language for speech output, which specifies the intonation and the like with which words are to be voiced. Fig. 2 shows an example of a Japanese text (that is, an e-mail body) containing a pronunciation character string; the characters in each line are arranged from left to right. In this example, the pronunciation character string "か_3さがほ^5_4い'4ね$2-" in the third line is delimited by the dedicated control characters indicated by symbol "A" (shown as an inline image in the original publication); the other characters are ordinary text. This pronunciation character string is written in the pronunciation description language for speech synthesis by adding pitch marks to the phrase to be pronounced, "かさがほいね-". The marks used in this example, such as "'", "^", "_" and "$", are text symbols that indicate the kind of pitch change to be applied to the characters being pronounced (Japanese kana in this example); each mark applies the given accent to the character that follows it (or, when a number immediately follows the mark, to the character after that number). Characters and marks are entered consecutively in this form through the operation unit 18 serving as the input device.
Fig. 3 A represents that above-mentioned pronunciation records and narrates the pronunciation control command of the represented sound of each mark (typical example) in language when synthetic, for example, " ' " be illustrated in sentence head and improve a tone (with reference to the symbol " A " of Fig. 3 B), improve tone (with reference to the symbol " B " of Fig. 3 C) in " ^ " expression pronunciation, " _ " is illustrated in the sentence head and reduces tone (with reference to the symbol " B " of Fig. 3 B), reduce tone (with reference to the symbol " D " of Fig. 3 C) in " $ " expression pronunciation, carry out sound thus and synthesize.In addition, additional after above-mentioned mark have under the situation of numerical value, and this numerical value is specified the variable quantity of additional stress.For example, in " か _ 3 さ Ga ", " さ " only reduces tone the amount of " 3 " at the sentence head, then with the tone pronunciation " Ga " of this reduction, has, because not to " か " special extra token, so refer to tone (pitch) pronunciation of standard again.
Thus, to attach an accent (pitch) to a character in the words to be pronounced using the pronunciation description language, the text is structured so that, as shown in Fig. 2, a mark (optionally followed by a number indicating the amount of pitch change) is written before that character. Although only marks that control pitch are described in the present embodiment, marks that control loudness, speed, voice quality and the like may also be used. The pronunciation character string may be written in the body of the e-mail as shown in Fig. 2, or in the subject line of the e-mail; alternatively, it may be written in a given attached file (for example, one whose file extension identifies it as containing a pronunciation character string) that is attached to the e-mail being sent.
Next, the operation of the mobile phone 1 of the present embodiment, configured as described above, is explained. Operations relating to ordinary telephone functions and to sending and receiving e-mail are well known, and their description is omitted. In the following description, text written as a pronunciation character string in the pronunciation description language is called "HV-Script".
(Operation on the sending side)
The sender starts the e-mail program and composes the text of the e-mail by operating the operation unit 18 while checking the display unit 21 serving as the display device. While composing the e-mail, the sender writes the HV-Script to be pronounced, in text form, at any position in the e-mail body (or in the subject line) (see Fig. 4A). The sender enters it so that the HV-Script is enclosed by the dedicated control characters (indicated by symbol "A" in Fig. 4B). An e-mail containing HV-Script is thus created. In this program, the HV-Script is handled as an attached file associated with the e-mail body. Although the e-mail as composed contains a special character string that is hard to read as ordinary text, handling the HV-Script as an attached file, that is, moving its data into a separate attached file, makes the e-mail itself easy to read for the recipient. The e-mail is therefore displayed as an ordinary e-mail not only on the mobile phone 1 of the present embodiment but also on ordinary mobile phones that lack the functions of the present embodiment, and it does not confuse or annoy the reader. The data of an e-mail being composed, or of a completed e-mail, may also be stored in the mail data storage area of the RAM 19 without being sent.
When the e-mail is complete, the sender performs an operation to send it to the desired recipient (or address). The operation of the sending-side mobile phone 1 is described below with reference to the flowchart in Fig. 5.
In step S01, the sending-side mobile phone 1 determines whether an e-mail send operation has been performed. That is, the routine in Fig. 5 remains on standby until the user performs an e-mail send operation.
When the sender performs the e-mail send operation, the result of step S01 is "Yes" and the flow proceeds to step S02. In step S02, it is determined whether the composed e-mail contains the dedicated control characters. If it does not, the result is "No" and the flow proceeds to step S03, where the e-mail is sent; the routine in Fig. 5 then ends.
In the present embodiment, the composed e-mail contains the dedicated control characters, as shown in Fig. 4B, so the result of step S02 is "Yes" and the flow proceeds to step S04.
In step S04, a new attached file is created. Its file name can be set as appropriate, but it is given a dedicated file extension (for example, ".hvs") indicating that the file contains HV-Script.
Then, in step S05, the HV-Script enclosed by the dedicated control characters in the e-mail is moved to the attached file. That is, the character string enclosed by the dedicated control characters is regarded as HV-Script, extracted from the text of the e-mail and stored in the attached file, and at the same time deleted from the e-mail data.
Next, in step S06, the pair of dedicated control characters in the e-mail is replaced with a given general-purpose pictograph or icon, here the pictograph indicated by symbol "B" in Fig. 4C.
In step S07, the e-mail as modified in step S06 is sent to the designated destination (for example, an e-mail address) together with the attached file created in steps S04 and S05. The sent e-mail is also stored in a given storage device such as the mail data storage area of the RAM 19. The processing of steps S05 and S06 may also be carried out in response to a given user operation. The e-mail stored in the sending-side mobile phone 1 may be kept in a form in which the e-mail body and the attached file are not separated, or, following the above processing, as an e-mail file and an attached file that are separate but linked to each other.
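Steps S02 and S04 to S06 can be sketched with Python's standard `email` package. This is a minimal sketch under stated assumptions, not the handset's actual implementation: the dedicated control character appears only as an image in the publication, so `CTRL` is a hypothetical stand-in, as are the pictograph `♪`, the file name `voice.hvs`, the choice of one pictograph per control character, and the function names.

```python
import re
from email.message import EmailMessage
from typing import Optional, Tuple

CTRL = "\uE000"    # hypothetical stand-in for the dedicated control character
PICTOGRAPH = "♪"   # hypothetical general-purpose pictograph
HVS_EXT = ".hvs"   # extension given as an example in step S04

# Text enclosed by a pair of control characters is treated as HV-Script.
PATTERN = re.compile(re.escape(CTRL) + r"(.*?)" + re.escape(CTRL), re.DOTALL)


def split_hv_script(body: str) -> Tuple[str, Optional[str]]:
    """Return (visible body, HV-Script or None if no control characters)."""
    scripts = PATTERN.findall(body)
    if not scripts:                      # S02 "No": nothing to separate
        return body, None
    # S05: delete the HV-Script from the body.
    # S06: replace each control character of the pair with a pictograph.
    visible = PATTERN.sub(PICTOGRAPH * 2, body)
    return visible, "\n".join(scripts)


def build_message(body: str, to_addr: str) -> EmailMessage:
    """Build the e-mail that would be sent in step S03 or S07."""
    visible, script = split_hv_script(body)
    msg = EmailMessage()
    msg["To"] = to_addr
    msg.set_content(visible)
    if script is not None:
        # S04: create the attached file with the dedicated extension.
        msg.add_attachment(
            script.encode("utf-8"),
            maintype="application",
            subtype="octet-stream",
            filename="voice" + HVS_EXT,
        )
    return msg
```

For a body such as `"こんにちは\n" + CTRL + "か_3さが" + CTRL + "\nまたね"`, the visible text keeps only the greeting lines and the pictographs, while the marked-up string travels in the attachment.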
(Operation on the receiving side)
Next, the operation of the receiving-side mobile phone 1 is described with reference to the flowchart in Fig. 6. Here, the receiving-side mobile phone 1 stores e-mail received through its communication unit 12 in a storage device such as the mail data storage area of the RAM 19.
First, in step S11, the receiving-side mobile phone 1 determines whether the user has performed an operation to display an e-mail. That is, the routine in Fig. 6 remains on standby until the user performs such an operation.
When the user performs the display operation, the result of step S11 is "Yes" and the flow proceeds to step S12. In step S12, under the control of the control device (CPU 11), the data of the received e-mail is read from the mail data storage area of the RAM 19, the e-mail file is opened, and its contents are displayed on the screen of the display unit 21.
Then, in step S13, it is determined whether the received e-mail contains HV-Script. This is decided by whether the e-mail file contains a pair of dedicated control characters. If the result is "Yes", that is, the received e-mail file contains a pair of dedicated control characters and is therefore judged to contain HV-Script, the flow proceeds to step S14.
In step S14, the voice data corresponding to each phoneme of each character of the words specified by the HV-Script is read from the ROM 20. This voice data is then supplied to the sound source unit 16, which has the speech synthesis function and serves as the speech synthesizing device, in such a way that the pronounced speech has the desired pitch according to the accent marks written in the HV-Script (the voice data may be processed at this point, since the pitch and the like are changed); the prescribed control is performed at the same time and the speech is synthesized. Thus, when the body of an e-mail contains HV-Script, the receiving-side mobile phone 1 automatically interprets (translates) the HV-Script and immediately synthesizes speech from the result, automatically producing and sounding speech with the desired intonation and the like.
Having, is under the situation of "No" in the judged result of step S13 again, that is, be not that flow process shifts to step S15 under the situation of the Email record that contains HV-Script.
In step S15, judge whether the file to Email attaches the accompanying document with above-mentioned " .hvs " extension name.When this judged result is "No", that is, in the file of Email and do not exist in the accompanying document under the situation of HV-Script, it is synthetic not carry out sound, finishes the routine of Fig. 6.
On the other hand, be under the situation of "Yes" in the judged result of step S15, that is, the Email built-in has under the situation of the accompanying document that possesses " .hvs " extension name, and flow process shifts to step S16.
In step S16, it is determined whether the mailer (or e-mail reading software) in use is set to open attached files with the ".hvs" extension automatically. If the result is "Yes", that is, if automatic opening is set, the flow proceeds to step S17 and the attached file is opened. If the result is "No", that is, if automatic opening is not set, the flow proceeds to step S18.
In step S18, it is determined whether the user (the recipient) has performed an operation to open the attached file. That is, the device waits until the recipient opens the attached file. When the recipient does so, the result of step S18 becomes "Yes", the flow proceeds to step S17, and the attached file is opened.
After step S17, the flow proceeds to step S19, where the same processing as in step S14 is performed. That is, speech synthesis is carried out according to the HV-Script in the attached file, and the specified phrase is sounded with the specified accent.
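The branching of steps S13 to S19 can be summarized in a short sketch. The data classes, the `<hv>` and `</hv>` delimiters, and the function names are hypothetical stand-ins chosen for illustration; the waiting loop of step S18 is reduced to a single callback that reports whether the recipient opened the attached file.

```python
from dataclasses import dataclass, field

# Hypothetical delimiters marking HV-Script inside the e-mail body.
START, END = "<hv>", "</hv>"


@dataclass
class Attachment:
    filename: str
    content: str


@dataclass
class Mail:
    body: str
    attachments: list = field(default_factory=list)


def find_script_in_body(body):
    """Return the script embedded in the body, or None (step S13)."""
    start = body.find(START)
    if start == -1:
        return None
    end = body.find(END, start + len(START))
    if end == -1:
        return None
    return body[start + len(START):end]


def select_script(mail, auto_open, user_opens):
    """Return the script to synthesize, or None if nothing is to be sounded."""
    script = find_script_in_body(mail.body)            # S13
    if script is not None:
        return script                                  # S14: synthesize from body
    hvs = next((a for a in mail.attachments
                if a.filename.lower().endswith(".hvs")), None)  # S15
    if hvs is None:
        return None                                    # no script anywhere: end
    if auto_open or user_opens():                      # S16, S18
        return hvs.content                             # S17 then S19
    return None


if __name__ == "__main__":
    mail = Mail("Hello", [Attachment("voice.hvs", "ka^2wa")])
    print(select_script(mail, auto_open=True, user_opens=lambda: False))
```

In this sketch a script in the body takes precedence over an attached file, mirroring the order in which steps S13 and S15 are evaluated.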
The operation steps described in the above embodiment are only an example, and the present invention is not limited to the processing flow described. For example, although the above embodiment reads a received e-mail from the RAM 19 and performs speech synthesis, the e-mail need not be a received one: e-mails that have been sent or that are being composed may also be stored in the RAM 19 as they arise, read out, and synthesized. Furthermore, the synthesized language is not limited to Japanese; other languages such as English, Chinese, and Korean may also be used.
Although the embodiments of the present invention have been described in detail above with reference to the drawings, the specific configuration of the invention is not limited to these embodiments and includes modifications that do not depart from the gist of the invention. For example, the above embodiment uses a pronunciation description language whose text structure places, before each character, a mark specifying an accent and a numerical value specifying the amount of change of that accent; needless to say, the text structure defined by the pronunciation description language is not limited to this form. For example, the marks and values may be written after each character. In addition, although in the embodiment the HV-Script is written in the body when the e-mail is composed and the pronunciation character string is then moved to an attached file, the attached file may instead be created separately by the recipient. Furthermore, the setting for opening attached files automatically need not be made in the mailer on the recipient side; the setting may instead be included in the transmitted attached file itself.
As described above, the present invention has various effects and technical features, which are explained below:
(1) According to the present invention, importing the above pronunciation character string adds the highly expressive effect of synthesized speech to the text display of conventional e-mail. In addition, on the sender side a file containing the pronunciation character string is created separately and made an attached file of the e-mail, while the pronunciation character string is deleted from the body of the e-mail; on the recipient side, speech synthesis is performed by opening the appropriate attached file. Consequently, when the e-mail is displayed on the recipient's terminal, the pronunciation character string is automatically omitted from the display, so the readability of the displayed e-mail is not impaired and the reader is spared the discomfort of a visually cluttered message.
(2) In addition, because the dedicated control characters used to identify the pronunciation character string are automatically replaced with general pictographs, icons, or the like, the e-mail can be presented in a pleasant, playful form. Furthermore, because the pronunciation character string in the attached file is interpreted automatically and the desired speech synthesis is performed when the e-mail is opened on the recipient side, the recipient does not need to open the attached file manually. Moreover, even though the description of the pronunciation character string (HV-Script) is placed in a separate file (the attached file), the display of the e-mail body and the speech synthesis of the pronunciation character string can be carried out almost simultaneously.
(3) Specifically, a terminal device capable of transmission and reception is provided with a speech synthesizing device that automatically interprets a pronunciation character string written in a pronunciation description language, which describes in text form the intonation and other features with which specified phrases are to be voiced, and performs speech synthesis. The terminal device on the recipient side interprets the pronunciation character string contained in a received e-mail and performs speech synthesis. In other words, when an e-mail containing a pronunciation character string in the pronunciation description language is sent, the recipient's terminal device interprets it and synthesizes speech. Furthermore, the pronunciation description language defines the text structure of the pronunciation character string so that, when intonation is to be added to a phrase (character string) to be pronounced and to each of its constituent characters (kana characters and the like in the case of Japanese), a mark specifying that intonation is attached. Intonation can thus be added in units of one character (or one phoneme).
(4) The above pronunciation character string can be stored in a separate file in the sender's terminal device and used as an attached file of the e-mail. In this case, the pronunciation character string is automatically deleted from the body of the e-mail, and when the attached file is opened in the recipient's terminal device, the pronunciation character string described in it is automatically interpreted and speech synthesis is performed.
(5) Dedicated control characters are written before and after the pronunciation character string (in the case of Japanese, at the left and right ends of each line). The pronunciation character string can therefore be identified simply by detecting the dedicated control characters. In addition, the dedicated control characters can be replaced with predetermined general pictographs. In this case, the dedicated control characters written on the sender side are displayed as general pictographs on the recipient's terminal device, which gives a relaxed and pleasant visual effect on the recipient side.
(6) In addition, when the e-mail is opened, the attached file is opened automatically, the pronunciation character string is interpreted, and speech synthesis is performed. The recipient therefore does not need to open the attached file personally, and even though the pronunciation character string is placed in an attached file separate from the e-mail body, the display of the e-mail body and the speech synthesis of the pronunciation character string can be carried out in real time, substantially simultaneously.
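The sender-side handling summarized in items (4) and (5) might look roughly like the following sketch, which uses Python's standard `email` package. The control characters `[[` and `]]`, the pictograph, the file name `voice.hvs`, and the function names are all assumptions introduced for illustration; the patent does not specify them here.

```python
import re
from email.message import EmailMessage

# Hypothetical dedicated control characters and replacement pictograph.
CTRL_OPEN, CTRL_CLOSE = "[[", "]]"
PICTOGRAPH = "\u266a"  # musical note, chosen arbitrarily

PATTERN = re.compile(
    re.escape(CTRL_OPEN) + r"(.*?)" + re.escape(CTRL_CLOSE), re.DOTALL
)


def split_body(body):
    """Return (display_text, pronunciation_strings).

    Each enclosed pronunciation string is removed from the body, and the
    two control markers around it are each replaced by one pictograph.
    """
    strings = PATTERN.findall(body)
    display = PATTERN.sub(PICTOGRAPH * 2, body)
    return display, strings


def build_mail(body, attachment_name="voice.hvs"):
    """Build an e-mail whose pronunciation strings travel as an attached file."""
    display, strings = split_body(body)
    msg = EmailMessage()
    msg.set_content(display)
    if strings:
        msg.add_attachment(
            "\n".join(strings).encode("utf-8"),
            maintype="application",
            subtype="octet-stream",
            filename=attachment_name,
        )
    return msg


if __name__ == "__main__":
    mail = build_mail("Hello [[ka^2wa]] there")
    for part in mail.iter_attachments():
        print(part.get_filename(), part.get_content())
```

Sending the attachment as a generic binary part is a design choice of this sketch only; a real mailer could equally register a dedicated content type for ".hvs" files.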
This application incorporates the content of Japanese Patent Application No. 2003-18891 and claims priority based on it.

Claims (8)

1. A text data receiving apparatus, characterized by comprising:
a receiving device that receives an e-mail via a communication line, the e-mail having attached to it a file containing text data representing a pronunciation character string, the pronunciation character string consisting of a character string and marks specifying the intonation with which the characters are to be voiced;
a storage device that stores the received e-mail with the attached file;
a read-out control device that reads the attached file from the storage device; and
a speech synthesizing device that synthesizes speech with intonation according to the pronunciation character string represented by the text data contained in the attached file that has been read.
2. The text data receiving apparatus according to claim 1, characterized in that each mark gives the desired intonation to the character written immediately before or immediately after it.
3. The text data receiving apparatus according to claim 1, characterized in that, when the user performs an operation to display the e-mail, the read-out control device automatically reads the attached file from the storage device.
4. A text data transmitting apparatus, characterized by comprising:
an input device for inputting, in the body of an e-mail, text data representing a pronunciation character string, the pronunciation character string consisting of a character string and marks specifying the intonation with which the characters are to be voiced;
a file creating device that makes the text data input in the body of the e-mail into an attached file of the e-mail;
a storage device that stores the e-mail with the attached file; and
a transmitting device that transmits, via a communication line, the e-mail with the attached file stored in the storage device.
5. The text data transmitting apparatus according to claim 4, characterized in that the text data representing the pronunciation character string input through the input device is entered by the user enclosed between dedicated control characters.
6. The text data transmitting apparatus according to claim 5, characterized by further comprising a replacement device that detects, in the body of the e-mail, the pronunciation character string enclosed between the dedicated control characters, deletes the pronunciation character string from the body of the e-mail, and replaces the dedicated control characters with general pictographs.
7. A method for receiving text data and producing sound, characterized by:
receiving an e-mail via a communication line, the e-mail having attached to it a file containing text data representing a pronunciation character string, the pronunciation character string consisting of a character string and marks specifying the intonation with which the characters are to be voiced;
storing the received e-mail with the attached file;
reading the attached file from the storage device; and
synthesizing speech with intonation according to the pronunciation character string represented by the text data contained in the attached file that has been read.
8. A text data transmitting method, characterized by:
inputting, in the body of an e-mail, text data representing a pronunciation character string, the pronunciation character string consisting of a character string and marks specifying the intonation with which the characters are to be voiced;
making the text data input in the body of the e-mail into an attached file of the e-mail;
storing the e-mail with the attached file; and
transmitting, via a communication line, the e-mail with the attached file stored in the storage device.
CNB2004100029561A 2003-01-28 2004-01-21 Terminal equipment for executing voice synthesising using phonic recording language Expired - Fee Related CN1267888C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2003018891A JP4042580B2 (en) 2003-01-28 2003-01-28 Terminal device for speech synthesis using pronunciation description language
JP2003018891 2003-01-28

Publications (2)

Publication Number Publication Date
CN1517978A CN1517978A (en) 2004-08-04
CN1267888C true CN1267888C (en) 2006-08-02

Family

ID=32948904

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2004100029561A Expired - Fee Related CN1267888C (en) 2003-01-28 2004-01-21 Terminal equipment for executing voice synthesising using phonic recording language

Country Status (4)

Country Link
JP (1) JP4042580B2 (en)
KR (2) KR20040069270A (en)
CN (1) CN1267888C (en)
HK (1) HK1064786A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9741339B2 (en) * 2013-06-28 2017-08-22 Google Inc. Data driven word pronunciation learning and scoring with crowd sourcing based on the word's phonemes pronunciation scores

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2669601B2 (en) * 1994-11-22 1997-10-29 インターナショナル・ビジネス・マシーンズ・コーポレイション Information retrieval method and system
KR100318573B1 (en) * 1996-10-16 2001-12-28 마찌다 가쯔히꼬 Character input apparatus and storage medium in which character input program is stored
JPH11110912A (en) * 1997-09-30 1999-04-23 Sony Corp Device and method for digital signal recording
JP4362899B2 (en) * 1999-08-18 2009-11-11 カシオ計算機株式会社 Data receiving apparatus and storage medium
JP2002073507A (en) * 2000-06-15 2002-03-12 Sharp Corp Electronic mail system and electronic mail device
KR20000063774A (en) * 2000-08-03 2000-11-06 백종관 Method of Converting Text to Voice Using Text to Speech and System thereof
KR20040076051A (en) * 2003-02-24 2004-08-31 엘지전자 주식회사 character input system and the operating method

Also Published As

Publication number Publication date
CN1517978A (en) 2004-08-04
HK1064786A1 (en) 2005-02-04
KR20060093089A (en) 2006-08-23
JP4042580B2 (en) 2008-02-06
KR20040069270A (en) 2004-08-05
JP2004234096A (en) 2004-08-19
KR100754571B1 (en) 2007-09-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1064786

Country of ref document: HK

C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20060802

Termination date: 20150121

EXPY Termination of patent right or utility model