CN1453990A - Sound information system and method - Google Patents

Publication number
CN1453990A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN02118393.7A
Other languages
Chinese (zh)
Inventor
黄大威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia of America Corp
Original Assignee
Lucent Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lucent Technologies Inc filed Critical Lucent Technologies Inc
Priority to CN02118393.7A priority Critical patent/CN1453990A/en
Priority to US10/134,192 priority patent/US7251314B2/en
Priority claimed from US10/134,192 external-priority patent/US7251314B2/en
Publication of CN1453990A publication Critical patent/CN1453990A/en
Pending legal-status Critical Current

Abstract

In the method, a voice-mail device records the recipient's telephone number and the sender's voice message in a digital file. The telephone number and the voice message are packaged in an email and transmitted over a network to another voice-mail device that acts as a gateway. That device opens the email to obtain the recipient's telephone number and the voice message, dials the recipient's number, and delivers the voice message to the recipient over a telephone network.

Description

Voice message system and method
Technical field
The present invention relates to communication systems and, more specifically, to communication systems that transmit voice messages over a network using telephones and email.
Background technology
Those skilled in the art have developed many communication systems for international or long-distance communication. The first is the traditional telephone system, which uses the public switched telephone network (PSTN). The second, developed more recently, is voice over IP (VoIP), which carries telephone calls over the Internet. The third is ordinary email, which is used by thousands of people every day. The fourth is known as "Fone2web™", also called "Phone2web".
As is known to those skilled in the art, Fone2web™ is a communication system that transmits voice over the Internet and lets telephone users and personal computer users exchange voice messages. The system bridges telephone users and the Internet so that a telephone user can communicate by voice with an email user. Fone2web.com, a company located in Silicon Valley, California, developed the system, and China Telecom uses a Chinese version of it. The system supports phone-to-phone, phone-to-PC, and PC-to-phone communication.
In phone-to-phone communication, the voice message is delivered to another Fone2web™ user over the Internet. After a server receives the message, it forwards it to the server nearest the recipient, and the message is stored in the recipient's Fone2web™ voice mailbox. The system can also send a voice message by phone to anyone in the world who has an ordinary email account. The voice message is delivered to the recipient's email; the recipient sees an icon for the message and clicks it, whereupon the "phone-to-web VE mailer™" opens and plays the message. A voice message can also be sent to any email address in the world, broadcast to everyone on a distribution list, and stored.
Ordinary international or long-distance telephony gives thousands of daily users convenient, high-quality two-way communication. It remains expensive, however, because it uses circuit switching and requires costly digital switches and local loop carriers. The other system, VoIP, uses less expensive packet switching but needs costly equipment, which in some cases makes it expensive as well. Its voice quality is also poorer than that of a standard international or long-distance call.
As is known to those skilled in the art, email has become one of the most successful applications on the Internet. Compared with surface mail, it is cheaper and faster. An email user needs only a computer connected to the Internet in some way. Email communication nevertheless usually still requires a computer.
In recent years, much research has been devoted to sending and receiving email from cell phones or personal digital assistants (PDAs) to give users more mobility. For mobile users, however, this is still less convenient than sending a voice message, especially because entering Chinese characters on a cell phone or PDA is very difficult. Although Fone2web™ was recently developed to carry voice messages over the Internet, it remains expensive, and the transmitted voice message must be stored in a voice mailbox. The recipient has to call the voice mailbox to retrieve and reply to the message, which some users find inconvenient and inefficient.
Summary of the invention
Accordingly, one object of the present invention is to provide a system and method that overcome the deficiencies of the prior art and let a user send an international or long-distance voice message to a recipient over the Internet without directly using a computer.
The system and method of the present invention provide users with voice-mail devices that let them send international or long-distance voice messages to recipients over the Internet without directly operating a computer. The system provides voice communication that is inexpensive, convenient, and mobile, with high voice quality. Like email and the Fone2web™ communication system, the system and method of the present invention provide one-way communication.
The present invention combines phone-email-phone communication. The sender and the recipient each use an ordinary telephone call over a local connection to a voice-mail device. In one aspect of the invention, the voice message is transferred by email between two voice-mail devices of the invention.
One method of the present invention sends a sender's voice message to a recipient over a network. A voice-mail device receives a telephone call from the sender and records the recipient's telephone number and the sender's voice message in a digital file. The voice message and the telephone number are encapsulated in an email, which is sent over the network from this voice-mail device to another voice-mail device. The second device opens the email, obtains the recipient's telephone number and the voice message, dials the recipient's number, and delivers the voice message to the recipient over a telephone network.
In another aspect of the invention, the user can call the voice-mail device over the public switched telephone network or a cellular radio network. The voice-mail device can comprise a telephone switch, a cellular base station, or a radio network controller. The voice message can be delivered to the recipient over the public switched telephone network or a cellular radio network. The voice message sent to the recipient can be transmitted as digital voice, and the email can be carried over a network such as the Internet using the Simple Mail Transfer Protocol. The voice-mail device can record the voice message as a WAV file.
In another aspect of the invention, whether any one-way voice message is sent is decided according to the near-term channel capacity predicted by a queuing-theory model of the wireless network system. The voice-mail device performs the following tasks: (a) playing prompts and receiving dialed tones; (b) recording and playing voice messages; (c) looking up email addresses; and (d) sending and receiving email. The voice-mail device compresses the voice message using recursive least squares and run-length coding.
The invention also discloses a system for sending a sender's voice message to a recipient over a network. The system comprises a voice-mail device that serves as a gateway in the network and receives the sender's voice message and a telephone number over a telephone network. The voice-mail device performs the following tasks: (a) prompting for and receiving dialed tones; (b) recording and playing voice messages; (c) looking up email addresses; and (d) sending and receiving email so that the message reaches the recipient over the Internet and the telephone network.
Description of drawings
The present invention is described below with reference to the accompanying drawings, from which its objects, features, and advantages will become apparent. In the drawings:
Fig. 1 shows, for a first embodiment of the invention, how a landline telephone or mobile phone works with a voice-mail device, and how email is sent over the Internet using SMTP;
Fig. 2 is a high-level block diagram of a voice-mail device that can be used in the invention;
Fig. 3 is a high-level flowchart of the method of the invention using the system shown in Fig. 1;
Fig. 4 shows another embodiment of the invention, similar to Fig. 1 but showing the gateway role between the mobile station and the base station;
Fig. 5 is another block diagram, showing how a mail transfer agent connected to the Internet and a gateway converts text to speech in the invention.
Embodiment
The present invention is described in more detail below with reference to the accompanying drawings, which show preferred embodiments of the invention. The invention may, however, be embodied in many different forms and is not limited to the embodiments shown here. Rather, these embodiments are provided so that the disclosure will be thorough and complete and will fully convey the invention to those skilled in the art. Like numerals in Figs. 1 to 5 refer to like elements throughout this application.
The invention provides low-cost long-distance and international communication with mobility and convenience, and without the need to type characters as in email communication. The invention also supports cell phone roaming without registration. It has advantages over conventional systems, including ordinary international or long-distance telephony, voice over IP, email, and Fone2web™ communication.
The present invention carries voice messages over the Internet, which costs less than the switched circuits commonly used in ordinary telephony. It can also achieve higher voice quality than Internet telephony because it uses international-standard μ-law (mu-law) coding.
As is known to those skilled in the art, μ-law coding encodes a voice signal using non-uniform quantization. Statistically, the signal is more likely to lie at low levels than at high levels, so more quantization points are placed near low levels than near high levels. In most μ-law systems, linear samples of 14 to 16 bits are companded to 8 bits. This coding is commonly used in telephone-quality codecs, for example the Sun SPARCstation audio codec.
The name μ-law comes from the perceptual curves applied in studies of auditory perception. It is a nonlinear pulse-code modulation in the logarithmic domain, and the noise it adds is proportional to signal strength. The .au file format designed for Sun SPARCstations is the most common application of μ-law coding. For example, 8-bit μ-law coding reduces one channel of CD audio to about 350 kbps.
As is known to those skilled in the art, the pulse-code modulation standard is ITU-T G.711, in which a level is assigned to each sample every 1/8,000 second. Only 8 bits are sent per encoded sample, so only 256 different levels can be encoded. The resulting channel rate is 64 kbps. Those skilled in the art know that the quality of an 8-bit μ-law signal is equivalent to that of 12-bit PCM. Because speech perception is more sensitive to changes at low amplitudes than at high amplitudes, more bits are used at low amplitudes and fewer at high amplitudes. This non-uniform quantization can be realized in several ways, such as μ-law and A-law coding.
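The non-uniform quantization described above can be illustrated with a minimal Python sketch. It assumes the textbook continuous μ-law curve with μ = 255 and a plain 256-level quantizer; it is not the bit-exact segmented encoder of G.711, and the function names are illustrative only.

```python
import math

MU = 255  # companding constant conventionally used with 8-bit mu-law

def mulaw_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto [-1, 1] with the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def encode_8bit(x: float) -> int:
    """Compand, then quantize uniformly to one of 256 codes (0..255)."""
    y = mulaw_compress(max(-1.0, min(1.0, x)))
    return min(255, int((y + 1.0) / 2.0 * 256))

def decode_8bit(code: int) -> float:
    """Reconstruct the sample at the centre of the code's interval."""
    y = (code + 0.5) / 256 * 2.0 - 1.0
    return mulaw_expand(y)

# 8,000 samples per second at 8 bits each gives the 64 kbps channel rate.
CHANNEL_RATE_BPS = 8000 * 8
```

Because the quantizer is uniform in the companded domain, the reconstruction levels are packed closely around zero and spread out at high amplitudes, which is the behaviour the text attributes to μ-law and A-law coding.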
Many encoders use a time-to-frequency mapping, for example a filter bank or a fast Fourier transform (FFT), to decompose the input signal into subband signals. A psychoacoustic model considers the subbands and the original signal and uses psychoacoustic information to determine the masking threshold. Each subband sample is quantized and encoded so that the quantization noise stays below the masking threshold. The quantized samples are packed into frames to be interpreted by the decoder. No psychoacoustic model is needed for decoding: the frames are unpacked and the subband samples are decoded. A frequency-to-time mapping then converts them into a simple voice output signal.
In the present invention, the sound is played only after the WAV file has been received, so no delay jitter occurs, and errors rarely appear even under congestion. The invention has an advantage over email systems because it does not require direct use of a computer or the entry of characters, which is very difficult for mobile phone users, especially those in China. It also has an advantage over phone-to-web communication systems because it provides phone-to-phone communication without the need to "retrieve" messages from a mailbox.
The present invention uses a voice-mail device that lets a user send a message to a recipient over the Internet, even over international distances, without directly using a computer. Throughout this specification, "voice-mail device" describes the various processors and components that let the user send voice-mail messages in the invention. Like email and phone-to-web communication systems, the communication method of the invention is one-way. It combines phone-email-phone communication, in which the sender and the recipient each use a standard telephone call to connect to a local voice-mail device. The voice-mail device can be part of a telephone switch or can be incorporated into the public switched telephone network.
According to a first aspect of the invention, shown in Fig. 1, reference numeral 10 denotes the communication system. Messages are transferred by email between two voice-mail devices. A landline telephone or cell phone connects through the local PSTN 15 to a voice-mail gateway that provides the voice-mail device functions. The recipient's telephone number and the voice message are recorded in a digital file, and the voice content and the recipient's telephone number are then encapsulated in an email. The email is sent over the Internet 16 using the Simple Mail Transfer Protocol (SMTP) to the far end of the international or long-distance link, where the voice-mail device in voice-mail gateway 18 receives it and opens it to read the recipient's telephone number and the voice message. The device then dials the recipient's telephone number through the local telephone network 20 to reach the recipient's landline telephone or cell phone 22 and delivers the voice message.
As is known to those skilled in the art, the Simple Mail Transfer Protocol is a protocol built on TCP/IP for sending email between servers. Most email systems send email over the Internet using SMTP and receive it through a POP3 (Post Office Protocol version 3) server, from which the receiving client retrieves messages. Both the POP3 server and the SMTP server usually have to be specified when an email client application is configured.
As shown in the non-limiting example of Fig. 2, a voice-mail device 14 comprises a computer processor 30, which includes the associated computing circuitry, and four main software modules 32, 34, 36, 38. The techniques for developing these software modules are well known. The first software module 32 is dialogue software responsible for playing prompts and receiving dialed tones. The second software module 34 records and plays voice messages, which are compressed using the codec method of the invention. The third software module 36 contains a lookup table 36a from which an email address can be found according to the recipient's country and area code. The fourth software module 38 is responsible for sending and receiving email using the SMTP and POP3 protocols.
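A lookup table such as 36a could be sketched as follows. This is only an illustration: the patent does not specify the table's layout, so the dictionary contents, the `example.com` gateway addresses, and the longest-prefix matching rule are all assumptions.

```python
# Hypothetical routing table: country/area-code prefix -> gateway email address.
GATEWAY_BY_PREFIX = {
    "1": "gateway-us@example.com",
    "86": "gateway-cn@example.com",
    "8610": "gateway-beijing@example.com",
    "852": "gateway-hk@example.com",
}

def find_gateway(phone_number: str) -> str:
    """Return the gateway address for the longest matching dialing prefix."""
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    # Try the longest prefix first so an area-code entry wins over a country entry.
    for length in range(len(digits), 0, -1):
        address = GATEWAY_BY_PREFIX.get(digits[:length])
        if address is not None:
            return address
    raise LookupError(f"no gateway configured for {phone_number!r}")
```

Matching the longest prefix lets a city-level gateway take precedence over a country-level one, which keeps the final telephone leg of the delivery as local as possible.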
A sound card or voice modem 40 in the computer detects, from the dialed tones, the country and area code of the called telephone. It converts analog voice to a digital signal and plays digital audio signals back. A local area network card or data modem 42 connects the voice-mail device to an email user agent. Through the sound card or voice modem, the voice-mail device is connected to a standard telephone switch 44. The voice-mail device is assigned a special service number and an email address.
The basic operation of the method of the invention is now described with reference to Fig. 3. At the sending end, a call to a special service number is routed by switch 44 to voice-mail device 14 (block 50). The user (sender) 12 dials the special service number and then enters a dialogue with the voice-mail device (block 52). The first software module 32 in the voice-mail device asks the sender to key in the recipient's telephone number, including country and area codes (block 54). Then, after a prompt tone, recording of the sender's voice message begins (block 56). The message is converted to digital form by the sound card/modem 40 and, after processing, saved as a WAV file by the second software module 34 (block 58). The system first filters the sound to increase the compression ratio (block 60). The sound is then compressed according to the international-standard μ-law. Lossless compression based on recursive least squares is explained in detail below.
The third software module 36 finds the email address corresponding to the recipient 22 (block 62). The fourth software module 38 encapsulates the recipient's telephone number in an email and attaches the voice-message WAV file (block 64). It then sends the email using SMTP.
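The encapsulation step of block 64 might look like the following sketch, which uses Python's standard `email` package. The patent does not say where in the email the telephone number is carried, so the `X-Recipient-Phone` header, the subject line, and the attachment filename are hypothetical choices.

```python
from email.message import EmailMessage

def build_voice_email(sender_gateway: str, recipient_gateway: str,
                      recipient_phone: str, wav_bytes: bytes) -> EmailMessage:
    """Package the recipient's phone number and the recorded WAV data in one email."""
    msg = EmailMessage()
    msg["From"] = sender_gateway
    msg["To"] = recipient_gateway
    msg["Subject"] = "Voice message"
    # Hypothetical custom header carrying the number the far gateway must dial.
    msg["X-Recipient-Phone"] = recipient_phone
    msg.set_content("Voice message attached.")
    msg.add_attachment(wav_bytes, maintype="audio", subtype="wav",
                       filename="message.wav")
    return msg

# Sending would then use the standard library SMTP client, for example:
#   import smtplib
#   with smtplib.SMTP("smtp.example.com") as server:   # hypothetical host
#       server.send_message(msg)
```

Carrying the audio as a MIME attachment keeps the message compatible with ordinary mail servers, so no change to the existing email infrastructure is needed between the two gateways.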
At the receiving end, voice-mail device 18 receives the email from the Internet through the local area network card or data modem 42 (block 66). The fourth software module 38 extracts the recipient's telephone number and saves the voice-message WAV file (block 68). The second software module 34 decompresses the WAV file (block 70). The first software module 32 controls the sound card/modem 40 to dial the recipient's telephone number (block 72). After the recipient 22 answers, the voice-mail device plays the voice message through the sound card/modem 40 (block 74).
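The extraction step of block 68 can be sketched with the same standard `email` package. As an assumption for illustration, the telephone number is read from a hypothetical `X-Recipient-Phone` header and the voice data from the first audio attachment; dialing and playback are outside the sketch.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import Tuple

def open_voice_email(raw: bytes) -> Tuple[str, bytes]:
    """Return (recipient phone number, WAV bytes) from a received voice email."""
    msg = message_from_bytes(raw, policy=policy.default)
    phone = msg["X-Recipient-Phone"]  # hypothetical header, see lead-in
    if phone is None:
        raise ValueError("email carries no recipient telephone number")
    for part in msg.iter_attachments():
        if part.get_content_maintype() == "audio":
            return str(phone), part.get_content()
    raise ValueError("email carries no audio attachment")

def sample_raw_email() -> bytes:
    """Build a small message in the assumed layout, for trying the parser."""
    msg = EmailMessage()
    msg["From"] = "gw-a@example.com"
    msg["To"] = "gw-b@example.com"
    msg["X-Recipient-Phone"] = "+12125550100"
    msg.set_content("Voice message attached.")
    msg.add_attachment(b"RIFFdemo", maintype="audio", subtype="wav",
                       filename="message.wav")
    return msg.as_bytes()
```

Calling `open_voice_email(sample_raw_email())` yields the number to dial and the audio bytes to decompress and play.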
A system using the voice-mail devices 14, 18 of the invention can also include the following functions: (a) passwords to protect the security and copyright of voice messages; (b) calling the recipient at a requested time, to address the time difference between sender and recipient; and (c) multicasting a voice message to multiple recipients.
The recursive least squares algorithm used in the present invention is described in detail in the article "Lossless compression for μ-law (A-law) and IMA ADPCM on the basis of a fast RLS algorithm" by Huang Dawei. The related material that follows is drawn from that article.
Recursive least squares and run-length coding make lossless compression possible under the μ-law (A-law) standard. Lossless compression means that, for a given original input, the reproduced output signal is exactly the same as with the standard coding, while the bit rate of the compressed file is reduced. To preserve signal quality, the same quantization method as in the μ-law (A-law) standard is used, and prediction and entropy coding are used to reduce the bit rate. The prediction is based on a fast recursive least squares (RLS) algorithm that requires less computation than existing RLS algorithms. The entropy coding is based on Huffman coding, and prediction, quantization, and coding are combined into one adaptive scheme. The bit rate per sample is reduced while the same quality is guaranteed. Compared with 8 bits per sample for μ-law or A-law, audio signals sampled at 44,100 Hz can be encoded with only 3.24 bits per sample, and speech/audio signals sampled at 11,025 Hz with 4.72 bits per sample. Some improvement is also obtained for the IMA ADPCM standard.
By way of background, μ-law and A-law were introduced into telephone communication in 1972 in ITU (CCITT) Recommendation G.711. As mentioned above, the coding standard converts digital speech/audio signals of 16 bits per sample to 8 bits per sample according to a logarithmic formula. μ-law (A-law) is used as the international standard for international and long-distance telephony.
Those skilled in the art know that current compression techniques use three methods: prediction, quantization, and entropy coding. In μ-law and A-law, a logarithmic formula gives the quantization steps. The logarithmic law takes human perceptual characteristics into account and converts the approximately Laplacian distribution of the original signal (which has a peak at zero) into a flatter distribution, so the uniform coding scheme (8 bits for 256 states) is close to the optimal entropy code. The RLS algorithm has long been known to those skilled in the art to outperform simpler prediction methods such as least mean squares (LMS), but it is less widely used than LMS because it requires more computation. The present invention greatly reduces the computation of the RLS algorithm.
The recursive least squares scheme combines prediction, quantization, and entropy coding. It uses the same quantization method as the standard and reduces the bit rate with more sophisticated prediction and entropy coding. Unlike the lossless compression methods used with conventional PCM coding, the scheme is more adaptive. In the prior art, various methods predict over blocks of 256 samples: the coefficients of an AR model are computed from the data in a block and are fixed for that block, and both the coefficients and the residuals are encoded for storage and transmission.
In the recursive least squares algorithm used in the present invention, the residual is computed with the previous predictor. The predictor is updated from the data observed so far, so the prediction coefficients need not be encoded. As long as the past signal is identical at the decoder and the encoder, the system keeps the future signal and the future predictor synchronized at both ends. The advantage of this method is that the predictor can use many coefficients without increasing the code length. Because voice signals are strongly correlated, especially CD-quality signals sampled at 44,100 Hz, a long regression improves prediction efficiency and reduces the bit rate needed to encode the residual.
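The backward-adaptive idea can be sketched in a few lines of Python. This sketch uses the ordinary textbook RLS recursion, not the fast RLS algorithm of the cited article, and it leaves out the quantization and Huffman stages; the predictor order, forgetting factor, and initial scale are assumed values. The point it illustrates is that the encoder and the decoder update identical predictors from past samples only, so just the integer residuals need to be transmitted.

```python
import math

class RLSPredictor:
    """Linear predictor adapted sample by sample with ordinary RLS (assumed parameters)."""

    def __init__(self, order: int = 4, forgetting: float = 0.999, delta: float = 1000.0):
        self.n = order
        self.lam = forgetting
        self.w = [0.0] * order
        self.P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
        self.hist = [0.0] * order  # most recent sample first

    def predict(self) -> int:
        return round(sum(w * x for w, x in zip(self.w, self.hist)))

    def update(self, sample: int) -> None:
        n, x = self.n, self.hist
        Px = [sum(self.P[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = self.lam + sum(x[i] * Px[i] for i in range(n))
        gain = [v / denom for v in Px]
        err = sample - sum(w * xi for w, xi in zip(self.w, x))
        self.w = [w + g * err for w, g in zip(self.w, gain)]
        # P <- (P - gain * x^T P) / lambda, using the symmetry of P
        self.P = [[(self.P[i][j] - gain[i] * Px[j]) / self.lam for j in range(n)]
                  for i in range(n)]
        self.hist = [float(sample)] + x[:-1]

def encode(samples):
    """Return the integer residuals; no predictor coefficients are stored."""
    pred, residuals = RLSPredictor(), []
    for s in samples:
        residuals.append(s - pred.predict())
        pred.update(s)
    return residuals

def decode(residuals):
    """Rebuild the samples by running the same predictor on the decoded past."""
    pred, out = RLSPredictor(), []
    for r in residuals:
        s = r + pred.predict()
        out.append(s)
        pred.update(s)
    return out

demo_signal = [round(1000 * math.sin(0.1 * i)) for i in range(200)]
demo_residuals = encode(demo_signal)
```

Because the residuals of a well-predicted signal cluster near zero, an entropy coder such as Huffman coding can represent them with fewer bits than the original samples, which is where the bit-rate reduction described above would come from.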
Programs for the Windows platform can be built on these algorithms. Speech/audio signals can be encoded and decoded online, and 44,100 samples per second (the CD standard) can be encoded and decoded in real time on a notebook computer with a 233 MHz Pentium CPU.
The compression programs for μ-law, A-law, and IMA ADPCM, which serve as an engine on the Windows platform, can be written in Visual C++. These programs can load, play, encode, and decode audio, and can copy it in WAV format. The following table shows the results for different samples.
Audio samples   Hz       Size (kBytes)   Bits per sample (μ-law)   Bits per sample (ADPCM4)
1               44,100   21,189          2.7                       2.59
2               44,100   6,306           3.09                      2.69
3               44,100   7,529           3.27                      2.73
4               44,100   14,077          3.33                      3.04
5               44,100   23,224          3.35                      2.74
6               44,100   12,536          3.48                      2.69
7               44,100   18,907          3.53                      2.79
Average                                  3.24                      2.75
Speech/audio samples
1               11,025   6,482           4.36                      3.08
2               11,025   12,942          4.63                      3.19
3               11,025   12,920          4.7                       3.22
4               11,025   12,920          4.74                      3.15
5               11,025   38,007          4.82                      3.17
Average                                  4.72                      3.17
The audio samples were taken from music CDs (symphonies, songs, and so on). The speech/audio samples, some with background music, were downloaded from the Internet and had been broadcast by the Voice of America.
The table shows that audio compresses better than speech, because the audio samples are sampled at 44,100 Hz. Their prediction looks only 1/44,100 second ahead, which is easier than at the 11,025 Hz of the speech samples. In addition, plain speech compresses more easily than speech with background music. The compression ratio for μ-law is higher than for IMA ADPCM, possibly because a signal reproduced from μ-law is easier to predict with RLS than a signal reproduced from IMA ADPCM.
With the signal processing techniques available today, the compression ratio of established international standards for speech/audio compression can be improved without any loss in sound quality.
As mentioned above, the present invention uses voice-mail devices 14, 18 and four software modules 32, 34, 36, 38 to provide one-way communication, giving the service provider a high return on investment (ROI). The invention extends email, the most successful application on the Internet, to widely used telephone terminals, including cell phones and landline telephones, so that more people can use the system without a computer. The invention is especially suited to China and Far East markets.
Compared with long-distance, international, and Internet telephony, the system of the invention does not require a large investment. The cost of one voice mail can be 1/5 to 1/10 that of an Internet telephone call and 1/10 to 1/30 that of an international call at China Telecom's standard market rate.
The table below lists current charges in China for long-distance, international, and Internet telephone calls and makes the price advantage of the invention apparent.
Long-distance call within mainland China             0.7 yuan/minute
VoIP* long-distance call within mainland China       0.3 yuan/minute
Long-distance call to Hong Kong, Macao, Taiwan       2.0 yuan/minute
VoIP long-distance call within mainland China        1.5 yuan/minute
International call to the U.S. or Canada             8.0 yuan/minute
VoIP international call to the U.S. or Canada        2.4 yuan/minute
International call to other countries                8.0 yuan/minute
VoIP international call to other countries           3.2 yuan/minute
Local mobile phone call                              0.4 yuan/minute
Roaming mobile phone call                            0.6 yuan/minute
Long-distance/international mobile phone call        Above charge plus the corresponding landline charge
*VoIP charges do not include the local call charge, which is 0.18 yuan for the first 3 minutes and 0.11 yuan per minute thereafter. Exchange rate: 1 U.S. dollar = 8.27 yuan.
Compared with text email, the system of the invention avoids character entry. This greatly benefits users in China, because Chinese character input methods are very complicated.
Another advantage of the invention is that it makes roaming easy: a roaming cell phone does not need to register with a home agent and a foreign agent. Compared with VoIP, it provides high-quality voice. When the system of the invention delivers a voice message, no pauses or delays occur, and the voice quality equals that of international-standard μ-law coding.
The system of the invention can also increase the utilization of wireless channels. A service provider can allocate spare channels to voice email when wireless channels have spare capacity, which increases channel utilization. Text email arriving for a local user can also be delivered to a cell phone.
As described above, the invention uses ordinary telephone terminals to provide low-cost international and long-distance one-way voice messaging. According to the first aspect of the invention, no substantial change to the existing communication system is needed. Software is installed on gateways, for example personal computers and notebooks with voice modems located in different countries and cities.
According to a second aspect of the invention, one-way voice email can be allocated to the two-way random-access channels of mobile telephony when these expensive wireless resources have spare capacity. A mobile service provider can usually use only about 50% of channel capacity, because existing communication systems are all two-way and real-time communication requires random access without delay. The forthcoming third-generation mobile communication equipment is even more expensive. One-way voice messages can be transmitted when wireless channels have spare capacity, that is, with lower priority than ordinary wireless communication. With this inexpensive but very convenient service, a service provider can attract large numbers of local telephone users and international or long-distance telephone users.
The basic idea of carrying voice messages over IP is to create a one-way communication that replaces the complicated and/or uncontrollable two-way communication in a packet-switched system. This idea can be extended to other areas. It can play an important role in the forthcoming Multimedia Messaging Service (MMS) (see, for example, www.nokia.com). Examples include adding voice messaging to current and next-generation mobile telephone systems to make full use of channel capacity, and forwarding the content of an email to an office computer, a home telephone, or a mobile phone.
Fig. 4 is a block diagram illustrating the second aspect of the invention. In the figure, "MS" denotes a mobile station, for example a cellular phone or other wireless telephone (wireless PDA). "BS" denotes a base station in a GSM network, or a base station of another network as suggested by those skilled in the art. "RNC" denotes a radio network center in GPRS/UMTS. "RTS" denotes request to send and "CTS" denotes clear to send. M/M/B is a Markovian queuing model in which both the inter-arrival times and the service times are exponentially distributed and the maximum channel capacity is B. Based on the future channel capacity predicted by this queuing model of the wireless network system, the system decides whether to receive and/or send any voice message, normally a one-way voice message.
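One way to turn such a queuing model into a send/do-not-send decision is sketched below. It assumes that M/M/B is read as a loss system with B parallel channels and no waiting room, so that the blocking probability is given by the standard Erlang B formula. The admission threshold `max_blocking` and the function names are illustrative assumptions, not taken from the source.

```python
def erlang_b(offered_load, channels):
    """Blocking probability of a B-channel loss system (Erlang B recursion)."""
    blocking = 1.0  # with zero channels every arrival is blocked
    for k in range(1, channels + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def admit_voice_mail(arrival_rate, mean_service_time, channels,
                     max_blocking=0.01):
    """Hypothetical policy: forward low-priority one-way voice mail only
    while the predicted blocking of ordinary calls stays below a threshold."""
    offered_load = arrival_rate * mean_service_time  # in Erlangs
    return erlang_b(offered_load, channels) <= max_blocking
```

For example, with an offered load of 2 Erlangs on 2 channels the predicted blocking is 0.4, so a gateway using a 1% threshold would hold the voice message back until the load drops.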
As shown, the sender, mobile station 80, holds a voice e-mail that has been compressed or encoded. It sends a request-to-send signal and, after receiving a clear-to-send signal, transmits the digital speech to the sender's gateway 82, which can be a base station or a radio network center having the functions of the voice-mail device. Gateway 82 decides whether to grant the request and whether to send the signal by evaluating the M/M/B model and checking whether spare channel capacity remains. The message is sent over the Internet 84 to the recipient's gateway 86 which, as known to those skilled in the art, can also be a base station or a radio network center having the functions of the voice-mail device. Gateway 86 likewise decides whether to deliver the message by evaluating the M/M/B model and checking whether spare channel capacity remains. The digital speech, decompressed or decoded, is then delivered as a voice e-mail to the recipient's mobile station or mobile phone 88.
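The request-to-send / clear-to-send exchange at a gateway can be sketched as a small state machine. This is only an illustration under stated assumptions: the `"WAIT"` reply, the `reserved_for_calls` margin and the class name are hypothetical, and the capacity check is a simple channel count rather than the M/M/B prediction.

```python
from collections import deque


class Gateway:
    """Grants low-priority voice-mail transfers only on spare channels."""

    def __init__(self, capacity, reserved_for_calls=1):
        self.capacity = capacity            # B, the maximum channel capacity
        self.reserved = reserved_for_calls  # channels kept free for live calls
        self.busy = 0
        self.deferred = deque()

    def spare(self):
        return self.capacity - self.busy - self.reserved

    def request_to_send(self, message_id):
        """Answer an RTS: 'CTS' if a spare channel exists, else defer."""
        if self.spare() > 0:
            self.busy += 1
            return "CTS"
        self.deferred.append(message_id)
        return "WAIT"

    def release(self):
        """Free one channel; return the deferred message now granted, if any."""
        self.busy -= 1
        if self.deferred and self.spare() > 0:
            self.busy += 1
            return self.deferred.popleft()
        return None
```

Deferring instead of rejecting reflects the lower priority of one-way messages: they tolerate delay, so they can wait for the channel to become idle.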
As known to those skilled in the art, the arrival process into a given queue can be Markovian, a steady state can be assumed, and non-first-in-first-out queue disciplines can be ignored. Some of the arrival processes can be Poisson with rate delta, and the transmission times of the different transmissions are exponentially distributed with a set mean.
Fig. 5 shows a third aspect of the invention, which provides the following service: users pick up their e-mail from a local user and either listen to it after conversion by a text-to-speech converter or view it as a cell phone short message (SMS). Although some 2.5G and 3G mobile phones can pick up e-mail from a local user, doing so is very expensive. Moreover, GSM/IS-95 systems will remain in operation for a long time, and fixed-line telephones are still inexpensive and widely used. The present invention can use any ordinary telephone, including cell phones and fixed-line telephones, to receive e-mail by means of the text-to-speech (TTS) technology developed by the Bell Laboratories speech modeling group. TTS technology can also be used in many other services, such as active mobile networks, and e-mails that are not too long can be sent as cell phone short messages.
As shown in Fig. 5, a mail transfer agent 90 forwards a text e-mail over the Internet 92. Gateway 94, which has the functions of the voice-mail device, decides whether to deliver the message according to (a) the channel capacity and (b) the M/M/B model. A processor in the gateway converts the text to speech using TTS technology. The digital speech, decompressed or decoded, is then delivered to the mobile station or mobile phone 96 as a voice e-mail, or the text is delivered in SMS form.
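The choice between short-message delivery and spoken delivery can be sketched as below. The 160-character limit is the size of a single SMS in the GSM 7-bit alphabet and is used here as an assumed threshold; the source only says that e-mails that are "not too long" can go out as short messages. The function and parameter names are illustrative.

```python
SMS_MAX_CHARS = 160  # single SMS, GSM 7-bit alphabet; assumed cut-off


def choose_delivery(email_body, handset_supports_sms):
    """Return ('sms', text) for short mails, otherwise ('tts', text)."""
    text = " ".join(email_body.split())  # collapse line breaks and runs of spaces
    if handset_supports_sms and len(text) <= SMS_MAX_CHARS:
        return ("sms", text)
    return ("tts", text)  # to be converted to speech by the gateway
```

A fixed-line telephone would always take the `tts` branch, which matches the point that any ordinary telephone can be used to receive e-mail.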
The basic algorithms of the text-to-speech system can be found in U.S. Patent Nos. 5,751,907 to Moebius et al.; 5,790,978 to Olive et al.; and 6,272,464 to Kiraz et al., all of which are assigned to Lucent Technologies. The related material in the text that follows is drawn from these patents.
As is well known, e-mail is one of the most challenging text formats for a text-to-speech synthesis system, because plain text is often mixed with other material such as tables, lists, or pictures drawn character by character from the keyboard (ASCII art), and there are few regular devices that mark the boundaries of these regions. E-mail text can also contain various kinds of embedded messages, forwarded material and headers, all of which should be detected and rendered in a way that helps the listener follow the message. In addition, some trade names and electronic addresses are not handled correctly by TTS systems.
In the system developed by Lucent Technologies, text-to-speech (TTS) conversion is divided into three main tasks: linguistic (text) analysis, prosodic modeling, and speech synthesis. Speech synthesis takes a linguistic representation, for example a string of phonetic symbols annotated with syntactic, intonational and emphasis information, and converts it into artificial, machine-generated speech using a suitable synthesis method. The text analysis module computes this linguistic representation from the written text.
The TTS system architecture functions as a multilingual synthesizer. Systems are currently available for English, French, Spanish, Italian, German, Russian, Romanian, Chinese and Japanese. The system is multilingual because the underlying software for linguistic analysis and speech synthesis is identical for all languages except English. Some language-specific information is necessary: each language has its own acoustic inventory and some special rules for linguistic analysis. These data, however, are kept in external tables and parameter files that the TTS engine loads at run time. In applications such as dialogue or e-mail reading, the voice and the language can therefore be switched as desired at run time.
The multilingual nature of the system can be compared to a word processor that supplies language-specific fonts and lets the user edit documents in almost any language. For text formatting and output, the same basic principles and options apply regardless of the language being processed. The unified software architecture of the multilingual text-to-speech synthesizer makes it easier to extend the system to new languages, and its modular structure makes it easier to integrate improved components into the existing system.
Those skilled in the art know that some languages, such as Chinese, do not mark word boundaries, so the system must reconstruct them. Other languages, such as Russian, have a highly complex morphology that must be handled properly, because in Russian the morphology affects where the stress falls. In addition, different languages usually use different writing systems: for Chinese the system must process Chinese character text, and for Russian it must process strings of Cyrillic characters. A more abstract view is nevertheless possible: every linguistic analysis problem can be regarded as a conversion from one string of symbols (for example, orthographic characters) to another string of symbols (for example, an annotated linguistic analysis).
Such string conversions can be modeled computationally with finite-state transducers (FSTs). An FST can be characterized as an abstract device with a finite number of states. Each state moves to other states according to a table that determines which input symbol is processed; the table also determines the output symbol. Linguistic descriptions, such as pronunciation rules, are specified by experts and can be compiled automatically into FSTs. In weighted FSTs (WFSTs), weights (or costs) are attached to the alternatives, which makes it possible to rank the alternative analyses and choose the best one among them.
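A table-driven transducer of the kind described can be sketched in a few lines. The toy rule below ("ph" is rewritten as "f", everything else is copied) and the five-letter alphabet are invented for illustration and are not taken from the cited system; the point is only that one table fixes both the next state and the emitted output for each input symbol.

```python
ALPHABET = "phone"  # toy alphabet: p, h, o, n, e


def build_table():
    """Map (state, input symbol) -> (next state, output string)."""
    table = {}
    for ch in ALPHABET:
        table[("q0", ch)] = ("q0", ch)        # default: copy the symbol
        table[("qp", ch)] = ("q0", "p" + ch)  # a pending 'p' was a plain 'p'
    table[("q0", "p")] = ("qp", "")           # hold 'p' until the next symbol
    table[("qp", "h")] = ("q0", "f")          # 'p' followed by 'h' becomes 'f'
    table[("qp", "p")] = ("qp", "p")          # emit the first 'p', hold the second
    return table


FINAL_OUTPUT = {"q0": "", "qp": "p"}  # flush a pending 'p' at end of input


def transduce(text, table=None):
    table = table or build_table()
    state, out = "q0", []
    for ch in text:
        state, emitted = table[(state, ch)]
        out.append(emitted)
    out.append(FINAL_OUTPUT[state])
    return "".join(out)
```

A weighted transducer would additionally attach a cost to each table entry and keep the lowest-cost path when several outputs are possible.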
A so-called prosodic model is also used, in which a segmental duration module assigns a duration to each phonetic segment. Given the string of segments to be synthesized, each segment is tagged with a feature vector containing a series of factors, such as the identity of the segment, syllable stress, accent status, segmental context, or position within a phrase. An important requirement is that these factors can be computed from the text. Constructing the duration model involves two phases: statistical analysis of a speech corpus, and parameter fitting. The system uses a quantitative duration model that is a special case of the "sums-of-products" model, whose parameters are fitted to a segmented speech database. This approach uses statistical techniques to deal with confounded factors and factor levels, and with the problem of sparse data. The analyses yield a complex pattern of duration characteristics for each specific speaker.
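The general shape of a sums-of-products duration model can be sketched as follows: the duration is a sum of terms, and each term is a product of per-factor scale values looked up from the segment's feature vector. The factor names and all numeric values below are made up for illustration; in the described system they would be fitted to a segmented speech database.

```python
from math import prod


def duration_ms(features, product_terms):
    """Sum over terms of the product of one scale value per factor in the term."""
    return sum(
        prod(scales[factor][features[factor]] for factor in scales)
        for scales in product_terms
    )


# Hypothetical parameters: two product terms over three factors.
TERMS = [
    {"segment": {"a": 80.0, "t": 50.0},
     "stress": {"stressed": 1.25, "unstressed": 1.0}},
    {"phrase_position": {"final": 30.0, "medial": 0.0}},
]

example = duration_ms(
    {"segment": "a", "stress": "stressed", "phrase_position": "final"}, TERMS
)
```

Here the first term scales an intrinsic vowel duration by stress, and the second adds phrase-final lengthening, giving 80 × 1.25 + 30 = 130 ms for the example segment.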
The intonation module computes a fundamental frequency (F0) contour by adding three types of time-dependent curves: a phrase curve, which depends on the phrase type, for example declarative versus interrogative; accent curves, one for each accent group (often depending on the positions of stressed and unstressed syllables); and perturbation curves, which capture the effect of obstruents on pitch in the vowel following the consonant. This approach shares some ideas with the so-called superposition intonation models. The system can build a detailed model of how the accent curve depends on the composition and duration of the accent group. This is important because listeners are very sensitive to small changes in pitch across a sequence of syllables. Earlier results on the timing and height of pitch contours are incorporated into the new model. As with the construction of the duration module, modeling these dependencies involves fitting parameters to a speech corpus.
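The additive structure of the contour can be sketched as below. Only the superposition itself (phrase curve plus accent curves plus perturbation curve) follows the text; the curve shapes (a declining line, a triangular bump, an exponential decay) and every numeric value are placeholder assumptions.

```python
import math


def declarative_phrase(t):
    """Assumed phrase curve: F0 declines linearly over the phrase (Hz)."""
    return 120.0 - 20.0 * t


def accent(center, height, width):
    """Assumed accent curve: triangular bump around the stressed syllable."""
    return lambda t: max(0.0, height * (1.0 - abs(t - center) / width))


def perturbation(onset, size, decay):
    """Assumed perturbation: a pitch offset that decays after a consonant."""
    return lambda t: size * math.exp(-(t - onset) / decay) if t >= onset else 0.0


def f0_contour(times, phrase_curve, accent_curves, perturbation_curve):
    """F0(t) = phrase(t) + sum of accent curves at t + perturbation(t)."""
    return [
        phrase_curve(t) + sum(a(t) for a in accent_curves) + perturbation_curve(t)
        for t in times
    ]
```

For instance, `f0_contour([0.5], declarative_phrase, [accent(0.5, 30.0, 0.2)], perturbation(10.0, 5.0, 0.05))` evaluates the contour at the peak of a single accent, before any perturbation has started.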
The system produces synthetic speech by concatenating segments of natural speech. Most units in the acoustic inventory are diphones, that is, units containing the transition between two adjacent speech segments, starting in the steady-state phase of the first segment and ending in the steady-state phase of the second. The units stored in the inventory are selected on the basis of various criteria, including measures of spectral discrepancy and of energy.
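How a phoneme string maps onto diphone units can be sketched as below. The `left-right` naming scheme and the use of `#` for silence at the utterance edges are assumptions for illustration; the source does not describe how inventory units are named or indexed.

```python
def to_diphones(phonemes, silence="#"):
    """List the diphone units needed to concatenate a phoneme sequence."""
    padded = [silence] + list(phonemes) + [silence]
    # Each unit spans the transition from one segment to the next.
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]
```

A sequence of n phonemes therefore needs n + 1 units, each cut from the steady state of one segment to the steady state of the next.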
Contextual or coarticulatory effects may require storing context-sensitive "whole-phoneme" units or triphone units. To construct the acoustic inventory, the system uses an automatic optimal element selection algorithm to choose candidate units. For a given vowel, this approach selects units so that the spectral discrepancy among the elements sharing that vowel is minimized while the coverage of the required elements is maximized. A set of tools is provided to reduce the manual effort involved in selecting inventory elements. An element chosen for the inventory is first excised ("cut"), its amplitude is then normalized, and it is finally indexed and stored as an element of the acoustic inventory.
The unit selection and concatenation module selects and joins the acoustic inventory elements. It retrieves the required units, determines the new durations, pitch contour and amplitude profile, and passes these parameters in vector form to the synthesis module. The speech synthesis system uses vector-quantized linear predictive coding (LPC) and a parameterized glottal waveform for synthesis.
The system breaks the analysis of an e-mail message down into the following conceptually distinct stages: (i) analysis and decomposition of text regions (markup), (ii) text normalization (device-independent rendering), and (iii) text synthesis (voice rendering).
The markup stage begins by trying to identify the key regions of the text. For headers the problem is relatively simple, because there are fairly reliable cues, such as lines beginning with "From" or "Subject". Other cases are much more complex, for example distinguishing tables or ASCII graphics from plain text. For the complex cases, the system classifies each line using statistical models trained on a corpus; these models are based on weighted counts of different symbol classes (alphabetic symbols, digits, and predefined non-alphabetic symbols). Each line is given a score indicating how closely it resembles a table, an ASCII graphic, a signature, or a normal line of plain text. A text block is defined as any region delimited (roughly) by one or more blank lines. For each text block, the system further requires that all lines belong to the same class, and it selects the highest-scoring class as the class of the block. Once the regions have been detected, they are placed in a hierarchical document structure that follows the Standard Generalized Markup Language (SGML), with each node in the hierarchy labeled with a class tag and attributes.
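The line-scoring and block-voting idea can be sketched as below. The three classes, the hand-set weights and the scoring formula are illustrative assumptions standing in for the trained statistical models, which the source does not specify; header and signature detection are omitted.

```python
CLASSES = ("text", "table", "ascii_art")


def line_scores(line):
    """Score one line per class from the share of each symbol class."""
    n = max(len(line), 1)
    alpha = sum(c.isalpha() for c in line) / n
    digit = sum(c.isdigit() for c in line) / n
    other = sum(not c.isalnum() and not c.isspace() for c in line) / n
    # Hand-set weights, chosen only for this illustration.
    return {"text": alpha,
            "table": 2.0 * digit + 0.5 * other,
            "ascii_art": 1.5 * other}


def classify_blocks(message):
    """Split on blank lines and give every block its highest-scoring class."""
    results, block = [], []
    for line in message.splitlines() + [""]:
        if line.strip():
            block.append(line)
        elif block:
            totals = {c: sum(line_scores(l)[c] for l in block) for c in CLASSES}
            results.append((max(CLASSES, key=totals.get), block))
            block = []
    return results
```

Summing the per-line scores over a block and taking the best class enforces the constraint that all lines of a block share one label.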
Device-independent rendering attempts to normalize the various kinds of non-standard text material, including e-mail addresses, uniform resource locators (URLs), and mixed-case trade names such as WinNT. One example is the e-mail address Brsnyder@netcom.com: the "@" should be read as "at" (not, for example, "at sign"), and the "." should be read as "dot" (it should not be skipped). In addition, the name brsnyder should be broken up as b r snyder. This normalization is performed by an FST that knows the structure of e-mail addresses. Embedded in this transducer is a finite-state model trained on a corpus of well-formed English words. The model finds that brsnyder is unlikely to be a correctly spelled word and suggests decomposing it into b r snyder.
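Reading an address aloud, including splitting an unknown name such as brsnyder into spelled-out letters and known words, can be sketched with a minimum-cost segmentation. This is a plain dynamic-programming search, not a compiled weighted transducer, and the tiny lexicon and its costs are invented stand-ins for the trained English-word model.

```python
LEXICON = {"snyder": 1.0, "net": 1.0, "com": 1.0}  # hypothetical word costs


def segment(token, lexicon, letter_cost=3.0):
    """Cheapest split of token into lexicon words and single letters."""
    n = len(token)
    best = [(0.0, [])] + [(float("inf"), [])] * n
    for end in range(1, n + 1):
        for start in range(end):
            piece = token[start:end]
            if piece in lexicon:
                cost = lexicon[piece]
            elif len(piece) == 1:
                cost = letter_cost  # spell the letter out
            else:
                continue
            total = best[start][0] + cost
            if total < best[end][0]:
                best[end] = (total, best[start][1] + [piece])
    return best[n][1]


def speak_address(address, lexicon=LEXICON):
    """Render user@host.tld as words, reading '@' as 'at' and '.' as 'dot'."""
    local, _, domain = address.partition("@")
    words = segment(local.lower(), lexicon) + ["at"]
    for i, label in enumerate(domain.lower().split(".")):
        if i:
            words.append("dot")
        words.extend(segment(label, lexicon))
    return " ".join(words)
```

Because spelled-out letters cost more than known words, the search prefers "snyder" as one word and falls back to letters only for the leading "b" and "r".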
In the final stage, voice rendering starts from the marked-up and normalized text and inserts appropriate tags that tell the TTS system how to speak it. Among other things, this stage determines which of a set of predefined voices to use for regions such as quoted text, so that when a quoted region is encountered the TTS system switches to a contrasting voice that helps the listener understand the structure of the document.
Clearly, the present invention is advantageous: it provides a system and method that allow a user to deliver long-distance and international voice messages to a recipient over the Internet, and it overcomes several shortcomings of the prior art. The voice-mail device of the invention lets the user deliver these messages over the Internet without using a computer directly. The system is both inexpensive and convenient, supports mobile communication, and provides voice communication with a high-quality audio signal. It is a simplex (one-way) system. The sender and the recipient can communicate using standard telephones connected to their local voice-mail devices. In one aspect of the invention, messages are transferred between two voice-mail devices of the invention in the form of e-mail.
Many modifications and other embodiments of the invention will come to the mind of one skilled in the art having the benefit of the foregoing description and the associated drawings. It is therefore to be understood that the invention is not limited to the specific description and specific embodiments disclosed, and that such modifications and embodiments are intended to be included within the scope of the claims.

Claims (25)

1. A communication method for delivering a sender's voice message to a recipient over a network, comprising the steps of:
recording the recipient's telephone number and the sender's voice message in a digital file in a voice-mail device;
packaging the telephone number and the voice message in an e-mail;
forwarding the e-mail from the voice-mail device to another voice-mail device over the network;
opening the e-mail to obtain the recipient's telephone number and the voice message; and
dialing the recipient's telephone number and delivering the voice message to the recipient via a telephone network.
2. The method of claim 1, further comprising the step of the sender calling a voice-mail device using a public switched telephone network (PSTN) or a cellular telephone network.
3. The method of claim 1, wherein the voice-mail device comprises a telephone exchange.
4. The method of claim 1, wherein the voice-mail device comprises a cellular base station or a radio network center.
5. The method of claim 1, further comprising the step of delivering the voice message to the recipient via a PSTN or a cellular telephone network.
6. The method of claim 1, further comprising the step of delivering the voice message to the recipient as digital speech.
7. The method of claim 1, further comprising the step of forwarding the e-mail over the network using the Simple Mail Transfer Protocol (SMTP).
8. The method of claim 1, further comprising the step of recording the voice message as a wave file in the voice-mail device.
9. The method of claim 1, further comprising the step of forwarding the e-mail with a mail transfer agent.
10. The method of claim 9, further comprising the step of forwarding a text e-mail from the mail transfer agent via the Internet.
11. The method of claim 1, further comprising the step of deciding whether to receive and/or send any one-way voice message, the decision being based on a prediction of future channel capacity, the prediction being made according to a queuing-theory model of the wireless network system.
12. The method of claim 1, wherein the voice-mail device performs the following functions:
a) playing alert tones and receiving telephone dialing signals; b) recording and playing a voice message; c) looking up an e-mail address; and d) sending and receiving an e-mail.
13. The method of claim 1, further comprising the step of compressing the voice message in the voice-mail device using recursive least squares and run-length coding.
14. A communication system for delivering a sender's voice message to a recipient over a network, comprising a voice-mail device that serves as a gateway in the network and receives the sender's voice message and a telephone number via a telephone network, the voice-mail device performing the following functions:
a) playing alert tones and receiving telephone dialing signals;
b) recording and playing a voice message;
c) looking up an e-mail address; and
d) sending and receiving an e-mail for the recipient via the Internet and the telephone network.
15. The system of claim 14, wherein the telephone network comprises a PSTN.
16. The system of claim 14, wherein the telephone network comprises a cellular telephone network.
17. The system of claim 14, wherein the voice-mail device comprises a telephone exchange.
18. The system of claim 14, wherein the voice-mail device comprises a cellular base station or a radio network center.
19. The system of claim 14, wherein the voice-mail device records and plays the voice message in wave file format.
20. The system of claim 14, wherein the voice-mail device sends any one-way voice message based on a prediction of future channel capacity, the prediction being made according to a queuing-theory model of the wireless network system.
21. The system of claim 14, further comprising a mail transfer agent for forwarding the e-mail.
22. The system of claim 14, wherein the voice-mail device forwards the e-mail over the network using the Simple Mail Transfer Protocol (SMTP).
23. The system of claim 14, wherein the voice-mail device comprises a processor for text-to-speech conversion.
24. The system of claim 14, wherein the voice-mail device comprises a processor for compressing the voice message using recursive least squares.
25. The method of claim 1, further comprising the step of compressing the voice message using recursive least squares.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN02118393.7A CN1453990A (en) 2002-04-26 2002-04-26 Sound information system and method
US10/134,192 US7251314B2 (en) 1994-10-18 2002-04-29 Voice message transfer between a sender and a receiver

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN02118393.7A CN1453990A (en) 2002-04-26 2002-04-26 Sound information system and method
US10/134,192 US7251314B2 (en) 1994-10-18 2002-04-29 Voice message transfer between a sender and a receiver

Publications (1)

Publication Number Publication Date
CN1453990A (en) 2003-11-05

Family

ID=32178141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN02118393.7A Pending CN1453990A (en) 1994-10-18 2002-04-26 Sound information system and method

Country Status (1)

Country Link
CN (1) CN1453990A (en)


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication