CN1943191A - Method and system for sending an audio message

Info

Publication number: CN1943191A
Application number: CNA2005800110848A
Authority: CN
Inventors: E·特伦; T·波尔特勒
Original assignee: Koninklijke Philips Electronics NV
Current assignee: Koninklijke Philips NV
Priority date: 2004-04-13
Filing date: 2005-04-08
Publication date: 2007-04-04
Also published as: KR20060133002A; JP2007533236A; WO2005101259A1; EP1738277A1

Abstract

The invention describes a method for sending an audio message (AM) from a sender (US) to a recipient (UR) over an audio messaging system. Thereby, a sender's (US) audio message is first collected by a transmitting device (2T). The audio message (AM) is then analysed for detection of a control information part (CP) concerning communication specifications of the message (AM) and a main part (MP) comprising the effective message which is to be sent to the recipient (UR). The control information part (CP) of the audio message (AM) is at least partially interpreted for controlling the audio messaging system (1) for communicating the (specific) audio message (AM). At least the main part (MP) of the audio message (AM) is transmitted to a receiving device (3) and presented to the recipient (UR). Furthermore, an appropriate audio messaging system, a transmitting device and a receiving device for such an audio messaging system are described.

Description

Method and system for sending an audio message

The present invention relates to a method for sending an audio message from a sender to a recipient over an audio messaging system, and to an appropriate audio messaging system. The invention further relates to a transmitting device and a receiving device for such an audio messaging system.

Since text-based messaging services were introduced several years ago, they have become increasingly popular. The popular Short Message Service (SMS) is one example of such a service. Text messaging systems for PCs, such as AOL's Instant Messenger, Microsoft's MSN Messenger, and Yahoo's Messenger, can be used free of charge after downloading the required freeware. Some of these PC-based messaging providers offer voice chat functions in addition to text messaging. Other providers specialize in voice chat services, which ultimately led to the concept of voice over IP (Internet Protocol).

A notable difference between voice chat and text messaging is that text messaging lets the user interact explicitly, for example by selecting a chat window and typing in it, or by writing a document and sending it. Voice interaction, on the other hand, is transmitted continuously, that is, an uninterrupted exchange takes place. This is often not what the user actually wants, for example when he is in a room with other people and only wants to send a particular utterance as a message, while what he says to the other people in the room should not be transmitted. With an ordinary telephone, the user can solve this problem by covering the microphone with a hand or by muting the phone. Obviously, this is not possible when using a hands-free phone or a headset. The recipient of a message has a similar problem. With a text-based messaging service, a private message can be read even when third parties are in the same room, on a screen or display that the third parties cannot see. Ensuring that an audible message is not overheard by third parties, however, is almost impossible unless the message is listened to through headphones.

Text messaging systems do appear to enjoy a higher level of acceptance than voice chat. This may be because users often do not actually want a continuous dialog. On the one hand, they want to be able to get in touch with other people. On the other hand, they may also want to communicate in an offline mode, in which they are not permanently engaged in an ongoing dialog where everything they say is transmitted.

Therefore, an object of the present invention is to provide a method for sending an audio message from a sender to a recipient over an audio messaging system, and an appropriate audio messaging system, which give the user essentially the same experience as a text messaging system. In particular, the user should be able to send specific utterances as audio messages easily, while other utterances are prevented from being sent by the messaging system.

To this end, the invention provides a method for sending an audio message from a sender to a recipient over an audio messaging system, the method comprising the following steps:

First, the sender's audio message is collected by a transmitting device. Usually, the sender produces the message by speaking it. However, the sender may also produce the message, or parts of it, in other ways, for example by singing, playing an instrument, clapping hands, and so on.

The audio message is then analysed to detect a control information part, hereinafter also called the "audio header", which comprises indications such as details of the communication specifications of the message, and a main part, hereinafter also called the "audio body", which comprises the effective message or information to be sent to the recipient.

The terms "sender" and "recipient" do not necessarily mean an individual user, but can also refer to a user group, or to one or all members of such a group. A user group can share a single transmitting or receiving device, for example the family members of the household to which the device belongs, or the employees of an office to which the device is assigned. A user group can also be a group of users each of whom has his own device, in which case a message for this user group is sent to all of the receiving devices.

The communication specifications of the message contained in the control information part can be any transmission and/or presentation specifications, for example the message type and/or the mode of sending, which specifies whether the message is confidential, private, urgent, and so on. The control information part can also comprise information identifying the sender or specifying the recipient of the message. A typical audio header might be, for example, "private message from Bob to Carl". This control information part of the audio message is at least partially interpreted in order to control the audio messaging system with regard to the transmission and/or presentation of the specific audio message. For example, control signals for the transmitting device and/or the receiving device and/or other components of the audio messaging system (such as transceiver stations, routers, etc.) can be generated on the basis of this control information part.

In a further step, at least the main part of the audio message is transmitted to a receiving device located near the recipient, where it is presented to the recipient.

An appropriate audio messaging system for sending an audio message from a sender to a recipient according to this method comprises a transmitting device with a user interface for collecting the sender's audio message, and a message analyser, which analyses the audio message to detect a control information part concerning the communication specifications of the audio message and a main part comprising the actual message to be sent to the recipient. The audio messaging system further comprises an interpretation unit for at least partially interpreting the control information part of the audio message in order to control the audio messaging system for communicating the specific audio message. In addition, the audio messaging system comprises a receiving device with a user interface for presenting at least the main part of the audio message to the recipient. Finally, the audio messaging system requires means for transmitting at least the main part of the audio message from the transmitting device to the receiving device.

With the help of the method and the audio messaging system according to the invention, the user controls the audio messaging system by means of commands contained in the audio message itself, and thereby avoids a continuous transmission of everything he says. In other words, the user provides the system, in speech, with "meta-information" along with the actual audio content of the message. The system analyses the audio message accordingly and separates the audio header containing the control information from the audio body containing whatever utterance is to be transmitted. If the system cannot detect an audio header with a suitable indication that a message is to be sent to a particular person in a particular way, nothing is sent.

This is illustrated by the following simple example. Suppose a user of the system says "Message to Carl: the football match starts at 7 pm". This utterance is picked up by the user interface of the transmitting device and analysed. The audio header "message to Carl" is detected and interpreted, and the message "the football match starts at 7 pm" is sent to the recipient called "Carl". If, on the other hand, the user simply tells another person in the room about the starting time of the match by saying "Pete, did you know that the football match starts at 7 pm?", the activated audio messaging system, or the corresponding transmitting device, concludes from its analysis of this utterance that it contains no audio header. The utterance is therefore not recognized as an audio message and is not sent.
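The example above can be sketched in a few lines of Python, assuming the speech recognizer has already produced a text transcript. The header grammar ("message to <name>:"), the names `HEADER_PATTERN`, `ParsedMessage` and `parse_utterance` are illustrative assumptions and not part of the described system, which works on speech rather than on text.

```python
import re
from typing import NamedTuple, Optional


class ParsedMessage(NamedTuple):
    recipient: str  # identifier taken from the audio header
    body: str       # main part (audio body) to be transmitted


# Assumed, simplified header grammar: "message to <name>" followed by ":" or ",".
HEADER_PATTERN = re.compile(
    r"^\s*message\s+to\s+(?P<recipient>\w+)\s*[:,]\s*(?P<body>.+)$",
    re.IGNORECASE,
)


def parse_utterance(transcript: str) -> Optional[ParsedMessage]:
    """Return header and body if the utterance starts with an audio header.

    Returns None when no header is found, in which case nothing is sent.
    """
    match = HEADER_PATTERN.match(transcript)
    if match is None:
        return None
    return ParsedMessage(match.group("recipient"), match.group("body").strip())
```

An utterance addressed to someone in the room does not match the assumed grammar, so `parse_utterance` returns `None` and the utterance is never treated as a message.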

The invention thus provides a particularly simple and user-friendly means of controlling the system, so that only specific utterances are sent to other people by the audio messaging system, without first having to deactivate the system or parts of it (for example the microphone or loudspeaker). Moreover, the sending user can control the system with regard to the transmission and presentation of the message, simply by formulating the audio header so that it contains all control indications, without any manual operation. In other words, the audio messaging system can be controlled entirely and comfortably hands-free. The system therefore offers an advantage over, for example, the usual voice control of an ordinary mobile phone in a car hands-free kit, where voice commands can be used to set up and control a connection to another participant, but the connection between the user and the participant then remains permanently open. Everything the user says is transmitted to the other participant, and the phone can only be muted by issuing an appropriate command, by covering the microphone, or the like.

The dependent claims and the following description disclose particularly advantageous embodiments and features of the invention.

In a preferred embodiment of the invention, the control information part of the audio message is also at least partially transmitted to the receiving device and interpreted there in order to control the presentation of the audio message to the recipient. In other words, with the help of the audio header, the receiving device receives appropriate information, for example about when, how, and to which user or users the audio message, or the audio body of the audio message, is to be output. Preferably, the audio header can also be at least partially output to the recipient.

Since the control information part preferably consists of commands spoken by the user, automatic speech recognition techniques can be used to identify the control information part within the audio message. In this context, automatic speech recognition does not strictly mean speech recognition alone, but also covers language understanding techniques. To this end, the transmitting device should comprise an automatic speech recognition arrangement.

To make it easier to identify the control information part within the audio message, the audio message is preferably composed with a defined structure in which the control information part is located at a specific position relative to the main part. More preferably, the control information part is located at the beginning of the audio message and is followed by the main part. The advantage of this arrangement is that the control information part is detected first by the speech recognition arrangement, and the main part that follows only needs to be buffered or prepared for sending. However, the control information part can be located at any suitable position in the message (for example at the end), or it can be distributed over several positions in the message, so that some control information is located at the beginning of the message and other control information in the middle or at the end.

Analysing the audio message with the help of an automatic speech recognizer may comprise, for example, searching for certain keywords that the audio messaging system may store in a suitable memory (such as a memory unit in the transmitting device or the receiving device). Typical examples of such keywords are "message", "message to ...", and so on, descriptors for the possible recipients of the message, and keywords specifying the message type or mode of transmission, such as "confidential", "private", or "urgent".
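As a rough illustration of such a keyword search, the following sketch looks for message-type keywords in a header that has already been transcribed to text. The vocabulary in `CONTROL_WORDS` and the function name `extract_flags` are assumptions made for this example only.

```python
from typing import Set

# Illustrative keyword vocabulary; a real system would load it from memory.
CONTROL_WORDS = {"confidential", "private", "urgent"}


def extract_flags(header: str) -> Set[str]:
    """Return the message-type keywords that occur in the header text."""
    # Normalise each word: drop surrounding punctuation and ignore case.
    words = {word.strip(".,:;!?").lower() for word in header.split()}
    return words & CONTROL_WORDS
```

Plain set intersection is enough here because the keywords are single words; multi-word keywords such as "message to" would need phrase matching instead.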

To make sending a message as easy as possible, a unique identifier string is associated with each possible user or user group of the audio messaging system. Such a unique identifier string can comprise, for example, the user's real name, or can be any other string, which may also conceal the identity of the user. In particular, an entire user group can be identified as a whole by a single string. Preferably, nicknames or fantasy names that other users can easily remember are used. These nicknames are included in the system's vocabulary, so that a user can be designated very efficiently in the audio header simply by saying his nickname. Groups can also be defined in this way, so that if the audio header contains the name of a group, all associated members of the group receive the message.

Preferably, the identifier strings of possible recipients are stored, together with the corresponding address book entries, in a memory of the transmitting device and, if necessary, also in the receiving device or at other suitable locations in the audio messaging system.

Audio messages are often sent to several people at once. In a long conversation, the same list of recipients is used frequently. It would be very inconvenient for the user to have to say the names of all recipients each time he speaks an audio header. Dynamically associating a nickname or other identifier string with a list of address book entries therefore makes sending messages more comfortable.
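The relationship between identifier strings, groups, and dynamically defined recipient lists might be modelled as below. The address values, the `ADDRESS_BOOK` and `GROUPS` tables, and both function names are hypothetical; the text does not specify how address book entries are represented.

```python
from typing import Dict, List

# Hypothetical address book: identifier string (nickname) -> device address.
ADDRESS_BOOK: Dict[str, str] = {
    "carl": "device-carl",
    "bob": "device-bob",
    "anna": "device-anna",
}

# Group identifier -> identifier strings of its members.
GROUPS: Dict[str, List[str]] = {"family": ["carl", "anna"]}


def define_alias(alias: str, members: List[str]) -> None:
    """Dynamically associate a new identifier string with a recipient list."""
    GROUPS[alias.lower()] = [member.lower() for member in members]


def resolve_recipients(identifier: str) -> List[str]:
    """Expand a nickname or group name into the addresses to send to."""
    key = identifier.lower()
    members = GROUPS.get(key, [key])
    # Unknown names are skipped, so an unknown identifier yields an empty list.
    return [ADDRESS_BOOK[member] for member in members if member in ADDRESS_BOOK]
```

After `define_alias("team", ["Bob", "Carl"])`, a header naming "team" resolves to both addresses, so the sender no longer has to list every recipient.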

Preferably, a keyword such as "reply" or the like can be used in the audio header to indicate that the audio message is to be sent to the sender of the most recently received message, and possibly also to all users to whom that message was sent.
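A minimal sketch of how a "reply" keyword could be resolved, assuming the device remembers the sender and recipient list of the last received message. The `ReceivedMessage` record and the `reply_all` switch are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReceivedMessage:
    sender: str
    recipients: List[str]  # everyone the message was addressed to


def reply_targets(last: ReceivedMessage, me: str, reply_all: bool = False) -> List[str]:
    """Return who a 'reply' should go to, based on the last received message."""
    targets = [last.sender]
    if reply_all:
        # Also include the other original recipients, but not the replying user.
        targets += [r for r in last.recipients if r != me and r != last.sender]
    return targets
```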

The transmitting device is preferably implemented as a dialog system, or comprises such a dialog system or part of one. In this particularly preferred embodiment, a dialog can be started automatically between the audio messaging system (more preferably the transmitting device) and the sender in order to identify the control information part of the audio message whenever the ambiguity value of the automatic speech recognizer's result (based, for example, on an internal confidence measure) meets or exceeds a certain ambiguity threshold.

In other words, if the system is uncertain whether a message is to be sent, to whom it is to be sent, or how it is to be sent, it can issue a prompt asking the user for confirmation, or enter into a dialog with the user to allow the presumed audio header to be corrected. In this way, the system ensures that no message is sent by mistake or to the wrong recipient.
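The threshold rule can be sketched as follows. The threshold value, the prompt wording, and the `ask_sender` callback (standing in for the prompt generator and the sender's spoken answer) are assumptions for illustration only.

```python
from typing import Callable

AMBIGUITY_THRESHOLD = 0.3  # assumed value; the text leaves the level open


def should_send(header: str, ambiguity: float,
                ask_sender: Callable[[str], bool]) -> bool:
    """Decide whether to send, asking for confirmation when unsure.

    `ask_sender` stands in for the confirmation dialog: it receives the prompt
    text and returns True if the sender confirms.
    """
    if ambiguity < AMBIGUITY_THRESHOLD:
        return True  # header recognized clearly enough, no dialog needed
    # Ambiguity meets or exceeds the threshold: confirm the presumed header.
    return ask_sender(f"Did you intend to send a {header}?")
```

With this rule, a header recognized at an ambiguity of exactly 0.3 already triggers the confirmation dialog, matching the "meets or exceeds" wording.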

As already mentioned, in a preferred embodiment the control information part is also at least partially transmitted to the receiving device, where it is interpreted in order to control the output of the audio message. This is particularly useful when information identifying the recipient (for example an identifier string) is transmitted as well. With the help of this identifier string, the user can be identified at the receiving device before the audio body of the audio message is output.

To this end, in a particularly preferred embodiment, the identifier string of a user or user group is linked to identifier features of the specific user, user group, or user group member. These identifier features can be, for example, a secret character string, speaker identification features, and/or visual features (such as suitable biometric data of the user). With the help of these identifier features, the authorized recipient of a specific audio message can be identified among other possible users near the receiving device when the message is received, before the main part of the message is output.

Preferably, the identifier features are stored in a memory accessible to the receiving device, and the receiving device comprises means for identifying the recipient on the basis of these identifier features.

One possibility is for a camera to observe the people in the room and, with the help of biometric data, to recognize the recipient's face using known image processing techniques.

Alternatively, the device can identify the user acoustically. For example, it can output the audio header followed by a suitable prompt. If the user answers, he can be recognized as the right user by speaker identification. The message is output only after the user has been successfully verified.
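The gating of message output on recipient verification might look like the sketch below. It uses the simplest identifier feature mentioned above, a secret character string; speaker or face recognition would replace the comparison step. The `IDENTIFIER_FEATURES` table, its contents, and the function names are hypothetical.

```python
import hmac
from typing import Dict, Optional

# Hypothetical store: identifier string -> secret phrase (identifier feature).
IDENTIFIER_FEATURES: Dict[str, str] = {"carl": "blue heron"}


def is_authorized(identifier: str, spoken_phrase: str) -> bool:
    """Check the answer to the prompt against the stored identifier feature."""
    expected = IDENTIFIER_FEATURES.get(identifier.lower())
    if expected is None:
        return False
    answer = spoken_phrase.strip().lower()
    # Constant-time comparison of the two byte strings.
    return hmac.compare_digest(expected.encode(), answer.encode())


def present_message(identifier: str, spoken_phrase: str, body: str) -> Optional[str]:
    """Return the main part for output only after successful verification."""
    return body if is_authorized(identifier, spoken_phrase) else None
```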

In a preferred embodiment, the sender of the audio message can be identified by means of the identifier features, and corresponding information about the sender can be sent along with the audio message. Provided the sender has identified himself in the audio header (for example in the form "message from Bob to Carl"), the validity of the sender can be checked with the help of the identifier features.

Usually, because of its topicality, an audio message should be output to the authorized recipient immediately. There are situations, however, in which output may be inappropriate, for example when a confidential or private message is to be output and the recipient is not alone in the room, or when the recipient is busy with something else and cannot receive the message, perhaps because he is in a conversation or on the telephone. This is particularly important because the recipient cannot take in an audio message at such a moment. If the user is not in the room or is not paying attention and the message is output immediately, the message is irretrievably lost.

To solve this problem, a preferred method according to the invention automatically analyses the situation in which the identified recipient currently finds himself, and presents the audio message to the recipient in a particular form and/or at a particular time depending on that situation. For example, if the recipient is present and is not occupied with something that demands his attention (such as a telephone conversation), an incoming message can be played immediately. Otherwise, the message can be buffered and played as soon as the user enters the room or finishes what he was doing. If a long message has to be interrupted (for example because of an incoming phone call), it can be replayed at a later time.
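The play-or-buffer behaviour can be sketched as below, assuming that presence and busy state have already been determined by some situation analysis. The class and method names are illustrative, and "playing" a message is modelled simply by appending it to a list.

```python
from collections import deque
from typing import Deque, List


class ReceivingDeviceSketch:
    """Toy model of situation-dependent presentation (names are illustrative)."""

    def __init__(self) -> None:
        self.buffer: Deque[str] = deque()  # messages waiting for the recipient
        self.played: List[str] = []        # messages already output

    def on_message(self, body: str, recipient_present: bool,
                   recipient_busy: bool) -> None:
        # Play at once only if the recipient is present and not occupied.
        if recipient_present and not recipient_busy:
            self.played.append(body)
        else:
            self.buffer.append(body)

    def on_recipient_available(self) -> None:
        # Called when the recipient enters the room or finishes his activity.
        while self.buffer:
            self.played.append(self.buffer.popleft())
```

Buffered messages are played in arrival order once the recipient becomes available, so nothing is lost while he is away or on the phone.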

There are different ways of automatically analysing the recipient's current situation. In a preferred embodiment, a particularly suitable receiving device is implemented as a dialog system with the additional capability of capturing images of its surroundings by means of a camera or similar device. The recipient's identity and/or current situation can then be determined using known image processing techniques. A very simple way of identifying the recipient and/or analysing the current situation is to start a dialog automatically between the audio messaging system, or the receiving device, and the recipient. For example, the device can first output the audio header "message for Carl" and then issue the prompt "Are you ready to receive this message?" If the user answers "yes", the message is presented. Otherwise, the message is buffered until the user explicitly requests it later.

As described, in addition to a transmitting device located near the sender, the audio messaging system also requires a receiving device located near the actual recipient.

An appropriate transmitting device should comprise at least the following components:

- a user interface for collecting the sender's audio message;

- a message analyser for analysing the audio message to detect a control information part concerning the communication specifications of the audio message, and a main part comprising the effective message to be sent to a specific recipient;

- an interpretation unit for at least partially interpreting the control information part of the audio message in order to control the audio messaging system with regard to the transmission of the audio message;

- a transmission interface for transmitting at least the main part of the audio message to a receiving device.

An appropriate receiving device should comprise at least the following components:

- a receiving interface for receiving an audio message sent by a transmitting device, the audio message comprising a control information part concerning the communication specifications of the audio message and a main part comprising the effective message to be sent to a specific recipient;

- a user interface for presenting at least the main part of the audio message to the recipient;

- an interpretation unit for at least partially interpreting the control information part of the audio message in order to control the audio messaging system with regard to the presentation of the audio message.

As explained above, the transmitting device and/or the receiving device are preferably implemented as dialog systems. The transmitting device and the receiving device can be of identical construction and can comprise all the components necessary for sending and receiving messages. A dialog system used for other purposes (such as controlling other devices) can be equipped with suitable components so that it can serve as a transmitting device and/or receiving device of an audio messaging system according to the invention.

In a particularly preferred embodiment, the transmitting device and the receiving device comprise components of the dialog system described in DE 102 49 060 A1. In this case, the dialog system only needs to be additionally equipped with a suitable message analyser, an interpretation unit, and a transmitter/receiver interface in order to transmit audio messages over a communication network. The message analyser can in fact be the speech recognition unit already present in the device, equipped with a suitable vocabulary for detecting the audio header. The interpretation unit for interpreting the control information part of the audio message can preferably be implemented as a software routine within the actual dialog control unit, or as some other form of software running on a processor of the dialog system. The interpretation unit must convert the control indications contained in the audio header into control signals, so that the message is sent in the intended way from the sender's transmitting device to the recipient's receiving device, or so that the receiving device presents the received message to the correct recipient in the correct way.

Other objects and features of the invention will become apparent from the following detailed description considered in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed solely for the purposes of illustration and not as a definition of the limits of the invention.

Fig. 1 is a schematic diagram of an embodiment of an audio messaging system according to the invention;

Fig. 2 is a perspective view of a preferred embodiment of a transmitting and/or receiving device for the audio messaging system of Fig. 1;

Fig. 3 shows a simple example of an audio message structured according to the invention;

Fig. 4 shows a flow chart of the processing in the transmitting device, from the user's input up to the transmission of the audio message.

Fig. 1 shows an audio messaging system 1 which, for the sake of simplicity, has two devices, namely a transmitting device 2_T located near the sender U_S and a receiving device 2_R located near the recipient U_R. The transmitting device 2_T and the receiving device 2_R are connected to each other by a network N.

The communication network N can be any network, such as a telephone network, a mobile telephone network, the internet, an office intranet, or a home communication system. The only requirement is that the two devices 2_T and 2_R can communicate with each other via suitable interfaces 14.

Usually, such an audio messaging system 1 comprises many more devices. Any number of devices can be incorporated. In particular, a message need not be sent from one specific device to only one other device. A message can be sent to several devices simultaneously, for example from one user to a user group (that is, to many recipients).

In the example shown, the transmitting device 2_T and the receiving device 2_R are of identical construction, that is, both can be used to receive as well as to send audio messages. The reference numerals 2_T and 2_R merely serve to distinguish the receiving device 2_R from the transmitting device 2_T for the sake of clarity. In general, messages can also be sent in the opposite direction. For simplicity, the devices are therefore also called "transceiver devices" 2_T, 2_R where appropriate.

In an advantageous configuration, such transceiver devices 2_T, 2_R are configured as dialog systems.

Among other components not shown in the figures, this dialog system comprises a user interface 10, which has an arrangement for picking up or collecting audio signals, such as speech or singing, from the user, for example by means of microphones. The user interface 10 also has an acoustic output arrangement 12, such as a loudspeaker. In addition, the user interface 10 can comprise components for visual output or input, such as a display and/or a camera.

In a preferred embodiment shown in Fig. 2, the user interface 10 is movable (for example rotatable about an axis) and is mounted on a housing 18, which may contain any other components of the transceiver device 2_T, 2_R. The user interface 10 has a clearly recognizable front 17, which comprises a loudspeaker 12, two microphones 11, and a camera 16. In addition, this embodiment can comprise a display unit (not shown) for the visual output of information. A preferred dialog system with such a display unit is the home dialog system described in DE 102 49 060 A1, which is incorporated herein by reference in its entirety. The advantages of implementing the transceiver devices 2_T, 2_R in this way, and the additional functionality of the invention that it enables, are explained below.

A further component of the transceiver devices 2_T, 2_R is an audio control unit 8, which, for example, controls the audio functions of the user interface 10 and prepares incoming speech signals for subsequent processing steps. One example of such a subsequent processing step is the automatic speech recognition arrangement 7, which comprises the actual speech recognition unit 5 followed by a language understanding unit 6. With the help of these components, the incoming speech signal of the user U_S can be analysed and recognized in the usual way, that is, the underlying meaning of the spoken input can be determined.

The speech recognition result is then forwarded to the dialog control unit 3, which controls the actual dialog with the user and works together with an application 13, here a messaging application, in order to send or receive audio messages. Together with the physical network interface 14 connected to the communication network N, this messaging application 13 ensures that messages can be sent and received in a suitable electronic form. The messaging application 13 together with the network interface 14 can therefore also be regarded as a "receiving interface" or a "transmission interface", or as a "transceiver interface" where appropriate.

Since output to the user is necessary in order to conduct a dialog with the user U_S, U_R, the system also has a prompt generator 9 for generating output prompts. Such a prompt generator 9 can output pre-generated prompts retrieved from a memory, or can comprise a speech generation unit for converting text prompts into speech signals, which can be output as synthesized speech via the audio control unit 8 and the loudspeaker 12 of the user interface 10.

An audio message of the sending user U_S can be sent to the recipient U_R, who in this embodiment is another individual user, in the following way:

The sender U_S speaks an audio message AM, which is detected by the user interface 10 of the transceiver device 2_T, or more precisely by the audio detection arrangement 11. The recorded speech signal is then pre-processed by the audio control unit 8 and forwarded to the core of the automatic speech recognition unit 5, which, together with the subsequent language understanding unit 6, analyses the utterance of the user U_S.

According to the invention, such an audio message AM comprises a control information part CP (the audio header) and the actual information to be sent (the so-called main part MP). This structure is shown in Fig. 3. The message shown there, "Private message to Carl: the meeting will start at 7 pm", comprises the control information part CP "private message to Carl", followed by the main part MP "the meeting will start at 7 pm".

The automatic speech recognition arrangement 7 is configured to identify the control information part CP and to separate it from the main part MP. To this end, the vocabulary of the automatic speech recognition arrangement 7 comprises a number of control words CW which, if they occur in a certain syntactic structure, are recognized as belonging to the control information part CP of the audio message AM.

These control words CW are stored in a memory unit 15 of the transmitting device 2_T. In addition, this memory unit 15 also stores identifier strings IS, such as the nicknames of all users of the audio messaging system who are possible recipients. A corresponding "buddy list", comprising the nicknames of potential recipients and their addresses in the audio messaging system 1, can be organized by the user of the transmitting device 2_T. This list can be stored in the transmitting device 2_T or elsewhere in the audio messaging system 1, for example on a server of the service provider.
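The separation of a message into control information part and main part can be illustrated with a minimal sketch. It operates on an already transcribed utterance rather than on audio, and everything in it is an assumption made for illustration: the regular expression standing in for the control words CW and their syntactic structure, the `BUDDY_LIST` contents, the placeholder addresses, and the names `ParsedMessage` and `split_message`.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the control words CW in a fixed syntactic structure:
# "[private|public] message to <nickname>: <main part>"
CONTROL_PATTERN = re.compile(
    r"^(?P<privacy>private|public)?\s*message\s+to\s+(?P<recipient>\w+)\s*:\s*(?P<main>.*)$",
    re.IGNORECASE,
)

# Hypothetical buddy list: identifier string IS (nickname) -> address in the system.
BUDDY_LIST = {
    "carl": "carl@example.invalid",
    "julie": "julie@example.invalid",
}


@dataclass
class ParsedMessage:
    control_part: str  # audio header CP
    main_part: str     # main part MP
    recipient: str     # identifier string IS found in the header
    private: bool
    address: str


def split_message(transcript: str) -> Optional[ParsedMessage]:
    """Return header and main part, or None if no valid header is found."""
    text = transcript.strip()
    match = CONTROL_PATTERN.match(text)
    if match is None:
        return None  # no control words in the expected structure
    recipient = match.group("recipient")
    address = BUDDY_LIST.get(recipient.lower())
    if address is None:
        return None  # nickname is not a known identifier string
    control_part = text[: match.start("main")].rstrip(": ")
    privacy = (match.group("privacy") or "").lower()
    return ParsedMessage(
        control_part=control_part,
        main_part=match.group("main"),
        recipient=recipient,
        private=(privacy == "private"),
        address=address,
    )


parsed = split_message("Private message to Carl: the meeting starts at 7 p.m.")
```

In this sketch an utterance without a recognizable header, or with an unknown nickname, simply yields `None`; the described system would instead start a clarifying dialogue with the sender.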

In the example shown in the figures, the main part MP and the control information part CP of the audio message AM are passed from the automatic speech recognition arrangement 7 to the dialogue control unit 3, in which an interpretation unit 4, for example in the form of a software routine, is installed. This interpretation unit 4 can also access the control words CW and identifier strings IS in the memory 15, so that it can interpret the control information part CP of the audio message AM in order to generate corresponding control signals for the audio messaging system 1 (in particular the transmitting device 2_T) and thereby control it. If the control information part CP cannot be clearly identified, the dialogue control unit 3 starts a dialogue, for example by causing the prompt generator 9 to issue a suitable prompt to the sender U_S, such as "Do you want to send a private message to Carl?" Where appropriate, the sender U_S can answer with a simple "yes" or "no" in order to confirm the presumed control header CP or, if the control header CP was detected incorrectly, to abort the procedure.

If the system has determined that the control header was correctly recognized, or if the user has confirmed the presumed control header in the subsequent dialogue, the main part MP of the audio message AM that is attached to the audio header CP is sent to the recipient U_R specified by the identifier string IS in the audio header CP, in the preceding example the user with the nickname "Carl".
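The confirmation logic described above can be sketched as follows. This is only an illustration under stated assumptions: the numeric ambiguity value, the threshold `AMBIGUITY_LIMIT`, the `ask` callback standing in for the prompt generator and speech recognizer, and the function name `confirm_header` are all hypothetical.

```python
from typing import Callable

# Hypothetical ambiguity limit; the source does not give a concrete value.
AMBIGUITY_LIMIT = 0.3


def confirm_header(
    recipient: str,
    ambiguity: float,
    ask: Callable[[str], str],
    limit: float = AMBIGUITY_LIMIT,
) -> bool:
    """Return True if the presumed control header may be used for sending."""
    if ambiguity < limit:
        # Header was recognized clearly enough; no dialogue is needed.
        return True
    # Ambiguity reaches or exceeds the limit: ask the sender to confirm.
    answer = ask(f"Do you want to send a private message to {recipient}?")
    return answer.strip().lower() in {"yes", "y"}


# Usage: a scripted sender who confirms the presumed header.
ok_to_send = confirm_header("Carl", ambiguity=0.6, ask=lambda prompt: "yes")
```

Passing the dialogue in as a callback keeps the decision rule separate from speech input and output, so it can be exercised with scripted answers.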

To this end, the dialogue control unit 3 passes the main part MP, and preferably also the control information part CP, to the messaging application 13, together with any corresponding control signals, so that the audio message AM can be sent via the network N to the address of the receiving device 2_R of the user with the nickname "Carl". The control information part CP and the main part MP of the audio message AM are then sent to the receiving device 2_R via the network interface 14 connected to the communication network N.

The sequence of operations in the transmitting device 2_T is shown in the flow chart of Fig. 4. The process starts with user input in step I. In step II, a suitable analysis determines whether this user input contains an audio header CP, and the following step III then checks whether all required parts of the audio header are present and can be clearly identified. If not, step IV starts a dialogue, i.e. questions are put to the user and the answers are analysed, until all required parts of the audio header have been identified. A typical example of a misinterpretation might be caused by the following message: "Private message to Julie: Ann, shall we have lunch together today?" This message might be interpreted as the audio header "Private message to Julie" and the main part "Ann, shall we have lunch together today?", or as the audio header "Private message to Julian" and the main part "shall we have lunch together today?" In this case, the system might prompt "Do you want to send a private message to Julian?", and the sender U_S can answer "No, I want to send a private message to Julie". Here, the answer resolves the misinterpretation by specifying the first of the possible options. In step V, the audio body (i.e. the main part MP) can be separated from the audio header CP. Further processing steps may then follow in the dialogue. In the example above, the user is asked whether other information is to be sent along with the audio message AM, i.e. whether an image or a video is to be sent. Other attachments, such as documents, can equally accompany the audio message AM. If the user confirms, processing step VII can determine which image or video is to be attached to the message. A further prompt in step VI can ask whether more pictures, videos, etc. are to be added. Once the message is complete, the message is sent in step VIII.
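The flow of Fig. 4 can be approximated in a short sketch that starts from competing recognition hypotheses such as the Julie/Julian example. It is not the implementation of the described device: the `Hypothesis` and `OutgoingMessage` types, the `ask` and `pick_attachment` callbacks, and the simple keyword matching of the sender's answer are assumptions chosen to keep the example small.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    recipient: str  # identifier string IS from the presumed header
    main_part: str  # main part MP under this interpretation


@dataclass
class OutgoingMessage:
    recipient: str
    main_part: str
    attachments: List[str] = field(default_factory=list)


def _words(answer: str) -> List[str]:
    return re.findall(r"\w+", answer.lower())


def prepare_message(
    hypotheses: List[Hypothesis],
    ask: Callable[[str], str],
    pick_attachment: Callable[[], str],
) -> Optional[OutgoingMessage]:
    # Steps II/III: is there a header, and is it unambiguous?
    if not hypotheses:
        return None
    chosen = hypotheses[0]
    if len(hypotheses) > 1:
        # Step IV: clarify the header in a dialogue with the sender.
        words = _words(ask(f"Do you want to send a private message to {chosen.recipient}?"))
        named = [h for h in hypotheses if h.recipient.lower() in words]
        if named:
            chosen = named[0]  # the sender named one of the candidates
        elif "yes" not in words:
            return None        # the sender rejected the header: abort
    # Step V: the main part is now separated from the header.
    message = OutgoingMessage(chosen.recipient, chosen.main_part)
    # Steps VI/VII: ask for attachments until the sender declines.
    question = "Do you want to attach an image or video?"
    while "yes" in _words(ask(question)):
        message.attachments.append(pick_attachment())
        question = "Do you want to add another image or video?"
    # Step VIII: the message is complete and can be handed over for sending.
    return message
```

With scripted answers "No, I want to send a private message to Julie", "yes" and "no", the sketch selects the Julie hypothesis, attaches one item and returns the completed message.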

At the receiving device 2_R, the control information part CP and the main part MP of the audio message AM are received by the network interface 14 and processed by the messaging application 13 in this device. The message is output by the dialogue control unit 3 and, if necessary, by the prompt generator 9, the audio control unit 8 and the loudspeaker 12 of the user interface 10 of the receiving device 2_R.

In order to avoid outputting the message when the intended recipient U_R is not in the room, is busy with something else, or is together with other people who should not learn the content of the message, the receiving device 2_R analyses the situation beforehand. For example, the movable user interface (see Fig. 2) can rotate in order to scan the whole room with the aid of a camera 16. Using known image processing techniques, it can be determined whether the intended recipient U_R is in the room. The intended recipient U_R can be identified with the aid of identifier characteristics IC that are stored in the memory and associated with the different identifier strings IS.

To this end, the identifier string IS accompanying the message is used by the messaging application 13, or by a similar suitable module of the receiving device 2_R, to retrieve the corresponding identifier characteristics IC from the memory 15 and to identify the recipient U_R using these identifier characteristics IC. The identifier characteristics IC are biometric data that are used in the image processing to identify the recipient U_R among other people in the room.

Similarly, speaker identification characteristics can also be used. In this example, for instance, the dialogue control unit 3 can ensure that only the audio header CP ("Private message to Carl") is output via the audio control unit 8 and the user interface 10 of the receiving device 2_R, followed by a supplementary prompt generated by the prompt generator 9: "Do you want to hear the message now?" When the user thus addressed answers, the spoken answer can be analysed by the speech recognition unit 5 and the language understanding unit, and its validity can be checked at the same time by speaker identification, in which the extracted characteristics are compared with the identifier characteristics IC in the memory 15 in order to determine whether the answer comes from the correct user and authorized recipient.

In addition, with the aid of the camera 16 and common image processing techniques, it can be determined whether the user is talking with other people, is on the telephone, or is in some other situation that prevents the message from being received.

If the recipient U_R is not in the room or cannot receive the message AM, the message is buffered and output at a later time. If the recipient U_R indicates a wish to listen to the message in private, the receiving device 2_R likewise buffers the audio message AM and plays it only when the recipient U_R is alone in the room again, or when the recipient has ensured, for example by putting on headphones, that the audio message AM can be heard in private.
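The decision between immediate output and buffering can be summarized in a small sketch. The `RoomState` fields are assumed to be filled by whatever image processing and speaker identification the receiving device performs; the field names, the `Action` enum and the `decide` function are hypothetical and only illustrate the rules stated above.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    PLAY = "play"
    BUFFER = "buffer"


@dataclass
class RoomState:
    recipient_present: bool      # intended recipient identified in the room
    recipient_busy: bool         # e.g. talking to others or on the telephone
    others_present: bool         # other people could overhear the output
    headphones_on: bool = False  # recipient has ensured private listening


def decide(state: RoomState, private: bool) -> Action:
    """Choose whether to play the message now or buffer it for later."""
    if not state.recipient_present or state.recipient_busy:
        return Action.BUFFER
    if private and state.others_present and not state.headphones_on:
        # Private message and other listeners: wait until it can be heard alone.
        return Action.BUFFER
    return Action.PLAY


# Usage: a private message while a guest is in the room is held back.
action = decide(RoomState(True, False, True), private=True)
```

A buffered message would be re-evaluated with a fresh `RoomState` at a later time; how often the room is re-scanned is left open here.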

The user interface 10 of the receiving device 2_R advantageously turns its front 17 towards the authorized recipient of the message identified by the receiving device 2_R; that is, when outputting a dialogue prompt, the audio message AM or the main part of the audio message AM, the receiving device 2_R turns to face the recipient U_R directly. This manner of output, and other advantageous ways of realizing the receiving device 2_R or the transmitting device 2_T in the form of a dialogue system, are described in document DE 102 49 060 A1.

Although the present invention has been disclosed in the form of preferred embodiments and variations thereon, it will be understood that numerous additional modifications and variations could be made thereto without departing from the scope of the invention. In particular, the transmitting device and/or the receiving device can, for example, be constructed using architectures other than those described.

For the sake of clarity, it is also to be understood that the use of "a" or "an" throughout this application does not exclude a plurality, and "comprising" does not exclude other steps or elements. Unless explicitly described as a single entity, a "unit" may comprise a number of blocks or devices.

Claims

1. A method for sending an audio message (AM) from a sender (U_S) to a recipient (U_R) over an audio messaging system, comprising the following steps:

- collecting the audio message of the sender (U_S) by means of a transmitting device (2_T);

- analysing the audio message (AM) in order to detect a control information part (CP) concerning communication specifications of the message (AM) and a main part (MP) comprising the effective message which is to be sent to the recipient (U_R),

wherein the control information part (CP) of the audio message (AM) is at least partially interpreted in order to control the audio messaging system (1) for communicating the (specific) audio message (AM);

- transmitting at least the main part (MP) of the audio message (AM) to a receiving device (3);

- presenting at least the main part (MP) of the audio message (AM) to the recipient (U_R).

2. A method according to claim 1, wherein the control information part (CP) of the audio message (AM) is at least partially transmitted to the receiving device (3) and interpreted in order to control the presentation of the audio message (AM) to the recipient (U_R).

3. A method according to claim 1 or 2, wherein the control information part (CP) of the audio message (AM) is at least partially presented to the recipient (U_R).

4. A method according to any one of claims 1 to 3, wherein the audio message (AM) is constructed according to a defined composite structure in which the control information part (CP) is located at a specific position relative to the main part (MP).

5. A method according to any one of claims 1 to 4, wherein the control information part (CP) in the audio message is identified by using automatic speech recognition techniques.

6, according to the method for claim 5, wherein, if the values of ambiguity of the recognition result of automatic speech recognition configuration (7) reaches or surpasses the specific ambiguity limit, then between described audio message transfer system (1) and transmit leg, start the control information part (CP) that described audio message (AM) is discerned in dialogue automatically.

7. A method according to any one of claims 1 to 6, wherein a unique identifier string (IS) is associated with each possible user or user group of the audio messaging system, and the control information part (CP) of the audio message (AM) comprises the identifier string (IS) associated with the recipient (U_R) of the audio message (AM).

8. A method according to any one of claims 1 to 7, wherein the identifier string (IS) of a user or user group is associated with identifier characteristics (IC) of the user or the user group and/or of the different members of the user group.

9. A method according to claim 8, wherein the authorized recipient (U_R) of the audio message (AM) is identified on the basis of the identifier characteristics (IC) before the main part (MP) of the audio message (AM) is presented.

10. A method according to claim 8 or 9, wherein the sender (U_S) of the audio message (AM) is identified on the basis of the identifier characteristics (IC).

11. A method according to any one of claims 1 to 10, wherein the current situation of the identified recipient (U_R) is analysed automatically, and the audio message (AM) is presented to the recipient (U_R) in a specific form and/or at a specific time according to this situation.

12. A method according to claim 10 or 11, wherein a dialogue is automatically started between the audio messaging system (1) and the recipient (U_R) in order to identify the recipient (U_R) and/or to analyse the current situation.

13. A method according to any one of claims 1 to 12, wherein at least the main part (MP) of the audio message (AM) is presented to the recipient by means of a user interface (10) comprising an automatically orientable front (17), which is oriented to face the recipient while the message is presented.

14. An audio messaging system (1) for sending an audio message (AM) from a sender (U_S) to a recipient (U_R), comprising:

- a transmitting device (2_T) having a user interface (10) for collecting the audio message (AM) of the sender (U_S);

- a message analyser (7) for analysing the audio message in order to detect a control information part (CP) concerning communication specifications of the audio message (AM) and a main part (MP) comprising the effective message which is to be sent to the recipient (U_R);

- an interpretation unit (4) for at least partially interpreting the control information part (CP) of the audio message (AM) in order to control the audio messaging system (1) for communicating the (specific) audio message (AM);

- a receiving device (2_R) having a user interface (10) for presenting at least the main part (MP) of the audio message (AM) to the recipient (U_R);

- means (13, 13, N) for transmitting at least the main part (MP) of the audio message (AM) from the transmitting device (2_T) to the receiving device (2_R).

15. A transmitting device (2_T) for an audio messaging system (1) according to claim 14, comprising:

- a user interface (10) for collecting the audio message (AM) of the sender (U_S);

- a message analyser (7) for analysing the audio message in order to detect a control information part (CP) concerning communication specifications of the audio message (AM) and a main part (MP) comprising the effective message which is to be sent to the recipient (U_R);

- an interpretation unit (4) for at least partially interpreting the control information part (CP) of the audio message (AM) in order to control the audio messaging system (1) for communicating the (specific) audio message (AM); and

- a sending interface (13, 14) for transmitting at least the main part (MP) of the audio message (AM) to a receiving device (2_R).

16. A receiving device (2_R) for an audio messaging system according to claim 14, comprising:

- a receiving interface (13, 14) for receiving an audio message (AM) sent by a transmitting device (2_T), the audio message (AM) comprising a control information part (CP) concerning communication specifications of the audio message (AM) and a main part (MP) comprising the effective message which is to be sent to a specific recipient (U_R);

- a user interface (10) for presenting at least the main part of the audio message to the recipient; and

- an interpretation unit (4) for at least partially interpreting the control information part (CP) of the audio message (AM) in order to control the audio messaging system (1) for presenting the audio message (AM).