TWI440346B - Open architecture based domain dependent real time multi-lingual communication service - Google Patents

Open architecture based domain dependent real time multi-lingual communication service

Publication number
TWI440346B
Authority
TW
Taiwan
Prior art keywords
communication
client
private key
translation
language
Prior art date
Application number
TW098114753A
Other languages
Chinese (zh)
Other versions
TW201006190A (en)
Inventor
Sasha Porto Caskey
Danning Jiang
Wen Liu
David Lubensky
Yong Qin
Andrzej Sakrajda
Cheng Wu
Original Assignee
IBM
Priority date
Filing date
Publication date
Priority to US12/113,567 (US8270606B2)
Application filed by IBM
Publication of TW201006190A
Application granted
Publication of TWI440346B

Classifications

    • G06F40/58
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to network resources
    • H04L63/104 Grouping of entities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L63/0442 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply asymmetric encryption, i.e. different keys for encryption and decryption

Description

Domain-dependent instant multilingual communication service based on open architecture

The present invention relates to multilingual communication and, more particularly, to systems and methods for instant multilingual translation communication.

Increasing economic globalization and the popularity of social networking have led to an increasing number of talks between people using different languages. Participants can be further grouped by the subject of the conversation (domain). The challenge is how to organize this multilingual conversation based on interest groups and find an effective way to host this multilingual conversation on the Internet.

There is currently no effective solution to this problem and there is no service that provides an actual instant multilingual conversation environment. Today's speech and language technologies (automatic speech recognition, machine translation, and text-to-speech) are mature enough to help cross-language conversations in well-defined domains. However, the challenge of having an open structure for organizing such cross-language conversations and making the open architecture available to many people, such as social networking connection groups, cannot be solved separately by such techniques.

Direct communication between a client and a server over the Internet is often not possible because of firewalls and proxy servers between the peers. Direct client-server links are therefore not a practical means of communication under such conditions. Distributed speech recognition (DSR) solutions based on data streams do not provide control channels, so they lack the flexibility to dynamically select different languages or domains.

A system and method for instant network communication provides a session identifier as a public key for group communication between clients, and provides a channel identifier representing a private key of each of a plurality of clients. The channel identifier includes client-specific attributes that indicate grouping criteria for the group communication. A dynamic communication link between a client and a service is established via a network based on the public key and private key combination, enabling group communication based on the attributes of the private key and the public key. A communication is translated using a translation service, which uses the attributes associated with the private key and public key combination to provide response information in a specified language, enabling multilingual instant communication.

A system and method for instant multilingual communication includes providing a channel identifier representing a private key of each of a plurality of clients and providing a session identifier as a public key for client communication. A dynamic link between a client and a service is established over a network for communication by using the public key and private key combination. A communication is translated using a translation service, which uses the attributes associated with the private key and public key combination to provide response information in a specified language, enabling multilingual instant communication.

A method for instant multilingual communication provides a channel identifier representing a private key for each of a plurality of clients, wherein the private key includes a choice of language and the manner in which each client receives communications. A session identifier is provided as a public key for the duration of a communication session between the clients attempting to communicate. A dynamic link between a client and a service is established over a network for communication by using the public key and private key combination. The communication is delivered via the network by using a web service. The communication is translated by a translation service provided by the web service, which uses the attributes associated with the private key and public key combination to provide response information in a specified language, enabling multilingual instant communication. The communication and its translation are provided to all clients participating in the session according to each client's choice of language.

A system for instant multilingual communication includes a client device including a program configured to request a session and generate a channel identifier representing a private key, wherein the private key includes a language selection and the manner in which each client receives communications. A server is connected to the client via a network and includes a web service configured to provide a session identifier as a public key for a communication session between clients attempting to communicate. A dynamic link between the client and the web service is established over the network for communication by using the public key and private key combination. The web service is configured to deliver communications over the network. The web service includes a translation service for translating communications, which uses the attributes associated with the private key and public key combination to provide response information in a specified language for multilingual instant communication.

These and other features and advantages will be apparent from the following description of the preferred embodiments of the invention.

The disclosure will provide details in the following description of the preferred embodiments with reference to the following figures.

In accordance with the principles of the present invention, an open architecture based solution is provided for language translation. In one embodiment, the architecture is based on a web service, a software system that supports interoperable interactions via a network (specifically, the Internet), including traversal of firewalls. The open architecture preferably uses a public key (organizer session ID) and a private key (participant session ID) to dynamically connect each participant to the correct interest group (topic/domain group). The architecture supports speech-to-speech, text-to-text and text-to-speech translation systems over the Internet or other networks, accessed from network devices all over the world, such as personal computers (PCs), personal digital assistants (PDAs), mobile phones or the like. The open structure of web-based services using public and private keys gives many people access to instant cross-language conversations via the Internet or other networks.

Embodiments of the invention may take the form of a fully hardware embodiment, a fully software embodiment or an embodiment comprising both a hardware component and a software component. In a preferred embodiment, the invention is implemented in software including, but not limited to, firmware, resident software, microcode, and the like.

Furthermore, the present invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can include, store, communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of computer-readable media include semiconductor or solid state memory, magnetic tape, removable computer diskettes, random access memory (RAM), read only memory (ROM), rigid magnetic disks and optical disks. Current examples of optical disks include compact disc-read only memory (CD-ROM), compact disc-read/write (CD-R/W), and DVD.

A data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories that provide temporary storage of at least some program code to reduce the number of times code must be retrieved from bulk storage during execution. Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or via intervening I/O controllers.

The network adapter can also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices via intervening private or public networks. Data modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.

Referring now to the drawings, in which like reference numerals refer to the same or similar elements, and initially to FIG. 1, system/method 10 includes an open architecture for multilingual interaction over a network. A speech-to-speech translation system for Internet use is provided. The client 12 communicates with the server via the web service 14. Web service 14 provides standardized access to Internet services or other network services. The service is preferably reachable through firewalls and is not limited by the platform operating system or the programming language used by the application. Therefore, the client 12 can easily communicate with the server 14. System 10 avoids transmitting raw speech data, which reduces the transmission bit rate between client 12 and server 14. Speech features are extracted at the client 12 and sent to the server 14, which performs speech recognition and translation 16.

In one example of speech recognition component 16, the corresponding transmission bit rate may be, for example, 41.6 kbps without compression, which is much lower than the bit rate of the raw speech data. Since no distortion occurs in this process, speech recognition performance is preserved. The transmission bit rate can be further reduced to, for example, 4 kbps using compression algorithms such as vector quantization (VQ), at the cost of a slight effect on speech recognition performance. These bit rates illustrate the reductions that can be achieved in accordance with this embodiment and should not be construed as limiting.
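
The description gives the bit rates but not the frame parameters behind them. As a rough sketch, the arithmetic below shows one combination that happens to reproduce the quoted figures. The frame rate, coefficient count, coefficient width, quantized frame size and raw audio format are all assumptions chosen for illustration, not values from the description.

```python
# All parameters below are illustrative assumptions, not values from the description.
FRAMES_PER_SECOND = 100   # assumed: one feature vector every 10 ms
COEFFS_PER_FRAME = 13     # assumed: 13 cepstral coefficients per frame
BITS_PER_COEFF = 32       # assumed: uncompressed 32-bit floats
VQ_BITS_PER_FRAME = 40    # assumed: codebook indices totalling 40 bits per frame


def bit_rate_bps(frames_per_second: int, bits_per_frame: int) -> int:
    """Bit rate of a stream that sends one fixed-size frame at a fixed rate."""
    return frames_per_second * bits_per_frame


uncompressed_features = bit_rate_bps(FRAMES_PER_SECOND, COEFFS_PER_FRAME * BITS_PER_COEFF)
vq_features = bit_rate_bps(FRAMES_PER_SECOND, VQ_BITS_PER_FRAME)
raw_pcm = 16_000 * 16     # assumed raw audio: 16 kHz, 16-bit mono

print(uncompressed_features, vq_features, raw_pcm)  # 41600 4000 256000
```

Under these assumptions the uncompressed feature stream is about one sixth of the raw audio rate, and the quantized stream roughly a tenth of that again.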

In system 10, server 14 resources are conserved by sending only meaningful signals. This is achieved by adding a voice segmentation component 18 at the client 12. As a speech signal is recorded, segmentation component 18 detects the boundaries between speech and silence or noise in the speech stream. Once a speech segment is detected, the relevant features are extracted and sent to the server to obtain translation results, while silence and noise segments are discarded by the client 12.
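
The description does not say how segmentation component 18 finds speech boundaries. One common and very simple approach is a frame energy threshold, sketched below. The function name `segment_speech`, the frame length and the threshold are hypothetical; a production voice activity detector would be considerably more robust to noise.

```python
from typing import List, Tuple


def segment_speech(samples: List[float], frame_len: int = 160,
                   threshold: float = 0.01) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges whose mean frame energy exceeds threshold."""
    segments: List[Tuple[int, int]] = []
    start = None  # start index of the speech segment currently open, if any
    for offset in range(0, len(samples), frame_len):
        frame = samples[offset:offset + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > threshold:
            if start is None:
                start = offset          # speech begins at this frame
        elif start is not None:
            segments.append((start, offset))  # speech ended at this frame
            start = None
    if start is not None:
        segments.append((start, len(samples)))  # speech ran to the end
    return segments


# Two silent frames, three loud frames, one silent frame.
signal = [0.0] * 320 + [0.5] * 480 + [0.0] * 160
print(segment_speech(signal))  # [(320, 800)]
```

Only the samples inside the returned ranges would be passed on to feature extraction; everything else stays on the client.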

The distributed speech recognition (DSR) solution transmits extracted speech features rather than audio files encoded by standard codecs. Since the format of the speech features is vendor specific, the DSR solution for speech-to-speech translation provides another level of security by using speech feature extraction as an encryption method.

Because the translation service uses the DSR method on top of the web service 14, the client application can conveniently select an appropriate translation domain as needed. The domain selection can be set dynamically as a web service input parameter, as can the language choice if necessary. The DSR-based web service 14 method thus enables the client 12 to use domain-specific speech-to-speech translation services as needed.

This architecture 10 can be easily extended to a situation where multiple parties participate in an Internet community chat with the help of a translation service. In the DSR-based web service model, the chat organizer 20 or 22 sends a unique session ID (public key) to the web service 14 to identify the call, and either broadcasts the key to each participant or publishes the key at a location accessible to the Internet community. Each individual participant 12 establishes a channel ID (private key) by appending attributes such as language, domain, location and user ID to the public key. The participant 12 can then send a request together with the private key to the web service 14. The master translation service 16 selects the individual participants based on the public key to form a large call group. Next, the translation service 16 classifies the individual participants into small groups based on key elements within the private key. For example, Chinese speakers will be placed in one group and, depending on the area of interest, further divided into small groups such as "China Beijing Tourism" and "China Shanghai Tourism".
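
The two-level grouping described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `ChannelId` class, its field names and the sample session ID are hypothetical, and no cryptography is involved since the description uses "public key" and "private key" as identifiers carrying attributes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ChannelId:
    """Private key: the shared session ID plus client-specific attributes."""
    session_id: str  # public key published by the chat organizer
    user_id: str
    language: str
    domain: str


def group_participants(
    channels: Iterable[ChannelId],
) -> Dict[Tuple[str, str, str], List[str]]:
    """Group users by session (large call group), then by language and domain."""
    groups: Dict[Tuple[str, str, str], List[str]] = defaultdict(list)
    for channel in channels:
        key = (channel.session_id, channel.language, channel.domain)
        groups[key].append(channel.user_id)
    return dict(groups)


participants = [
    ChannelId("session-42", "alice", "en", "China Beijing Tourism"),
    ChannelId("session-42", "wei", "zh", "China Beijing Tourism"),
    ChannelId("session-42", "lan", "zh", "China Shanghai Tourism"),
    ChannelId("session-42", "jun", "zh", "China Beijing Tourism"),
]
groups = group_participants(participants)
print(groups[("session-42", "zh", "China Beijing Tourism")])  # ['wei', 'jun']
```

Keying the groups on the full attribute tuple keeps the large group (same session ID) and the small groups (same language and domain) in one structure.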

The web service 14 with the translation function 16 acts as a smart routing agent to organize this multilingual chat across different domains or groups 20, 22. The destination of a translated utterance is determined dynamically by the attributes of the original request and the content of the utterance, such as language and domain. For example, if an English-speaking participant has a question for people living on the east coast of China, the web service host 14 with the translation service 16 will send the translated utterance to the small group with the best match of language and domain. Therefore, this architecture 10 is an open architecture, which makes it possible to offer this solution to many people via the Internet.

In an illustrative example, many members or clients 12 of the Internet community wish to participate in multilingual chats on different topics via voice, text, or both. Each participant expects the multimodal input to be presented in the language that participant selected. The chat organizer 20 or 22 establishes a unique session ID (public key), submits it to the web service 14 to register the call, and publicly announces the key to the community. Each individual client 12 downloads the client software as necessary, including a DSR front end and a text-to-speech (TTS) synthesizer.

Individual participants 12 establish individual channel IDs (private keys) by appending all tags (such as source language, target language, domain, location, action ID) to the public key. Participant 12 can then send a web service request with its private key and an attached body (text, or voice in the form of speech features) to the master web service 14. Web service 14 may use the Simple Object Access Protocol (SOAP), an XML-based protocol, over Hypertext Transfer Protocol (HTTP). Web service requests can use standard HTTP, so they can pass through firewalls.
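
To make the request format concrete, the sketch below builds a SOAP 1.1 envelope carrying a channel ID and a text body. Only the SOAP envelope namespace is standard. The service namespace, the element names `TranslateRequest`, `ChannelId` and `Text`, and the channel ID string layout are hypothetical, since the description does not define a message schema.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:translation"                  # hypothetical service namespace


def build_request(channel_id: str, text: str) -> bytes:
    """Serialize a translation request as a SOAP envelope (illustrative schema)."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{SERVICE_NS}}}TranslateRequest")
    ET.SubElement(request, f"{{{SERVICE_NS}}}ChannelId").text = channel_id
    ET.SubElement(request, f"{{{SERVICE_NS}}}Text").text = text
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical channel ID: session ID followed by source language, target language, domain.
payload = build_request("session-42|en|zh|tourism", "Where is the station?")
print(payload.decode("utf-8"))
```

The resulting bytes would be sent as the body of an ordinary HTTP POST, which is why such requests pass through firewalls that already allow web traffic.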

The master web service 14 with the translation function 16 acts as a smart multilingual routing agent to dynamically pass requests to the correct chat group and distribute the input to all registered clients in the correct language within the selected chat group. In illustration 32, the web service 14 includes a routing table 34 that ensures that each client 12 receives the appropriate language translation for chat group 20. In illustration 36, the web service 14 includes a routing table 38 that ensures that each client 12 receives the appropriate language translation for chat group 22.
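
A routing table of this kind can be pictured as a map from chat group to registered clients and their languages. The sketch below is an assumption-laden illustration: the `RoutingTable` class and its methods are hypothetical, and `placeholder_translate` merely tags the text where a real system would call a machine translation engine.

```python
from typing import Callable, Dict

Translator = Callable[[str, str, str], str]


def placeholder_translate(text: str, source: str, target: str) -> str:
    """Stand-in for a real machine translation call; it only tags the text."""
    return text if source == target else f"[{source}->{target}] {text}"


class RoutingTable:
    """Maps each chat group to its registered clients and their languages."""

    def __init__(self, translate: Translator) -> None:
        self._translate = translate
        self._groups: Dict[str, Dict[str, str]] = {}

    def register(self, group: str, client_id: str, language: str) -> None:
        self._groups.setdefault(group, {})[client_id] = language

    def distribute(self, group: str, sender_id: str, text: str) -> Dict[str, str]:
        """Return the text each other group member receives, in that member's language."""
        members = self._groups[group]
        source = members[sender_id]
        return {
            client_id: self._translate(text, source, language)
            for client_id, language in members.items()
            if client_id != sender_id
        }


table = RoutingTable(placeholder_translate)
table.register("tourism", "x", "en")
table.register("tourism", "y", "zh")
table.register("tourism", "z", "en")
print(table.distribute("tourism", "x", "Hello"))  # {'y': '[en->zh] Hello', 'z': 'Hello'}
```

Passing the translator in as a callable keeps the routing logic independent of whichever translation backend is used.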

The master translation service 16 selects the individual clients 12 based on the public key to maintain a large call group. The translation service then classifies the individual participants into small groups based on key tags within the private key. For example, participants whose target language is Chinese will be placed in one large group. Depending on the domain (topic), these users are further divided into smaller groups as needed. The destination of a translated utterance is determined dynamically by the tags in the original request.

Referring to Figure 2, a cross-language chat between two client terminals 12 (designated as client X and client Y) is illustratively shown. The public key and private key combination is used to establish a dynamic link between the client 12 and the service 110. The context of the submitted request and the filter to be applied to the data passed to the client 12 are defined entirely by such key combination. The attributes associated with the key combination define the filter to be applied to the data passed back to the client. In other words, the private key is generated in such a way that the client can receive a translation of the selected language or a specific type of communication. This is useful for providing content security or age or category specific screening (such as for certain communications that are not suitable for children, etc.).
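
One way to read the filtering described above is as a predicate built from the attributes in the key combination. The following is a minimal sketch under that reading. The `Message` and `KeyAttributes` classes, their fields and the category labels are hypothetical and are not taken from the description.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Message:
    language: str
    modality: str  # "voice", "text" or "video"
    category: str  # hypothetical content label, e.g. "general" or "adult"
    body: str


@dataclass(frozen=True)
class KeyAttributes:
    """Attributes assumed to be carried by a client's key combination."""
    language: str
    modalities: FrozenSet[str]
    blocked_categories: FrozenSet[str] = frozenset()


def apply_filter(messages: List[Message], attrs: KeyAttributes) -> List[Message]:
    """Keep only messages matching the client's language, modality and category rules."""
    return [
        m for m in messages
        if m.language == attrs.language
        and m.modality in attrs.modalities
        and m.category not in attrs.blocked_categories
    ]


inbox = [
    Message("zh", "text", "general", "m1"),
    Message("zh", "voice", "general", "m2"),
    Message("zh", "text", "adult", "m3"),
    Message("en", "text", "general", "m4"),
]
child_key = KeyAttributes("zh", frozenset({"text"}), frozenset({"adult"}))
print([m.body for m in apply_filter(inbox, child_key)])  # ['m1']
```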

The web service 14 acts as a smart routing agent and is responsible for distributing the message load. All clients 12 subscribe to a particular topic/domain by polling, in polling mode 112, for the available material, data sources or information associated with the session group they have joined, whether voice, text or video. Dynamic access to the distributed service 110 is provided for any device (e.g., PC, PDA, mobile phone, etc.) that has network access.

The network activities that can be driven by this dynamic key combination span a wide range, such as text or voice translation, cross-language image and video sharing, and cross-language Internet competitions. Service 110 includes, inter alia, distributed speech recognition (DSR) 104, machine translation (MT) 106, and text-to-speech (TTS) 108.

The DSR module 104 receives extracted speech features, such as spectral features, rather than audio files encoded by a standard codec. Since the format of the speech features is vendor specific, the DSR module 104 for speech-to-speech translation provides another level of security by using speech feature extraction as an encryption method. The DSR module 104 supports the translation services, and the client application can conveniently select the appropriate translation domain as needed. The domain selection can be set dynamically as a web service input parameter, as can the choice of language. The DSR-based web service 14 enables the client 12 to use domain-specific speech-to-speech translation services as needed.

To further illustrate the advantages of the present invention, illustrative examples will be presented. A method based on dynamic key combination can be used for cross-language personal ID checking on the Internet. Each personal ID includes a private key, and the requirement for a particular group/domain is a public key. This can be used in social networks to check personal IDs across languages and provide security (for example) to protect groups of teenagers and children.

In a cross-language network conference call scenario, a participant (client 12) can speak in a first language while the server 110 uses a second language (based on the public key). Each request 122 can carry all of the needed information via its private key, and the server 110 can distribute the translated output (voice, text, video), for example text-to-speech voice 120, to individual participants in the appropriate language (e.g., the first language, as chosen by the user). In this manner, each client 12 can speak in its native language and receive responses from other participants in that same native language, even if the other participants speak other languages.

Referring to FIG. 3, further details of an exemplary embodiment of a streaming mode are described, in which information is streamed via a network, for example by using Voice over Internet Protocol (VoIP) telephony. Three IDs are used: ID_US and ID_China for clients X and Y (12), and ID_RTTS for the real-time translation server (RTTS) 310. Clients X and Y can use a telephone interface, for example a VoIP interface. Both ID_China and ID_US have installed plugin 302, which allows access to the provided web services and enables multilingual communication by allowing the collection of speech features.

A user in the United States (client X) wishes to speak to a user in China (client Y). Assume that each has the other's ID in its contact list. Client X, or ID_US, selects client Y, or ID_China, and presses the "call" button, which sends a request to ID_China by using the chat application programming interface (API) 303. ID_China presses its "Accept Call" button to indicate that it is ready. After ID_US receives the response from ID_China, ID_US sends a request to the RTTS web service 318 for call scheduling. The RTTS web service 318 generates channel IDs with language tags, such as number.001 (English) and number.002 (Chinese), where "number" can be a phone number. These two numbers are passed back to ID_US and are also passed to the dialogue manager (DM) 330.

ID_US passes number.002 (Chinese) to ID_China via the chat API 303. Both ID_US and ID_China then call the RTTS server 310 using their individually assigned channel IDs, number.001 and number.002. The RTTS Session Initiation Protocol (SIP) endpoint program 328 handles these two incoming calls separately based on the given channel IDs with their language tags. This example shows an Internet protocol connection 312 and uses a SIP signaling proxy server 306 and a Real-time Transport Protocol (RTP) proxy server 308 that includes an encoder/decoder (codec) 307. Server 310 also illustratively includes SIP proxy server 314 and RTP proxy server 316 to provide an appropriate communication protocol between client 12 and server 310. Other network protocols and hardware are also contemplated. This embodiment should not be construed as being limited by the configuration shown.
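
Handling calls by channel ID comes down to splitting the ID into its number and its language tag. A minimal sketch is shown below, assuming the tag is the suffix after the last dot. The tag values 001 (English) and 002 (Chinese) come from the example above, while the function name `parse_channel_id` and the sample number are hypothetical.

```python
from typing import Tuple

# Language tags as given in the example above.
LANGUAGE_TAGS = {"001": "English", "002": "Chinese"}


def parse_channel_id(channel_id: str) -> Tuple[str, str]:
    """Split 'number.tag' into (number, language name); raise on unknown input."""
    number, _, tag = channel_id.rpartition(".")
    if not number or tag not in LANGUAGE_TAGS:
        raise ValueError(f"unrecognized channel ID: {channel_id!r}")
    return number, LANGUAGE_TAGS[tag]


# The number below is a made-up placeholder.
print(parse_channel_id("5550100.001"))  # ('5550100', 'English')
print(parse_channel_id("5550100.002"))  # ('5550100', 'Chinese')
```

Two incoming calls sharing the same number but carrying different tags can then be paired into one conversation while each leg keeps its own language.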

After establishing two calls, a push-to-talk (P&T) button on both clients 12 can indicate a call ready state. The P&T button can be generated as part of the plug-in and can be generated with any indicator on the computer screen or such buttons and indicators can be provided on the telephone device. ID_US presses the P&T button and sends the audio stream to the RTTS 310. In one example, the audio stream is encapsulated at the beginning and end by a dual tone multi-frequency (DTMF) key. The audio stream can be buffered in the audio buffer 320 when received.

A request is also sent to the RTTS web service 318 to await the textual result. The RTTS 310 can replay the incoming audio to the channel connected to ID_China. The RTTS dialogue manager (DM) 330 sends this incoming audio stream to the translation service module 340. The module 340 can include an automatic speech recognition (ASR) annotator 322, a real-time translation (RTT) annotator 324, and a text-to-speech (TTS) annotator 326. The DM 330 retrieves the recognition result and the translation result, in text form, returned from the aggregator 332 as soon as they are available from the message prompt 334. The DM 330 sends the message back to ID_US. ID_US displays the results in its chat window and simultaneously sends these results to ID_China for display. A confirmation of these results can be used to ensure that the message is received. Once the translated TTS voice is ready, the DM 330 can pass the voice to ID_China via RTP based on the channel ID. ID_China can then press the P&T button and the conversation can continue.

Referring to Figure 4, further details of an exemplary embodiment of Figure 2 with respect to a web service mode are described. The client ID includes ID_US and ID_China. Both ID_China and ID_US have installed plugins 404 that provide the required functionality to perform interface connection tasks, generate indicators, and the like.

A user in the United States (client 12) wants to talk to a user in China. Each user has the other's ID in its contact list. ID_US selects ID_China and presses the P&T button. The voice of ID_US is transformed by the feature capture module 402 into spectral features, and the features are preferably transmitted to the RTTS server 420 via SOAP/HTTP. ID_US sends a start signal to ID_China. ID_China sends a "Get Result" request to the RTTS server 420 via SOAP/HTTP. The RTTS server 420 includes a web service 14 that provides a translation service module 440 that performs recognition, translation, and TTS.

The module 440 can include a distributed speech recognition (DSR) annotator 420, a text-to-speech (TTS) annotator 418, and a real-time translation (RTT) annotator 416. The DM 406 retrieves the recognition result and the translation result, in text form, returned from the aggregator 412 as soon as they are available from the message prompt 414. The DM 406 manages the dialogue between the participants and responds to ID_US with the textual recognition result and the translated result. ID_US displays the results in its chat window, and at the same time the translated results can be sent to ID_China for display using, for example, an instant messaging (IM) API 407. A confirmation can be used to ensure that the message is received. In this example, RTTS server 440 responds to ID_China with TTS, and plugin 404 plays this TTS back to ID_China.

Having described preferred embodiments of an open architecture based, domain-dependent, real-time multilingual communication service (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments disclosed which are within the scope and spirit of the invention as set forth in the appended claims. What is claimed and desired to be protected by Letters Patent is set forth in the appended claims.

10. . . system structure

12. . . Client/participant

14. . . Web service/server/web service host/master web service

16. . . Speech recognition component / master translation service / translation function

18. . . Voice segmentation component

20, 22. . . Chat organizer

32. . . illustration

34. . . Routing table

36. . . illustration

38. . . Routing table

104. . . Distributed speech recognition (DSR)/DSR module

106. . . Machine translation (MT)

108. . . Text to speech (TTS)

110. . . Server / Distributed service / Service

112. . . Polling mode

120. . . Text-to-speech voice

122. . . request

302. . . Plugin

303. . . Chat application programming interface (API)

306. . . Session Initiation Protocol (SIP) signaling proxy server

308. . . Real-time Transport Protocol (RTP) proxy server

312. . . Internet protocol connection

314. . . SIP proxy server

316. . . RTP proxy server

320. . . Audio buffer

322. . . Automatic Speech Recognition (ASR) Annotator

324. . . Real-time translation (RTT) annotator

326. . . Text-to-speech (TTS) annotator

328. . . RTTS Session Initiation Protocol (SIP) endpoint program

330. . . Dialogue Manager (DM)

332. . . Aggregator

334. . . Message prompt

340. . . Translation service module

402. . . Feature capture module

404. . . Plugin

406. . . DM

407. . . Instant Messaging (IM) API

412. . . Aggregator

414. . . Message prompt

416. . . Real-time translation (RTT) annotator

418. . . Text-to-speech (TTS) annotator

420. . . RTTS Server/Distributed Speech Recognition (DSR) Annotator

440. . . Translation Service Module / RTTS Server

FIG. 1 is a block/flow diagram showing a system/method having live chat groups of participants communicating in a plurality of different languages using a translation web service in accordance with the principles of the present invention;

FIG. 2 is a block/flow diagram showing a system/method for instant communication between two clients in different languages using a translation web service;

FIG. 3 is a block/flow diagram showing the system/method of FIG. 2 in more detail for instant communication between two clients in different languages in a streaming mode in accordance with the principles of the present invention; and

FIG. 4 is a block/flow diagram showing a system/method for instant communication between two clients in different languages in a web service mode in accordance with the principles of the present invention.

Claims (24)

  1. A method for instant network communication, comprising: providing a session identifier as one of a group communication for communication between users; providing one of each of a plurality of clients a channel identifier of the private key, the channel identifier including a client-specific attribute, the attribute is used to indicate a grouping criterion of the group communication; and the establishment is based on the public key and the private key combination via a network a dynamic communication link between the client and a service to enable group communication based on the attributes of the private key and the public key; and using a translation service to translate the communication, the translation service is used The attributes associated with the private key and the public key combination provide response information in a specified language to enable multilingual instant communication.
  2. The method of claim 1, wherein the translating communication comprises translating at least one of a voice, a text, and a video.
  3. The method of claim 1, wherein the response information for a client comprises one of voice, text, and video according to the selection information provided in the private key.
  4. The method of claim 1, wherein the translation service comprises at least one of distributed speech recognition, automatic speech recognition, real-time translation, machine translation, and text-to-speech synthesis.
  5. The method of claim 1, further comprising: extracting features from a speech utterance of a user; and transmitting acoustic features of the utterance to the web service.
  6. The method of claim 1, wherein providing a session identifier comprises providing a session identifier for at least one of a chat group, a conference call, and a telephone call.
  7. The method of claim 1, wherein providing a channel identifier comprises: appending the attributes including one of a language, a domain, a location, and a user ID to the public key.
  8. A computer readable medium comprising a computer readable program for real-time multilingual communication, wherein the computer readable program, when executed on a computer, causes the computer to perform the following steps: providing a session identifier as a public key for a group communication between users; providing a channel identifier representing a private key for each of a plurality of clients, the channel identifier including client-specific attributes, the attributes being used to indicate a grouping criterion for the group communication; establishing, via a network, a dynamic communication link between a client and a service based on the public key and private key combination, such that group communication is enabled based on the attributes of the private key and the public key; and translating communications using a translation service, the translation service using the attributes associated with the private key and public key combination to provide response information in a specified language to enable real-time multilingual communication.
  9. A method for real-time multilingual communication, comprising: providing a session identifier as a public key for a client communication session between clients attempting to communicate; providing a channel identifier representing a private key for each of a plurality of clients, wherein the private key includes a language selection and a manner in which each client receives communications; establishing, via a network, a dynamic link between a client and a service for communication using the public key and private key combination; using a web service to deliver communications via the network; translating communications using a translation service provided by the web service, the translation service using attributes associated with the private key and public key combination to provide response information in a specified language to enable real-time multilingual communication; and providing the communications and translations of the communications to all clients participating in the session based on the language selection of each client.
  10. The method of claim 9, wherein providing a session identifier comprises providing a session identifier for at least one of a chat group, a conference call, and a telephone call.
  11. The method of claim 9, wherein the translating communication comprises translating at least one of a voice, a text, and a video.
  12. The method of claim 9, wherein the response information for a client includes one of voice, text, and video based on the selection provided in the private key.
  13. The method of claim 9, wherein the translation service comprises at least one of distributed speech recognition, automatic speech recognition, real-time translation, machine translation, and text-to-speech synthesis.
  14. The method of claim 9, further comprising: extracting features from a speech utterance of a user; and transmitting acoustic features of the utterance to the web service.
  15. The method of claim 9, wherein providing a channel identifier comprises appending the attributes including one of a language, a domain, a location, and a user ID to the public key.
  16. A computer readable medium comprising a computer readable program for real-time multilingual communication, wherein the computer readable program, when executed on a computer, causes the computer to perform the following steps: providing a session identifier as a public key for a client communication session between clients attempting to communicate; providing a channel identifier representing a private key for each of a plurality of clients, wherein the private key includes a language selection and a manner in which each client receives communications; establishing, via a network, a dynamic link between a client and a service for communication using the public key and private key combination; using a web service to deliver communications via the network; translating communications using a translation service provided by the web service, the translation service using attributes associated with the private key and public key combination to provide response information in a specified language to enable real-time multilingual communication; and providing the communications and translations of the communications to all clients participating in the session based on the language selection of each client.
  17. A system for real-time multilingual communication, comprising: a client device including a program configured to request a session and generate a channel identifier representing a private key, wherein the private key includes a language selection and a manner in which each client receives communications; and a server connected to the client via a network and including a web service configured to provide a session identifier as a public key for a client communication session between clients attempting to communicate, such that a dynamic link between a client and the web service is established via the network for communication using the public key and private key combination, the web service being configured to deliver communications via the network; the web service including a translation service for translating communications, the translation service using attributes associated with the private key and public key combination to provide response information in a specified language to enable real-time multilingual communication.
  18. The system of claim 17, wherein the server comprises a dialog manager configured to manage the communications between the clients such that the communications and translations of the communications are provided to all clients participating in the session based on the language selection of each client.
  19. The system of claim 17, wherein the private key and public key combination defines a context for submitting a request and a filter to be applied to the data delivered to the client.
  20. The system of claim 17, wherein the attributes associated with the key combination define a filter to be applied to the data passed back to the client.
  21. The system of claim 17, wherein the web service acts as a smart routing agent and is responsible for distributing the message load.
  22. The system of claim 17, wherein the users in a session subscribe to a particular subject/domain by polling for information communicated by at least one of voice, text, and video.
  23. The system of claim 17, wherein the session comprises a cross-language network conference call in which at least two participants speak in different languages.
  24. The system of claim 23, wherein a client requests, via the private key, that all information be presented; the server has a common language associated with the public key; and translated messages are distributed to the participants in each participant's own language.
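
As an informal illustration of the key scheme recited in claims 1, 7, and 9 (not the patented implementation), the following Python sketch shows one way a session identifier (public key) and a per-client channel identifier (private key carrying language and delivery-mode attributes) could drive routing and translation. The names ChannelId, TranslationRelay, and tag_translator, and the "|name=value" key layout, are assumptions introduced here for illustration; the stub translator only tags text and performs no real translation.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# (text, source_language, target_language) -> translated text
Translator = Callable[[str, str, str], str]


@dataclass(frozen=True)
class ChannelId:
    """Hypothetical private key: client-specific attributes."""
    user_id: str
    language: str        # language selection of this client
    mode: str = "text"   # how the client receives communication: voice, text or video
    domain: str = ""     # optional subject/domain attribute

    def combined_key(self, session_id: str) -> str:
        # Attributes are appended to the public key (the session identifier).
        return (f"{session_id}|lang={self.language}|mode={self.mode}"
                f"|domain={self.domain}|user={self.user_id}")


class TranslationRelay:
    """Hypothetical service holding one routing table per session."""

    def __init__(self, translate: Translator) -> None:
        self._translate = translate
        self._sessions: Dict[str, Dict[str, ChannelId]] = {}

    def join(self, session_id: str, channel: ChannelId) -> str:
        # Register the client under the session and return the key combination.
        self._sessions.setdefault(session_id, {})[channel.user_id] = channel
        return channel.combined_key(session_id)

    def send(self, session_id: str, sender_id: str, text: str) -> Dict[str, str]:
        # Deliver the message to every other participant in that participant's language.
        members = self._sessions[session_id]
        sender = members[sender_id]
        delivered: Dict[str, str] = {}
        for user_id, channel in members.items():
            if user_id == sender_id:
                continue
            if channel.language == sender.language:
                delivered[user_id] = text
            else:
                delivered[user_id] = self._translate(
                    text, sender.language, channel.language)
        return delivered


def tag_translator(text: str, source: str, target: str) -> str:
    # Stand-in for a real translation service: it only marks the language pair.
    return f"[{source}->{target}] {text}"


relay = TranslationRelay(tag_translator)
alice_key = relay.join("conf-42", ChannelId("alice", "en"))
relay.join("conf-42", ChannelId("bo", "zh"))
relay.join("conf-42", ChannelId("carol", "en"))
outgoing = relay.send("conf-42", "alice", "hello")
```

In this sketch the session identifier selects the group, while the attributes in each channel identifier decide whether a message is passed through unchanged or handed to the translator. A real service would also use the mode attribute to choose between voice, text, and video output.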
TW098114753A 2008-05-01 2009-05-04 Open architecture based domain dependent real time multi-lingual communication service TWI440346B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/113,567 US8270606B2 (en) 2008-05-01 2008-05-01 Open architecture based domain dependent real time multi-lingual communication service

Publications (2)

Publication Number Publication Date
TW201006190A TW201006190A (en) 2010-02-01
TWI440346B true TWI440346B (en) 2014-06-01

Family

ID=41255651

Family Applications (1)

Application Number Title Priority Date Filing Date
TW098114753A TWI440346B (en) 2008-05-01 2009-05-04 Open architecture based domain dependent real time multi-lingual communication service

Country Status (8)

Country Link
US (1) US8270606B2 (en)
EP (1) EP2274870B1 (en)
JP (1) JP5536756B2 (en)
KR (1) KR101442312B1 (en)
CN (1) CN102017513B (en)
CA (1) CA2717504C (en)
TW (1) TWI440346B (en)
WO (1) WO2009134535A2 (en)

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9128926B2 (en) * 2006-10-26 2015-09-08 Facebook, Inc. Simultaneous translation of open domain lectures and speeches
US8972268B2 (en) 2008-04-15 2015-03-03 Facebook, Inc. Enhanced speech-to-speech translation system and methods for adding a new word
US20100198582A1 (en) * 2009-02-02 2010-08-05 Gregory Walker Johnson Verbal command laptop computer and software
US8060586B2 (en) * 2009-02-03 2011-11-15 Microsoft Corporation Dynamic web service deployment and integration
US9286037B2 (en) 2010-12-29 2016-03-15 Microsoft Technology Licensing, Llc Platform for distributed applications
CN102546710B (en) * 2010-12-29 2015-07-15 上海博泰悦臻电子设备制造有限公司 Method, system and server for logging in chat groups based on mobile terminal
US9164988B2 (en) * 2011-01-14 2015-10-20 Lionbridge Technologies, Inc. Methods and systems for the dynamic creation of a translated website
US8538742B2 (en) * 2011-05-20 2013-09-17 Google Inc. Feed translation for a social network
US8175244B1 (en) 2011-07-22 2012-05-08 Frankel David P Method and system for tele-conferencing with simultaneous interpretation and automatic floor control
US20130210394A1 (en) * 2012-02-14 2013-08-15 Keyona Juliano Stokes 1800 number that connects to the internet and mobile devises
US8849666B2 (en) * 2012-02-23 2014-09-30 International Business Machines Corporation Conference call service with speech processing for heavily accented speakers
US9569274B2 (en) 2012-10-16 2017-02-14 Microsoft Technology Licensing, Llc Distributed application optimization using service groups
US9031827B2 (en) 2012-11-30 2015-05-12 Zip DX LLC Multi-lingual conference bridge with cues and method of use
TW201430593A (en) * 2013-01-25 2014-08-01 Hon Hai Prec Ind Co Ltd System and method for converting multi-language webpage
US8996352B2 (en) 2013-02-08 2015-03-31 Machine Zone, Inc. Systems and methods for correcting translations in multi-user multi-lingual communications
US8996355B2 (en) * 2013-02-08 2015-03-31 Machine Zone, Inc. Systems and methods for reviewing histories of text messages from multi-user multi-lingual communications
US9600473B2 (en) 2013-02-08 2017-03-21 Machine Zone, Inc. Systems and methods for multi-user multi-lingual communications
US9298703B2 (en) 2013-02-08 2016-03-29 Machine Zone, Inc. Systems and methods for incentivizing user feedback for translation processing
US9031829B2 (en) 2013-02-08 2015-05-12 Machine Zone, Inc. Systems and methods for multi-user multi-lingual communications
US9231898B2 (en) 2013-02-08 2016-01-05 Machine Zone, Inc. Systems and methods for multi-user multi-lingual communications
US8990068B2 (en) 2013-02-08 2015-03-24 Machine Zone, Inc. Systems and methods for multi-user multi-lingual communications
US9728202B2 (en) 2013-08-07 2017-08-08 Vonage America Inc. Method and apparatus for voice modification during a call
US9299358B2 (en) * 2013-08-07 2016-03-29 Vonage America Inc. Method and apparatus for voice modification during a call
KR101834546B1 (en) * 2013-08-28 2018-04-13 한국전자통신연구원 Terminal and handsfree device for servicing handsfree automatic interpretation, and method thereof
CN110378113A (en) * 2013-10-28 2019-10-25 日本电气株式会社 Mobile communication system, network node, user equipment and its method
US10199035B2 (en) 2013-11-22 2019-02-05 Nuance Communications, Inc. Multi-channel speech recognition
US10162811B2 (en) 2014-10-17 2018-12-25 Mz Ip Holdings, Llc Systems and methods for language detection
US9372848B2 (en) 2014-10-17 2016-06-21 Machine Zone, Inc. Systems and methods for language detection
US9389928B1 (en) 2015-02-11 2016-07-12 Microsoft Technology Licensing, Llc Platform for extension interaction with applications
US10133613B2 (en) * 2015-05-14 2018-11-20 Microsoft Technology Licensing, Llc Digital assistant extensibility to third party applications
US9569736B1 (en) * 2015-09-16 2017-02-14 Siemens Healthcare Gmbh Intelligent medical image landmark detection
US20160014059A1 (en) * 2015-09-30 2016-01-14 Yogesh Chunilal Rathod Presenting one or more types of interface(s) or media to calling and/or called user while acceptance of call
US9997173B2 (en) * 2016-03-14 2018-06-12 Apple Inc. System and method for performing automatic gain control using an accelerometer in a headset
KR101672300B1 (en) * 2016-04-18 2016-11-03 주식회사 앰버스 Chatting method and chatting system for learning language
US20170365249A1 (en) * 2016-06-21 2017-12-21 Apple Inc. System and method of performing automatic speech recognition using end-pointing markers generated using accelerometer-based voice activity detector

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5715466A (en) * 1995-02-14 1998-02-03 Compuserve Incorporated System for parallel foreign language communication over a computer network
US5987401A (en) * 1995-12-08 1999-11-16 Apple Computer, Inc. Language translation for real-time text-based conversations
US6424992B2 (en) * 1996-12-23 2002-07-23 International Business Machines Corporation Affinity-based router and routing method
US7047416B2 (en) * 1998-11-09 2006-05-16 First Data Corporation Account-based digital signature (ABDS) system
US20020029350A1 (en) * 2000-02-11 2002-03-07 Cooper Robin Ross Web based human services conferencing network
JP2001325202A (en) * 2000-05-12 2001-11-22 Sega Corp Conversation method in virtual space and system therefor
US20020026757A1 (en) * 2000-05-15 2002-03-07 Scissom James D. Access floor system
US7792676B2 (en) * 2000-10-25 2010-09-07 Robert Glenn Klinefelter System, method, and apparatus for providing interpretive communication on a network
US7035804B2 (en) * 2001-04-26 2006-04-25 Stenograph, L.L.C. Systems and methods for automated audio transcription, translation, and transfer
US9626667B2 (en) * 2005-10-18 2017-04-18 Intertrust Technologies Corporation Digital rights management engine systems and methods
JP4299320B2 (en) * 2006-06-06 2009-07-22 株式会社エヌ・ティ・ティ・ドコモ Group communication server
US20080004880A1 (en) * 2006-06-15 2008-01-03 Microsoft Corporation Personalized speech services across a network
US20080300852A1 (en) * 2007-05-30 2008-12-04 David Johnson Multi-Lingual Conference Call
US8220040B2 (en) * 2008-01-08 2012-07-10 International Business Machines Corporation Verifying that group membership requirements are met by users

Also Published As

Publication number Publication date
WO2009134535A2 (en) 2009-11-05
US8270606B2 (en) 2012-09-18
US20090274299A1 (en) 2009-11-05
KR101442312B1 (en) 2014-11-03
EP2274870B1 (en) 2016-09-07
CN102017513A (en) 2011-04-13
WO2009134535A3 (en) 2010-01-07
EP2274870A2 (en) 2011-01-19
CN102017513B (en) 2013-05-22
CA2717504A1 (en) 2009-11-05
KR20110008211A (en) 2011-01-26
CA2717504C (en) 2017-09-19
JP5536756B2 (en) 2014-07-02
JP2011520353A (en) 2011-07-14
TW201006190A (en) 2010-02-01
EP2274870A4 (en) 2015-08-05

Similar Documents

Publication Publication Date Title
CN103250410B (en) In video conference with the novel interaction systems and method of participant
US7154864B2 (en) Method and apparatus for providing conference call announcement using SIP signalling in a communication system
US7130403B2 (en) System and method for enhanced multimedia conference collaboration
KR101252609B1 (en) Push-type telecommunications accompanied by a telephone call
JP2007014006A (en) Monitoring method of data packet transmitted across computer network
US7756923B2 (en) System and method for intelligent multimedia conference collaboration summarization
US7313593B1 (en) Method and apparatus for providing full duplex and multipoint IP audio streaming
KR101652122B1 (en) Real-time voip communications using n-way selective language processing
US20080270558A1 (en) Method, system and device for realizing group-sending message service
US20070155346A1 (en) Transcoding method in a mobile communications system
US8326596B2 (en) Method and apparatus for translating speech during a call
CN102771082B (en) There is the communication session between the equipment of mixed and interface
DE60038516T2 (en) Method and system for bandwidth reduction of multimedia conferences
US9210266B2 (en) System and method for web-based real time communication with optimized transcoding
US20050007965A1 (en) Conferencing system
US20080247523A1 (en) Multiple virtual telephones sharing a single physical address
US10165224B2 (en) Communication collaboration
US20080275701A1 (en) System and method for retrieving data based on topics of conversation
KR20080085841A (en) Method for converting between unicast sessions and a multicast session
US7505574B2 (en) Method and system for providing an improved communications channel for telephone conference initiation and management
JP5384349B2 (en) Method and apparatus for dynamic streaming storage configuration
US20070133773A1 (en) Composite services delivery
US9729395B2 (en) Collaborative conference experience improvement
US20040230659A1 (en) Systems and methods of media messaging
Koskelainen et al. A SIP-based conference control framework