CN102262624A - System and method for realizing cross-language communication based on multi-mode assistance - Google Patents
- Publication number: CN102262624A (application CN201110225342A)
- Authority: CN (China)
- Prior art keywords: user, chat, content, text, talk
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention provides a system and a method for realizing cross-language communication based on multi-modal assistance. The method uses the system's foreground interactive module, data management module and semantic association module: a natural language processing tool automatically extracts the central topic and keywords of a conversation by analyzing its content, and the semantic association module automatically searches for related images and video clips and presents them to both parties of the conversation in an appropriate way, so as to accelerate mutual understanding and communication. The images and videos used to assist understanding can be retrieved automatically from the network by a search method, or obtained directly from a pre-annotated multimedia library. Finally, the system generates a multi-modal conversation summary from the two parties' text chat messages and the corresponding image and video content.
Description
Technical field
The invention belongs to the fields of multimedia analysis and network communication, and relates to a method for realizing cross-language communication based on multi-modal assistance.
Background art
With the rapid development of communication and Internet technology, network instant messaging systems such as MSN and QQ have appeared that differ fundamentally from traditional communication modes such as mail, telephone and telegram. Traditional mail and telegram are based on text, and the telephone is based on voice, whereas instant messaging can use not only text and voice but also auxiliary multimedia means such as video and pictures. Through an instant messaging system, people separated by vast oceans can converse as if face to face; the whole earth has become a true global village.
For interlocutors who speak different languages, however, the language barrier remains difficult to overcome in instant messaging. In recent years machine translation technology has made rapid progress, so the language problem in communication between users of different languages has been solved to a certain extent. But machine translation still has two significant disadvantages. The first is accurate translation between different languages: machine translation can still only translate some simple dialogues automatically. Even between the two languages with the largest numbers of users in the world, English and Chinese, the accuracy of automatic translation still cannot fully satisfy the needs of daily use; when the world's many minority languages are considered, accurate automatic translation between languages remains a formidable problem. The second is word-sense polysemy, another challenging problem encountered in machine translation.
To enhance communication, a prior-art text-to-image synthesis system presents the main content of an input text in the form of pictures. It completes the text-to-picture conversion through three optimizations: maximizing the probability of keywords given the input text, maximizing the probability of corresponding pictures given the input text and the selected keywords, and maximizing the spatial layout of text and pictures given the selected keywords and corresponding pictures. However, this system has the following three shortcomings:
1) The processing speed of the system is slow. Because the system must compute the optimizations, the conversion from text to pictures is slowed down.
2) The interface of the system is unfriendly. The input text and the retrieved pictures are optimized together into a spatial layout before being presented to the user. If such a mixed text-picture layout were applied to a dialogue between users, it would certainly feel unfriendly to the user.
3) The system is hard to use. Because it is terminal software, the user must download it. A web-based system avoids this shortcoming.
Summary of the invention
The objective of the invention is to overcome the slow processing speed and poor usability of the prior art, and to let people who use different languages communicate smoothly online with the assistance of multi-modal information. By using multi-modal information such as images and video to reduce the ambiguity and polysemy produced in traditional automatic translation, and to assist semantic understanding of the users' conversation content, the invention provides a method for realizing cross-language communication based on multi-modal assistance.
To realize the described purpose, a first aspect of the invention provides a cross-language communication system based on multi-modal assistance, comprising a foreground interactive module, a data management module and a semantic association module, wherein:
the input end of the foreground interactive module accepts the text chat content input by the user and pre-processes it to obtain the text message of the user's chat, and the output end of the front/back-end interactive module of the foreground interactive module transmits the processed user text chat content; the chat page of the foreground interactive module displays, for the chatting users, the text content of both parties' dialogue and the multimedia pictures recommended by the system according to the conversation content;
the input end of the semantic association module is connected with the output end of the foreground interactive module; it receives and analyzes the users' text chat content, extracts the main content of the conversation with a natural language processing tool, obtains and outputs the multimedia information semantically associated with the text and the translated text, and generates a multi-modal summary from the text chat content, the translated content and the corresponding multimedia information;
the input end of the data management module is connected with the output end of the semantic association module; the data management module stores the newly input text chat content, the translated content and the corresponding multimedia information, integrates the historical user information with the new user information, and generates and displays the text content of both chat parties' dialogue and the multimedia picture information recommended by the system according to the conversation content.
In a preferred embodiment, after the back-end semantic association module receives the text message sent by the user, in order to help chat users of different languages understand the other party's meaning from the perspective of their own language, the semantic association module integrates the result of Google Translate, so that in addition to the original user chat message, a Google Translate translation of the chat content is attached.
In a preferred embodiment, the semantic association module extracts the main content of the conversation and, using this main content as keywords, retrieves corresponding candidate pictures from an image database by text-based image retrieval.
To realize the described purpose, a second aspect of the invention provides a method for realizing cross-language communication using the cross-language communication system based on multi-modal assistance. The method is based on the users' conversation: according to the result obtained by analyzing the conversation content with text-parsing technology, it provides the users with multimedia elements that assist semantic understanding between users with a language barrier or different cultural backgrounds. The method comprises the following steps:
Step S1: the user first sends the text content he wants to chat about through the foreground interface of the semantic chat; the foreground interface transmits the text message of the user's chat to the back-end semantic association module through a front/back-end interactive module built with Ajax; a topic-based cross-modal analysis method analyzes the users' conversation content, and a natural language processing tool automatically extracts the central topic and keywords of the dialogue;
Step S2: according to the central topic and keyword information of the dialogue, the semantic association module automatically retrieves relevant pictures and video clips from a database or the Internet by text-based image retrieval, and offers them to both parties according to the conversation topic;
Step S3: according to both parties' text chat information and the corresponding pictures and video clips, the system generates a multi-modal conversation summary, finally realizing smooth semantic communication between users of different languages in multimedia form; at the same time, the system can generate a multi-modal conversation summary for the two parties from their historical text chat information and the corresponding picture and video content.
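Steps S1 to S3 can be sketched as a minimal pipeline. Everything below is an illustrative assumption rather than the patent's actual implementation: the keyword extractor is a toy stop-word filter, the media index is a hand-built dictionary standing in for the pre-annotated multimedia library, and the summary is a plain data structure.

```python
# Hypothetical sketch of steps S1-S3: extract keywords, retrieve media, summarize.
STOP_WORDS = {"i", "a", "an", "the", "to", "would", "like", "please", "and"}

# Toy index standing in for the pre-annotated multimedia library (assumption).
MEDIA_INDEX = {
    "pizza": ["pizza_margherita.jpg", "pizza_menu.mp4"],
    "cola": ["cola_bottle.jpg"],
}

def extract_keywords(message):                     # step S1
    words = [w.strip(".,!?").lower() for w in message.split()]
    return [w for w in words if w and w not in STOP_WORDS]

def retrieve_media(keywords):                      # step S2
    media = []
    for kw in keywords:
        media.extend(MEDIA_INDEX.get(kw, []))
    return media

def build_summary(chat_log):                       # step S3
    summary = []
    for speaker, message in chat_log:
        kws = extract_keywords(message)
        summary.append({"speaker": speaker, "text": message,
                        "keywords": kws, "media": retrieve_media(kws)})
    return summary

chat = [("user", "I would like a pizza and a cola, please.")]
digest = build_summary(chat)
```

A real system would replace each stage with the components the patent describes: NLP keyword extraction, text-based image retrieval, and topic-driven summarization.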
In a preferred embodiment, the multi-modal conversation summary comprises text, audio, image and video information, providing the users with multimedia elements that assist semantic understanding between users with a language barrier or different cultural backgrounds.
In a preferred embodiment, the pictures and video clips are fetched automatically from the network by search, or obtained directly from a pre-annotated multimedia library.
In a preferred embodiment, the multi-modal conversation summary is a topic-based summary; the detected topic is obtained using a relation network and the statistical co-occurrence frequency, in a predefined corpus, of the words appearing in the most recent conversation.
Beneficial effects of the invention: the core of the invention is how to describe a text message with multimedia information (images or video). The proposed cross-language communication system based on multi-modal assistance provides a friendly and convenient environment for online instant messaging, with three principal features: first, friendliness — by assisting the understanding of text content with topic-related image or video search technology, the polysemy and ambiguity of translation are significantly reduced; second, interactivity — the system can better satisfy users' individual needs; third, ease of use — the proposed system can automatically generate a multimedia summary from the conversation record.
To assist communication and understanding between users, the system of the invention adopts a topic-based cross-modal analysis method. The system generates a multi-modal conversation summary from both parties' text chat information and the corresponding pictures and video content. Because this multi-modal summary contains rich, intuitive and understandable auxiliary information — images, video, text and so on — it effectively eliminates the ambiguity that appears in automatic translation of plain text, improves the efficiency and quality of communication, and realizes smooth semantic communication between users of different languages.
Description of drawings
Fig. 1 is the interface diagram of the cross-language communication system based on multi-modal assistance of the invention;
Fig. 2 is the structural block diagram of the cross-language communication system based on multi-modal assistance of the invention;
Fig. 3a and Fig. 3b give example results for ordering a pizza;
Fig. 4 is an example of a multimedia summary of conversation content.
Embodiment
To make the purpose, technical solutions and advantages of the invention clearer, the invention is described in more detail below in conjunction with specific embodiments and with reference to the accompanying drawings.
The invention proposes a cross-language communication system based on multi-modal assistance and a method for realizing cross-language communication. The method uses the foreground interactive module 1, the data management module 2 and the semantic association module 3: by analyzing the conversation content, a natural language processing tool automatically extracts the central topic and keywords of the dialogue, and according to the detected central topic and keyword information, the semantic association module 3 automatically searches for relevant pictures and video clips and offers them to both parties in an appropriate manner, thereby promoting mutual understanding and communication. The pictures and videos used to assist understanding can be fetched automatically from the network by a search method, or obtained directly from a pre-annotated multimedia library. Finally, the system generates a multi-modal conversation summary from both parties' text chat information and the corresponding picture and video content.
Fig. 1 shows the user interface of the multimedia chat system for cross-language communication proposed by the invention; it provides users of different languages with a friendly, interactive and timely communication environment. It mainly comprises three functions: text communication based on instant translation, picture or video retrieval based on the conversation topic, and a multimedia summary of the conversation content (illustrated in Fig. 4). The uppermost part of Fig. 1 mainly displays the name of the system and the topic of the users' chat. Below it is the main display area of the interface, showing the text dialogue and the multimedia auxiliary information, for conversations such as asking for directions, buying a car, booking a hotel, etc. The right-hand part of Fig. 1 is the text communication based on instant translation — the users' text chat area, which presents the users' basic text chat messages together with the corresponding Google Translate text. The left-hand part of Fig. 1 is the topic-based picture or video retrieval and the display area of the multimedia summary of the conversation content: multimedia information relevant to the users' conversation is presented to assist the users' semantic understanding.
Fig. 2 illustrates the structural block diagram of the cross-language communication system based on multi-modal assistance of the invention. The framework is divided into three components: the foreground interactive module 1, the data management module 2 and the semantic association module 3. The foreground design comprises two parts: the chat interface and the front/back-end interaction. The foreground interactive module 1 accepts the text chat content input by the user and pre-processes it to obtain the text message of the user's chat; the processed user text chat content is sent to the semantic association module 3 through the output end of the front/back-end interactive module of the foreground interactive module 1, and the chat page of the foreground interactive module 1 displays, for the chatting users, the text content of both parties' dialogue and the multimedia pictures recommended by the system according to the conversation content.
The input end of the semantic association module 3 is connected with the output end of the foreground interactive module 1; after receiving and analyzing the users' text chat content, it extracts the main content of the conversation with a natural language processing tool, obtains and outputs the multimedia information semantically associated with the text and the translated text, and generates a multi-modal summary from the text chat content, the translated content and the corresponding multimedia information; the semantic association module 3 then outputs the text chat content, the translated content and the corresponding multimedia information together to the data management module 2.
The input end of the data management module 2 is connected with the output end of the semantic association module 3. The data management module 2 stores the newly input text chat content, the translated content and the corresponding multimedia information, integrates the historical user information with the new user information, and generates and displays the text content of both chat parties' dialogue and the multimedia picture information recommended by the system according to the conversation content; everything is finally returned to the foreground interactive module 1, whose chat page displays all the information to the user. The workflow of each module is described in detail below.
The user first sends chat content to the foreground interactive module 1 through the chat interface. As shown in Fig. 1, the semantic chat interface is divided into two main parts: one displays the traditional text content of both parties' dialogue, and the other displays the list of multimedia pictures recommended by the system according to the conversation content. The foreground interface transmits the text message of the user's chat to the back end through the front/back-end interactive module built with Ajax. The back-end framework is divided into two parts: the data management module 2 and the semantic association module 3. After the back end receives the text message sent by the user, in order to help chat users of different languages understand the other party's meaning from the perspective of their own language, the semantic association module 3 integrates the result of Google Translate, so that in addition to the original user chat message, a Google Translate translation of the chat content is attached. The semantic association module 3 then extracts the main content of the conversation from the text message with a natural language processing tool, uses this main content as keywords, and retrieves corresponding candidate pictures from an image database by text-based image retrieval. Finally, the whole dialogue and the corresponding multimedia information are used to generate a multi-modal summary. The example of ordering a pizza illustrates the generated multimedia summary, as shown in Fig. 4. From this multi-modal summary it can be seen that, in the dialogue between the user and the pizza-shop clerk, the kind of pizza, the beverage and the payment method were chosen. With the pictures of the corresponding pizza shop's pizzas fed back by the chat system, the user can choose better according to his own wishes. The multi-modal summary also helps a user who wants the same pizza again in the future, since he can review the multimedia information it provides.
The semantic association mechanism in Fig. 2 is elaborated below. It is divided into three parts: text communication based on instant translation, topic-based picture and video retrieval, and finally the multi-modal summary generated from the users' text chat content and the corresponding multimedia information.
(1) Text communication based on instant translation
Like most instant messaging systems, the proposed system supports basic text communication. However, the two conversing parties may have different language settings. For example, when an English-speaking American chats online with a Chinese speaker, the American does not understand Chinese and the Chinese speaker does not understand English, so ordinary text chat cannot give both sides unobstructed communication. For this reason, the invention integrates a simple machine translation function: during the chat, the speaker's language is automatically translated into the recipient's language before display, so that both conversing parties can roughly understand each other's intention.
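The translate-then-display step can be sketched as follows. This is a minimal sketch under stated assumptions: the translator is a stub lookup table, whereas a real system would call a machine translation service (the patent names Google Translate) — that API call is not shown.

```python
# Sketch of attaching a machine translation to each chat message before display.
# The phrasebook below is a stub standing in for a real MT service (assumption).
PHRASEBOOK = {("zh", "en"): {"你好": "Hello", "谢谢": "Thank you"}}

def translate(text, src, dst):
    # Fall back to the original text when no translation is available.
    return PHRASEBOOK.get((src, dst), {}).get(text, text)

def deliver(message, sender_lang, receiver_lang):
    """Return the message as the receiver sees it: original plus translation."""
    return {"original": message,
            "translation": translate(message, sender_lang, receiver_lang)}

shown = deliver("你好", "zh", "en")
```

Displaying the original alongside the translation, as the patent's interface does, lets the receiver cross-check the machine output against the source text.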
(2) Topic-based picture and video retrieval
Although machine translation serves as a bridge, cross-language communication is still unsatisfactory. The main reason is that the accuracy of machine translation (the intelligibility of the target language) is still low; even translation between major languages, for example English and Chinese, has not reached a practical standard. In addition, because of polysemous words in everyday expressions and sentences, machine translation is hard pressed to satisfy real needs. As shown in Fig. 3a, food includes seafood, fruit and meat, and fruit includes banana, apple and orange; for example, the word "apple" can denote a kind of fruit (Fig. 3a) or the Apple company. To build an understandable, immersive online communication environment, we designed a topic-based picture/video retrieval sub-module to assist communication between users of different language backgrounds. Its three main functions are topic detection, picture retrieval and relevance feedback.
Topic detection is realized through two approaches. In the first, the user selects a topic from a predefined topic list; different topics are associated with different annotated picture/video databases (annotated manually or by learning). The second extracts topic keywords by text analysis. In a dialogue, many entity words expressing the conversation content can be extracted. From these entity words, we first build a semantic relation tree similar to WordNet, which depicts the semantic inheritance between words: as shown in Fig. 3a, the words "apple", "banana" and "orange" all belong to the fruit subclass of food, while, as shown in Fig. 3b, the word "apple" may also belong, together with "Dell" and "Lenovo", to the class of computer brands (Fig. 3b illustrates the Apple computer-brand example, comprising the Mac desktop, the iPad tablet and the iPhone smartphone). These semantic relations can be extracted from WordNet, or obtained from the statistical term frequency-inverse document frequency (TF-IDF) weights of words in a predefined corpus. Once keywords are extracted from the dialogue, the system can automatically infer the corresponding latent topic by analyzing the semantic relations between the keywords.
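The second approach — weighting entity words by TF-IDF and mapping them through a WordNet-like relation tree to a parent topic — can be sketched as follows. The corpus, the relation tree and the voting scheme are all invented for illustration; they stand in for WordNet and the patent's predefined corpus.

```python
import math
from collections import Counter

# Tiny predefined corpus (assumption) supplying the inverse-document-frequency term.
CORPUS = [
    "apple banana orange fruit market",
    "apple dell lenovo laptop price",
    "weather rain sunny forecast",
]

# Miniature WordNet-like relation tree: word -> parent topics (assumption).
SEMANTIC_TREE = {"apple": ["fruit", "computer brand"],
                 "banana": ["fruit"], "orange": ["fruit"],
                 "dell": ["computer brand"], "lenovo": ["computer brand"]}

def tf_idf(word, doc_words):
    tf = doc_words.count(word) / len(doc_words)
    df = sum(1 for d in CORPUS if word in d.split())
    idf = math.log(len(CORPUS) / (1 + df)) + 1.0   # smoothed IDF
    return tf * idf

def infer_topic(dialogue):
    words = dialogue.lower().split()
    # Each known word votes for its parent topics, weighted by its TF-IDF.
    votes = Counter()
    for w in set(words):
        for topic in SEMANTIC_TREE.get(w, []):
            votes[topic] += tf_idf(w, words)
    return votes.most_common(1)[0][0] if votes else None

topic = infer_topic("I bought an apple and a banana at the market")
```

Here the polysemous word "apple" alone would leave the two topics tied; the co-occurring word "banana" breaks the tie in favor of "fruit", which mirrors the disambiguation argument in the text.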
According to the topic extracted from the dialogue, the system automatically retrieves the corresponding picture information from the network or the background database. With text-based retrieval, we can easily find relevant annotated pictures according to the conversation topic. However, most network pictures are unannotated, so we use the annotated pictures retrieved by text as a training set, learn a topic model, and use this topic model to retrieve the large body of unannotated pictures. Topic-based picture retrieval therefore first builds a topic model, whose goal is to discover the latent (implicit) semantic space automatically so as to model the document information in retrieval more accurately. Here the semantic structure of a document comprises some latent concepts or topics (often stable and distinctive co-occurrence patterns among synonymous words). Through a weighted combination of latent topics, a document can be expressed as a series of latent topics, and the combination coefficients can be regarded as a feature representation of the document. This representation has a series of advantages: first, the semantic space usually has lower dimensionality than the word space, which not only saves storage space but also permits fast retrieval; second, the conversion from word space to semantic space not only reduces the noise in the word vector but can also resolve the above-mentioned ambiguity problems, thereby improving retrieval performance. For example, the word "apple" can denote a kind of fruit or a computer brand (Fig. 3b); its exact meaning can be inferred from the other keywords relevant to the same topic.
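Projecting both the query and the candidate pictures into a low-dimensional topic space, then ranking by cosine similarity there, can be sketched as follows. The word-to-topic loadings are hand-built assumptions standing in for a learned topic model, and the picture names and tags are likewise invented; in the patent the model is learned from annotated training pictures.

```python
import math

# Hand-built word -> topic loadings standing in for a learned topic model
# (assumption; a real system learns these from annotated pictures).
TOPICS = ("fruit", "computers")
LOADINGS = {"apple": (0.5, 0.5), "banana": (1.0, 0.0), "orange": (1.0, 0.0),
            "ipad": (0.0, 1.0), "mac": (0.0, 1.0)}

def to_topic_vector(words):
    """Project a bag of words into the 2-D latent topic space."""
    vec = [0.0] * len(TOPICS)
    for w in words:
        for i, weight in enumerate(LOADINGS.get(w, (0.0,) * len(TOPICS))):
            vec[i] += weight
    return vec

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Annotated candidate pictures (names and tags are assumptions).
PICTURES = {"apple_fruit.jpg": ["apple", "banana"],
            "apple_mac.jpg": ["apple", "mac", "ipad"]}

def rank_pictures(query_words):
    q = to_topic_vector(query_words)
    scored = [(cosine(q, to_topic_vector(tags)), name)
              for name, tags in PICTURES.items()]
    return [name for _, name in sorted(scored, reverse=True)]

best = rank_pictures(["apple", "orange"])[0]
```

Note how the ambiguous query word "apple" is resolved by its companion "orange": in the 2-D topic space the query lands near the fruit axis, so the fruit picture outranks the computer picture.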
Feedback, as a popular human-computer interaction technology, is widely used in the analysis of text and visual information. Through the user's feedback on the system's output, the system can revise itself adaptively; supervision information obtained from user feedback has proved effective in practice. In our system, the user can select the correct topic from the candidate list produced by the automatic topic extraction algorithm; the selected topic is then used to model the temporal relation between the current topic and the next one for subsequent topic extraction. In image retrieval, our system lists the retrieved sample pictures and invites the user to score the relevant pictures according to the conversation topic.
(3) Multi-modal summary
Traditional instant messaging usually saves the chat record as plain text. In our system, the user can express the speaker's intention in multi-modal forms such as pictures, video and text. By preserving chat messages in a multi-modal form rather than as plain text, a more vivid record than before is obtained.
Summarization of text, pictures and video is a research focus in natural language processing and multimedia. A summary briefly expresses the original text (picture or video) information through a more concise text (picture or video). Most current techniques construct the summary from salience features, repetition, or keyword (key-frame) information. In our system, considering that besides text there are also a large number of pictures and videos, we adopt a topic-driven summarization method that analyzes the conversation content between users and generates summary information about a specific topic; this summary information comprises the related text, picture and video content concerning that topic.
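Topic-driven multi-modal summarization — keeping only the conversation turns, and their attached media, whose keywords touch the requested topic — can be sketched as follows. The chat log, keyword sets, topic vocabulary and media file names are all invented for illustration.

```python
# Sketch of topic-driven multi-modal summarization: keep only the turns (and
# their attached media) whose keywords overlap the requested topic's vocabulary.
chat_log = [
    {"text": "One margherita pizza please", "keywords": {"pizza"},
     "media": ["pizza_margherita.jpg"]},
    {"text": "Nice weather today", "keywords": {"weather"}, "media": []},
    {"text": "Pay by card", "keywords": {"pizza", "payment"},
     "media": ["card_reader.jpg"]},
]

# Topic vocabulary (assumption; the patent derives topics from the dialogue).
TOPIC_KEYWORDS = {"ordering pizza": {"pizza", "payment", "beverage"}}

def summarize(log, topic):
    wanted = TOPIC_KEYWORDS[topic]
    turns = [t for t in log if t["keywords"] & wanted]   # set intersection
    return {"topic": topic,
            "text": [t["text"] for t in turns],
            "media": [m for t in turns for m in t["media"]]}

digest = summarize(chat_log, "ordering pizza")
```

The off-topic small talk ("Nice weather today") is dropped, while the on-topic turns and their pictures survive, matching the pizza-ordering summary of Fig. 4.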
The above are only embodiments of the invention, but the protection scope of the invention is not limited thereto; any conversion or replacement that a person skilled in the art can conceive within the technical scope disclosed by the invention shall be encompassed within the protection scope of the claims of the invention.
Claims (7)
1. A cross-language communication system based on multi-modal assistance, characterized in that the system comprises: a foreground interactive module, a data management module and a semantic association module, wherein:
The input end of the foreground interactive module accepts the text chat content entered by the user and pre-processes the user's chat content to obtain the text information of the user's chat, and the processed user text chat content is transmitted through the output end of the front-end/back-end interactive module of the foreground interactive module; the chat page of the foreground interactive module displays, for the chatting users, the text content of both parties' dialogue and the multimedia pictures recommended by the system according to the content of the conversation;
The input end of the semantic association module is connected to the output end of the foreground interactive module; it receives and analyzes the user's text chat content, extracts the main content of the conversation using natural language processing tools, obtains and outputs the translated text and the multimedia information associated with the text, and generates a multi-modal summary from the text chat content, the translated content and the corresponding multimedia information;
The input end of the data management module is connected to the output end of the semantic association module; the data management module stores the newly input text chat content, the translated content and the corresponding multimedia information, and at the same time integrates historical user information with new user information, generating and displaying the text content of all chat parties' dialogues and the multimedia picture information recommended by the system according to the content of the conversation.
2. The cross-language communication system based on multi-modal assistance according to claim 1, characterized in that, after the back-end semantic association module receives the text information sent by the user, in order to help chat users of different languages understand the other party's meaning from the perspective of their own language, the semantic association module integrates the results of Google Translate, so that in addition to the original user's chat message, a translation of the chat content produced by Google Translate is attached.
3. The cross-language communication system based on multi-modal assistance according to claim 1, characterized in that the semantic association module extracts the main content of both parties' conversation, uses this main content as keywords, and adopts text-based image retrieval to retrieve the corresponding candidate pictures from an image database.
4. A method for realizing cross-language communication using the cross-language communication system based on multi-modal assistance according to claim 1, characterized in that the method is based on the users' conversational chat and, according to the results of analyzing the talk content with text-parsing techniques, provides multimedia elements for the users to assist semantic understanding between users who have language barriers or differing cultural backgrounds; the method comprises the following steps:
Step S1: the user first sends, through the foreground interface of the semantic chat, the text content he wants to discuss with the other party; the text information of the user's chat is transmitted to the back-end semantic association module through the front-end/back-end interactive module built with Ajax in the foreground interface; a topic-based cross-modal analysis method is adopted to analyze the user's conversation content, and natural language processing tools are used to automatically extract the central topic and keywords of the dialogue;
Step S2: according to the central topic and keyword information of the dialogue, the semantic association module automatically retrieves relevant pictures and video clips from a database or the Internet by text-based image retrieval according to the conversation topic, and offers them to both parties of the conversation;
Step S3: according to the text chat information of both parties and the corresponding picture and video clip content, the system generates a multi-modal talk summary, finally realizing smooth semantic communication in multimedia form between users of different languages; at the same time, the system can generate a multi-modal talk summary for both parties according to their historical text chat information and the corresponding picture and video content.
5. The method for realizing cross-language communication according to claim 4, characterized in that the multi-modal talk summary comprises text, audio, image and video information, providing multimedia elements for the users to assist semantic understanding between users who have language barriers or differing cultural backgrounds.
6. The method for realizing cross-language communication according to claim 4, characterized in that the picture and video clip content is automatically downloaded from the network by searching, or obtained directly from a pre-annotated multimedia gallery.
7. The method for realizing cross-language communication according to claim 4, characterized in that the multi-modal talk summary is a topic-based summary, in which topics are detected using the relational network of the preceding conversation and the statistical co-occurrence frequency of words in a predefined corpus.
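The co-occurrence-based topic detection of claim 7 can be sketched as follows. This is only an illustrative sketch: the sentence-level co-occurrence window, the additive scoring, and the toy corpus are assumptions, not the claimed method.

```python
from collections import Counter
from itertools import combinations

def detect_topic(utterance_words, corpus_sentences):
    """Score each word of the utterance by how often it co-occurs (within a
    sentence) with the utterance's other words in a predefined corpus, and
    return the highest-scoring word as the detected topic."""
    cooc = Counter()
    for sent in corpus_sentences:
        # count each unordered word pair once per sentence
        for a, b in combinations(sorted(set(sent)), 2):
            cooc[(a, b)] += 1

    def pair(a, b):
        return cooc[tuple(sorted((a, b)))]

    scores = {w: sum(pair(w, o) for o in utterance_words if o != w)
              for w in utterance_words}
    return max(scores, key=scores.get)

corpus = [["pizza", "cheese"], ["pizza", "cheese"], ["pizza", "tomato"]]
print(detect_topic(["pizza", "cheese", "tomato"], corpus))  # "pizza"
```

Here "pizza" co-occurs with both other words in the corpus, so it accumulates the highest score and is chosen as the topic of the utterance.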
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110225342XA CN102262624A (en) | 2011-08-08 | 2011-08-08 | System and method for realizing cross-language communication based on multi-mode assistance |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102262624A true CN102262624A (en) | 2011-11-30 |
Family
ID=45009255
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201110225342XA Pending CN102262624A (en) | 2011-08-08 | 2011-08-08 | System and method for realizing cross-language communication based on multi-mode assistance |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102262624A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101834809A (en) * | 2010-05-18 | 2010-09-15 | 华中科技大学 | Internet instant message communication system |
CN101251855B (en) * | 2008-03-27 | 2010-12-22 | 腾讯科技(深圳)有限公司 | Equipment, system and method for cleaning internet web page |
US20110153752A1 (en) * | 2009-12-21 | 2011-06-23 | International Business Machines Corporation | Processing of Email Based on Semantic Relationship of Sender to Recipient |
Non-Patent Citations (1)
Title |
---|
XINMING ZHANG, ET AL: "A visualized Communication System Using Cross-Media Semantic Association", 《17TH INTERNATIONAL MULTIMEDIA MODELING CONFERENCE》, 7 January 2011 (2011-01-07), pages 88 - 98, XP019159534 * |
Cited By (43)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9645987B2 (en) | 2011-12-02 | 2017-05-09 | Hewlett Packard Enterprise Development Lp | Topic extraction and video association |
WO2013080214A1 (en) * | 2011-12-02 | 2013-06-06 | Hewlett-Packard Development Company, L.P. | Topic extraction and video association |
CN102567509B (en) * | 2011-12-26 | 2014-08-27 | 中国科学院自动化研究所 | Method and system for instant messaging with visual messaging assistance |
CN102567509A (en) * | 2011-12-26 | 2012-07-11 | 中国科学院自动化研究所 | Method and system for instant messaging with visual messaging assistance |
CN102750366A (en) * | 2012-06-18 | 2012-10-24 | 海信集团有限公司 | Video search system and method based on natural interactive import and video search server |
CN104679733A (en) * | 2013-11-26 | 2015-06-03 | 中国移动通信集团公司 | Voice conversation translation method, device and system |
US11790156B2 (en) | 2014-07-25 | 2023-10-17 | Samsung Electronics Co., Ltd. | Text editing method and electronic device supporting same |
US10878180B2 (en) | 2014-07-25 | 2020-12-29 | Samsung Electronics Co., Ltd | Text editing method and electronic device supporting same |
CN105335343A (en) * | 2014-07-25 | 2016-02-17 | 北京三星通信技术研究有限公司 | Text editing method and apparatus |
CN104536570A (en) * | 2014-12-29 | 2015-04-22 | 广东小天才科技有限公司 | Information processing method and device of intelligent watch |
US10628524B2 (en) | 2015-03-24 | 2020-04-21 | Beijing Sogou Technology Development Co., Ltd. | Information input method and device |
WO2016150083A1 (en) * | 2015-03-24 | 2016-09-29 | 北京搜狗科技发展有限公司 | Information input method and apparatus |
CN105260396B (en) * | 2015-09-16 | 2019-09-03 | 百度在线网络技术(北京)有限公司 | Word retrieval method and device |
CN105260396A (en) * | 2015-09-16 | 2016-01-20 | 百度在线网络技术(北京)有限公司 | Word retrieval method and apparatus |
CN108027812A (en) * | 2015-09-18 | 2018-05-11 | 迈克菲有限责任公司 | System and method for multipath language translation |
CN108369585A (en) * | 2015-11-30 | 2018-08-03 | 三星电子株式会社 | Method for providing translation service and its electronic device |
CN108369585B (en) * | 2015-11-30 | 2022-07-08 | 三星电子株式会社 | Method for providing translation service and electronic device thereof |
WO2016197767A3 (en) * | 2016-02-16 | 2017-02-02 | 中兴通讯股份有限公司 | Method and device for inputting expression, terminal, and computer readable storage medium |
CN105898627A (en) * | 2016-05-31 | 2016-08-24 | 北京奇艺世纪科技有限公司 | Video playing method and device |
CN105898627B (en) * | 2016-05-31 | 2019-04-12 | 北京奇艺世纪科技有限公司 | A kind of video broadcasting method and device |
CN106295565A (en) * | 2016-08-10 | 2017-01-04 | 中用环保科技有限公司 | Monitor event identifications based on big data and in real time method of crime prediction |
CN113055275B (en) * | 2016-08-30 | 2022-08-02 | 谷歌有限责任公司 | Conditional disclosure of individually controlled content in a group context |
CN113055275A (en) * | 2016-08-30 | 2021-06-29 | 谷歌有限责任公司 | Conditional disclosure of individually controlled content in a group context |
CN107798386A (en) * | 2016-09-01 | 2018-03-13 | 微软技术许可有限责任公司 | More process synergics training based on unlabeled data |
CN106682967A (en) * | 2017-01-05 | 2017-05-17 | 胡开标 | Online translation and chat system |
CN107480766A (en) * | 2017-07-18 | 2017-12-15 | 北京光年无限科技有限公司 | The method and system of the content generation of multi-modal virtual robot |
WO2019109664A1 (en) * | 2017-12-08 | 2019-06-13 | 北京搜狗科技发展有限公司 | Cross-language search method and apparatus, and apparatus for cross-language search |
CN108255939A (en) * | 2017-12-08 | 2018-07-06 | 北京搜狗科技发展有限公司 | A kind of cross-language search method and apparatus, a kind of device for cross-language search |
CN108255939B (en) * | 2017-12-08 | 2020-02-14 | 北京搜狗科技发展有限公司 | Cross-language search method and device for cross-language search |
CN108173747B (en) * | 2017-12-27 | 2021-10-22 | 上海传英信息技术有限公司 | Information interaction method and device |
CN108173747A (en) * | 2017-12-27 | 2018-06-15 | 上海传英信息技术有限公司 | Information interacting method and device |
CN108874787A (en) * | 2018-06-12 | 2018-11-23 | 深圳市合言信息科技有限公司 | A method of analysis speech intention simultaneously carries out depth translation explanation |
CN109255130A (en) * | 2018-07-17 | 2019-01-22 | 北京赛思美科技术有限公司 | A kind of method, system and the equipment of language translation and study based on artificial intelligence |
CN109817351A (en) * | 2019-01-31 | 2019-05-28 | 百度在线网络技术(北京)有限公司 | A kind of information recommendation method, device, equipment and storage medium |
CN110209772B (en) * | 2019-06-17 | 2021-10-08 | 科大讯飞股份有限公司 | Text processing method, device and equipment and readable storage medium |
CN110209772A (en) * | 2019-06-17 | 2019-09-06 | 科大讯飞股份有限公司 | A kind of text handling method, device, equipment and readable storage medium storing program for executing |
CN112307156A (en) * | 2019-07-26 | 2021-02-02 | 北京宝捷拿科技发展有限公司 | Cross-language intelligent auxiliary side inspection method and system |
CN110706771A (en) * | 2019-10-10 | 2020-01-17 | 复旦大学附属中山医院 | Method and device for generating multi-mode education content, server and storage medium |
WO2021233112A1 (en) * | 2020-05-20 | 2021-11-25 | 腾讯科技(深圳)有限公司 | Multimodal machine learning-based translation method, device, equipment, and storage medium |
CN111651674A (en) * | 2020-06-03 | 2020-09-11 | 北京妙医佳健康科技集团有限公司 | Bidirectional searching method and device and electronic equipment |
CN111651674B (en) * | 2020-06-03 | 2023-08-25 | 北京妙医佳健康科技集团有限公司 | Bidirectional searching method and device and electronic equipment |
CN113656613A (en) * | 2021-08-20 | 2021-11-16 | 北京百度网讯科技有限公司 | Method for training image-text retrieval model, multi-mode image retrieval method and device |
CN114663246A (en) * | 2022-05-24 | 2022-06-24 | 中国电子科技集团公司第三十研究所 | Representation modeling method of information product in propagation simulation and multi-agent simulation method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102262624A (en) | System and method for realizing cross-language communication based on multi-mode assistance | |
CN106777018B (en) | Method and device for optimizing input sentences in intelligent chat robot | |
US11580350B2 (en) | Systems and methods for an emotionally intelligent chat bot | |
WO2016197767A2 (en) | Method and device for inputting expression, terminal, and computer readable storage medium | |
CN102567509B (en) | Method and system for instant messaging with visual messaging assistance | |
US20100100371A1 (en) | Method, System, and Apparatus for Message Generation | |
EP3423956A1 (en) | Interpreting and resolving conditional natural language queries | |
JP2023535709A (en) | Language expression model system, pre-training method, device, device and medium | |
Sardinha | 25 years later | |
CN105808695A (en) | Method and device for obtaining chat reply contents | |
CN111241237A (en) | Intelligent question and answer data processing method and device based on operation and maintenance service | |
KR20110115543A (en) | Method for calculating entity similarities | |
CN107491477A (en) | A kind of emoticon searching method and device | |
CN117056471A (en) | Knowledge base construction method and question-answer dialogue method and system based on generation type large language model | |
US10592609B1 (en) | Human emotion detection | |
CN104735480A (en) | Information sending method and system between mobile terminal and television | |
KR20210002619A (en) | Creation of domain-specific models in network systems | |
CN106202200B (en) | A kind of emotion tendentiousness of text classification method based on fixed theme | |
CN115357755B (en) | Video generation method, video display method and device | |
EP3762876A1 (en) | Intelligent knowledge-learning and question-answering | |
CN106156262A (en) | A kind of search information processing method and system | |
KR20110090675A (en) | System and method for generating sign language animation | |
CN114064943A (en) | Conference management method, conference management device, storage medium and electronic equipment | |
Knight | A multi-modal corpus approach to the analysis of backchanneling behaviour | |
US11929100B2 (en) | Video generation method, apparatus, electronic device, storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20111130 |