CN101888345A - Method for implementing on-line user search through instant messenger - Google Patents


Publication number
CN101888345A
Authority
CN
China
Prior art keywords
user
msn
search
web
search engine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2009100510963A
Other languages
Chinese (zh)
Inventor
王雨豪
王成彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI JILUE NETWORKS INFORMATION TECHNOLOGY Co Ltd
Original Assignee
SHANGHAI JILUE NETWORKS INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI JILUE NETWORKS INFORMATION TECHNOLOGY Co Ltd filed Critical SHANGHAI JILUE NETWORKS INFORMATION TECHNOLOGY Co Ltd
Priority to CN2009100510963A priority Critical patent/CN101888345A/en
Publication of CN101888345A publication Critical patent/CN101888345A/en
Pending legal-status Critical Current

Abstract

The invention provides a method for implementing online user search through an instant messenger. It uses an MSN base communication module, a multi-robot account management module, a robot session communication module, a service logic module and a background database service module. Users of the system are divided into MSN users and web users (XMPP). Sessions between users and robots are established online in a distributed cluster mode, and the service logic module sends the session results to the background database service module. The method enables intercommunication between the web and IM tools, multi-channel user communication, interaction across MSN and the web, and online user search; it also supports instant search from both MSN and the web, so that MSN and web users can chat with each other and search for online users.

Description

Method for implementing online user search through an instant messaging tool
Technical field
The present invention relates to a method for implementing online user search through an instant messaging tool, and in particular to a method of online user search that enables intercommunication between the web and IM tools and provides multi-channel user communication.
Background art
XMPP (Extensible Messaging and Presence Protocol) is one of the four mainstream IM (instant messaging) protocols at present. The other three are the Instant Messaging and Presence Protocol (IMPP), the Presence and Instant Messaging Protocol (PRIM), and SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE).
Of these four protocols, XMPP is the most flexible. XMPP is based on XML and inherits XML's flexibility and extensibility, so applications built on XMPP are highly extensible. An extended XMPP can handle user requirements by sending extended information, and applications such as content publishing systems and address-based services can be built on top of XMPP. XMPP also includes the server-side software protocol that lets one server talk to another, which makes it easier for developers to build client applications or add functions to an existing system.
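To make the XML-based extensibility concrete, the following minimal Python sketch builds an XMPP-style `<message/>` stanza and attaches one extension element in its own namespace. The addresses, the `urn:example:search` namespace and the `query` element are hypothetical names chosen for illustration; they do not come from the patent or from any XMPP extension.

```python
import xml.etree.ElementTree as ET


def build_message(sender, recipient, body, extension=None):
    """Build a minimal XMPP-style <message/> stanza and return it as a string."""
    msg = ET.Element("message", {"from": sender, "to": recipient, "type": "chat"})
    ET.SubElement(msg, "body").text = body
    if extension is not None:
        # An extension is carried as a child element with its own XML namespace.
        namespace, name, payload = extension
        child = ET.SubElement(msg, name, {"xmlns": namespace})
        child.text = payload
    return ET.tostring(msg, encoding="unicode")


# Hypothetical addresses and extension, for illustration only.
stanza = build_message(
    "web-user@example.org",
    "robot@example.org",
    "find online users",
    extension=("urn:example:search", "query", "online"),
)
print(stanza)
```

A receiver that does not know the extension namespace can still read the plain `<body/>`, which is what lets extended and unextended clients interoperate.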
What does IM mean? IM is the abbreviation of Instant Messaging: tools that allow real-time online communication, commonly called online chat tools. QQ, MSN, Sina UC, TQ and the like are all IM chat software commonly used by Internet users. Instant messaging became popular as early as 1996, when the best-known instant messenger was ICQ. ICQ was first developed by three Israelis and was acquired by America Online in 1998; it remains one of the most popular instant messengers. By the end of 2003, the number of ICQ users worldwide had surpassed 1,500,000,000, more than 60% of them outside the United States.
IM products are aimed either at individuals or at enterprises. Personal chat tools are currently the most common, and most are free services. Examples include … Messenger, MSN Messenger and AOL Instant Messenger (AIM), as well as QQ and Sina UC, instant messaging tools provided by fast-rising domestic websites. Real-time communication between the online users of a single website is also a concrete form of instant messaging.
A search engine is a system that collects information from the Internet with dedicated computer programs according to a certain strategy, organizes and processes that information, and then provides a retrieval service to users. From the user's point of view, a search engine offers a page with a search box: the user types words into the box and submits them through the browser, and the search engine returns a list of information related to the input. Put figuratively, the information on the Internet is vast and unordered. Each piece of information is like an island in a wide sea, web links are the bridges that criss-cross between the islands, and the search engine draws a clear map of that information for you to consult at any time. In the early days of the Internet there were relatively few websites and finding information was fairly easy. With the explosive growth of the Internet, however, ordinary users looking for the material they need are searching for a needle in a haystack, and professional search websites emerged to meet the public's demand for information retrieval.
The ancestor of the modern search engine is Archie, invented in 1990 by Alan Emtage, then a student at McGill University in Montreal. The World Wide Web did not yet exist, but file transfer over the network was already frequent. Because large numbers of files were scattered across separate FTP hosts and were inconvenient to query, Alan Emtage thought of developing a system that could find files by file name, and Archie was the result.
Archie works much like today's search engines: it relies on scripts to collect files on the network automatically, indexes the relevant information, and lets users query it with certain expressions. Inspired by Archie's popularity, the University of Nevada System Computing Services developed another, very similar search tool in 1993; by this time search tools could retrieve web pages as well as index files.
At that time the word "robot" was fashionable among programmers. A computer robot is a software program that can carry out a task without interruption at a speed humans cannot match. Because the robot programs dedicated to retrieving information crawl across the network like spiders, the robot programs of search engines came to be called "spider" programs.
The world's first robot program for monitoring the growth of the Internet was the World Wide Web Wanderer, developed by Matthew Gray. At first it only counted the servers on the Internet; later it developed the ability to retrieve website domain names.
In contrast to the Wanderer, Martijn Koster created ALIWEB in October 1993 as the HTTP version of Archie. ALIWEB does not use a robot program; it builds its link index from information that websites submit on their own initiative, much like the Yahoo we know today.
With the rapid growth of the Internet, retrieving every newly appearing web page became harder and harder, so some programmers, building on Matthew Gray's Wanderer, made improvements to the working principle of the traditional spider program. Their idea was that since every web page contains links to other websites, following the links starting from one website might make it possible to retrieve the whole Internet. By the end of 1993, search engines based on this principle began to appear one after another, the best known being JumpStation, the World Wide Web Worm (the predecessor of Goto, today's Overture) and the Repository-Based Software Engineering (RBSE) spider.
However, JumpStation and the WWW Worm simply listed search results in the order in which they found matching information in the database, so relevance played no part. RBSE was the first engine to introduce the notion of keyword string matching degree into the ordering of search results.
Search engines in the modern sense first appeared in July 1994, when Michael Mauldin connected John Leavitt's spider program to his indexing program and created the now well-known Lycos. In April of the same year, two Stanford University doctoral students, David Filo and the Chinese American Jerry Yang, jointly founded the super directory index Yahoo and made the concept of the search engine widely known. From then on search engines entered a period of rapid development. Today there are hundreds of search engines worthy of the name on the Internet, and the amount of information they retrieve is far beyond that of the past. Google, for example, which has recently attracted much attention, holds 3,000,000,000 web pages in its database.
With the rapid expansion of the Internet, a single search engine working alone can no longer keep up with the market, so a division of labor among search engines has emerged, along with specialized providers of search engine technology and search databases. Inktomi (acquired by Yahoo), for example, is not itself a search engine facing users directly; it provides full-text web search services to other search engines, including Overture (formerly GoTo, acquired by Yahoo), LookSmart, MSN and HotBot. Baidu in China belongs to the same class: Sohu and Sina both use its technology. In this sense, they are search engines for search engines.
By working method, search engines fall into three main types: the full-text search engine (Full Text Search Engine), the directory index (Search Index/Directory) and the meta search engine (Meta Search Engine).
■ Full-text search engine
Full-text search engines are search engines in the true sense. Representative examples abroad include Google, Fast/AllTheWeb, AltaVista, Inktomi, Teoma and WiseNut; the best known in China is Baidu. They all build a database from information extracted from websites across the Internet (mainly web page text), retrieve the records that match the user's query, and return the results to the user in a certain order.
By the source of their search results, full-text search engines can be subdivided into two kinds. One kind has its own retrieval program (indexer), commonly called a "spider" or "robot" program, and builds its own web database, from which search results are drawn directly; the seven engines mentioned above are of this kind. The other kind rents the database of another engine and arranges the search results in its own format; the Lycos engine is an example.
■ Directory index
Although a directory index has a search function, it is not a search engine in the strict sense; it is only a list of website links arranged by category. Users need not perform keyword queries at all: the classified directory alone is enough to find the information they need. The most representative directory index is the famous Yahoo. Other well-known ones include the Open Directory Project (DMOZ), LookSmart and About. In China, the Sohu, Sina and NetEase searches also belong to this class.
■ Meta search engine
When a meta search engine receives a user query, it searches several other engines at the same time and returns the results to the user. Well-known meta search engines include InfoSpace, Dogpile and Vivisimo; a representative Chinese meta search engine is the Souxing (搜星) search engine. As for arranging results, some list them directly by source engine, as Dogpile does, and others recombine them according to their own rules, as Vivisimo does.
Besides the three main types above, there are several non-mainstream forms:
1, Aggregating search engines, such as the engine HotBot released at the end of 2002. This kind resembles a meta search engine, except that it does not call several engines at once; the user chooses one of the 4 engines offered. It is therefore more accurate to call it an "aggregating" search engine.
2, Portal search engines, such as AOL Search and MSN Search. They offer a search service but have neither a classified directory nor a web database of their own; their search results come entirely from other engines.
3, Free-for-all links (FFA): websites of this kind generally just list link entries in a simple scrolling arrangement. A few have a simple classified directory, but on a much smaller scale than directory indexes such as Yahoo.
Because all of the above websites provide search and query services to users, for convenience we usually refer to all of them as search engines.
■ Full-text search engine
In the classification above we mentioned that a full-text search engine builds its web database by extracting information from websites. Search engines collect information automatically in two ways. The first is periodic crawling: at regular intervals (for Google, generally 28 days) the search engine sends out its spider program to retrieve the Internet sites within a certain range of IP addresses, and whenever it finds a new website it automatically extracts the site's information and URL and adds them to its database.
The second is site submission: the website owner submits the URL to the search engine, which within a certain time (from 2 days to several months) sends its spider program to that website, scans it and stores the relevant information in the database for users to query. Because search engine indexing rules have changed greatly in recent years, submitting a URL no longer guarantees that a website will enter the search engine's database. The best approach at present is to obtain more external links, so that search engines have more opportunities to find the site and include it automatically.
When a user looks up information by keyword, the search engine searches its database. If it finds websites that match the user's request, it applies a special algorithm (usually based on how well the keywords match the page, the position and frequency of their occurrence, link quality and so on) to compute the relevance and ranking of each page, and then returns the links to the user in descending order of relevance.
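The ranking step can be illustrated with a deliberately small Python sketch. It scores a page only on keyword frequency and on the position of the first occurrence; link quality and every other signal are left out. The scoring formula, the function names and the sample pages are assumptions made for illustration, not the algorithm of any real search engine.

```python
def score(page_text, keyword):
    """Toy relevance score: keyword frequency plus a bonus for early occurrence."""
    words = page_text.lower().split()
    target = keyword.lower()
    positions = [i for i, word in enumerate(words) if word == target]
    if not positions:
        return 0.0
    frequency = len(positions) / len(words)
    # The earlier the first occurrence, the larger the bonus (1.0 at position 0).
    position_bonus = 1.0 / (1 + positions[0])
    return frequency + position_bonus


def rank(pages, keyword):
    """Return the URLs of matching pages, most relevant first."""
    scored = [(score(text, keyword), url) for url, text in pages.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for value, url in scored if value > 0]


# Hypothetical pages, for illustration only.
pages = {
    "a": "search engine basics and search tips",
    "b": "an introduction to web search",
    "c": "cooking recipes",
}
ranking = rank(pages, "search")
print(ranking)
```

Pages that never mention the keyword are dropped, and the remaining links are returned in descending order of score, mirroring the description above.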
■ Directory index
Compared with a full-text search engine, a directory index differs in many ways.
First, a search engine retrieves websites automatically, whereas a directory index depends entirely on manual work. After a user submits a website, a directory editor visits it in person and decides whether to accept it according to a set of in-house criteria, or even the editor's subjective impression.
Second, when a search engine includes websites, a site is generally accepted as long as it does not break the relevant rules. A directory index sets much higher requirements, and sometimes even repeated submissions do not succeed. With super indexes in particular, getting listed is especially hard. (Because the difficulty there is the greatest and it is hotly contested ground for merchants' online marketing, the techniques for getting listed in Yahoo are introduced separately later.)
In addition, when submitting to a search engine we generally need not consider how the website is classified, whereas when submitting to a directory index the website must be placed in a single, most suitable directory.
Finally, the information about each website in a search engine is extracted automatically from the site's own pages, so from the user's point of view we have more autonomy. A directory index requires the site information to be filled in by hand and imposes various restrictions. What is more, if the staff consider the directory or the site information you submitted unsuitable, they can adjust it at any time, naturally without consulting you first.
A directory index, as the name suggests, stores websites by category in the corresponding directories, so when looking for information users can either search by keyword or browse down through the classified directory. A keyword search returns results as a search engine does, with websites arranged by relevance, although human factors play a larger part. When browsing level by level, the rank of a website within a directory is determined by the alphabetical order of its title (with exceptions).
At present, search engines and directory indexes are tending to merge and interpenetrate. Formerly pure full-text search engines now also offer directory search; Google, for example, uses the Open Directory to provide classified queries. Long-established directory indexes, in turn, widen their search scope by cooperating with search engines such as Google. In the default search mode, some directory-type search engines first return the matching websites in their own directory, as the domestic Sohu, Sina and NetEase do; others default to web page search, as Yahoo does.
Chinese word segmentation and search engines
As is well known, English takes the word as its unit, and words are separated by spaces, whereas Chinese takes the character as its unit, and the characters of a sentence must be joined together to express a meaning. For example, the English sentence I am a student is, in Chinese, "我是一个学生". A computer can tell very easily from the spaces that student is one word, but it cannot easily see that the two characters "学" and "生" together form a single word. Cutting a sequence of Chinese characters into meaningful words is Chinese word segmentation, which some call word cutting. For 我是一个学生, the result of segmentation is: 我 是 一个 学生.
Chinese word segmentation technology
Chinese word segmentation belongs to natural language processing. Given a sentence, a person can tell from his or her own knowledge which strings are words and which are not, but how can a computer be made to understand this? The procedure that does so is the word segmentation algorithm.
Existing word segmentation algorithms fall into three broad classes: methods based on string matching, methods based on understanding, and methods based on statistics.
1, Segmentation based on string matching
This approach, also called mechanical segmentation, matches the Chinese character string to be analyzed against the entries of a "sufficiently large" machine dictionary according to a certain strategy; if a string is found in the dictionary, the match succeeds (a word is recognized). By scanning direction, string-matching segmentation divides into forward matching and reverse matching; by which length is matched first, into maximum (longest) matching and minimum (shortest) matching; and by whether it is combined with part-of-speech tagging, into pure segmentation and integrated segmentation-and-tagging. Several commonly used mechanical segmentation methods are:
1) forward maximum matching (left to right);
2) reverse maximum matching (right to left);
3) minimum segmentation (minimizing the number of words cut from each sentence).
These methods can also be combined; for example, forward maximum matching and reverse maximum matching can be combined into bidirectional matching. Because single Chinese characters can themselves be words, forward minimum matching and reverse minimum matching are rarely used. In general, reverse matching segments slightly more accurately than forward matching and runs into fewer ambiguities. Statistics show an error rate of 1/169 for forward maximum matching used alone and 1/245 for reverse maximum matching used alone. This accuracy is still far from meeting practical needs. Segmentation systems in actual use all treat mechanical segmentation as a first pass and improve accuracy further by drawing on other kinds of linguistic information.
One approach is to improve the scanning method, known as feature scanning or marker segmentation: words with obvious features are identified and cut out of the string first, and with these words as breakpoints the original string is divided into shorter strings that are then segmented mechanically, which lowers the matching error rate. Another approach combines segmentation with part-of-speech tagging: rich part-of-speech information helps the segmentation decisions, and the tagging process in turn checks and adjusts the segmentation result, which greatly improves accuracy.
A general model can be built for mechanical segmentation methods; specialist academic papers cover this, so it is not discussed in detail here.
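Forward and reverse maximum matching, as described above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the tiny dictionary, the sample sentence and the `max_len` limit are chosen for the example and do not come from the patent, and a character that matches no dictionary entry is emitted as a single-character word.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Scan left to right, always taking the longest dictionary word available."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when nothing longer matches.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def reverse_max_match(text, dictionary, max_len=4):
    """Scan right to left, always taking the longest dictionary word available."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = text[j - size:j]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                j -= size
                break
    words.reverse()  # words were collected from the end of the sentence
    return words


# Illustrative dictionary and sentence (assumptions, not from the source).
dictionary = {"研究", "研究生", "生命", "起源"}
sentence = "研究生命起源"
forward = forward_max_match(sentence, dictionary)
backward = reverse_max_match(sentence, dictionary)
print(forward)   # ['研究生', '命', '起源']
print(backward)  # ['研究', '生命', '起源']
```

On this sentence the forward scan greedily takes 研究生 and strands 命, while the reverse scan recovers 研究 / 生命 / 起源. A bidirectional matcher would run both and flag the sentence as ambiguous because the two results differ.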
2, Segmentation based on understanding
This approach has the computer simulate a person's understanding of the sentence in order to recognize words. Its basic idea is to perform syntactic and semantic analysis during segmentation and use syntactic and semantic information to resolve ambiguity. It usually has three parts: a segmentation subsystem, a syntactic-semantic subsystem and a master control part. Coordinated by the master control part, the segmentation subsystem obtains syntactic and semantic information about the relevant words and sentences and uses it to judge segmentation ambiguities; that is, it simulates the way a person understands a sentence. This approach needs a large amount of linguistic knowledge and information. Because Chinese linguistic knowledge is general and complex, it is hard to organize all of it into a form machines can read directly, so understanding-based segmentation systems are still at the experimental stage.
3, Segmentation based on statistics
Formally, a word is a stable combination of characters, so the more often adjacent characters occur together in context, the more likely they are to form a word. The frequency or probability with which characters co-occur adjacently therefore reflects quite well how credible it is that they form a word. One can count the frequency of each combination of adjacently co-occurring characters in a corpus and compute their mutual information. The mutual information of two characters is defined from the adjacent co-occurrence probability of the two Chinese characters X and Y, and it expresses how tightly the characters are bound together. When it exceeds a certain threshold, the character group may be considered a word. This method only needs the character-group frequencies in the corpus and no segmentation dictionary, so it is also called dictionary-free segmentation or statistical word extraction. It has limitations, however: it often extracts common character groups that co-occur frequently but are not words, such as the Chinese groups translated as "this", "one of", "having", "I" and "many"; its recognition accuracy for common words is poor; and its time and space costs are high. Statistical segmentation systems in practical use all keep a basic segmentation dictionary (of common words) for string-matching segmentation and use statistical methods to recognize new words. Combining string frequency statistics with string matching in this way keeps the speed and efficiency of matching-based segmentation while gaining the ability of dictionary-free segmentation to recognize new words from context and resolve ambiguity automatically.
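The text does not give a formula for mutual information, so the sketch below assumes the common pointwise form, log2(P(XY) / (P(X) P(Y))), estimated from raw counts. The three-sentence corpus and the function name are illustrative assumptions only.

```python
import math
from collections import Counter


def mutual_information(corpus):
    """Pointwise mutual information of every adjacent character pair in the corpus."""
    chars = Counter()
    pairs = Counter()
    for sentence in corpus:
        chars.update(sentence)                      # single-character counts
        pairs.update(zip(sentence, sentence[1:]))   # adjacent-pair counts
    n_chars = sum(chars.values())
    n_pairs = sum(pairs.values())
    result = {}
    for (x, y), count in pairs.items():
        p_xy = count / n_pairs
        p_x = chars[x] / n_chars
        p_y = chars[y] / n_chars
        result[x + y] = math.log2(p_xy / (p_x * p_y))
    return result


# Tiny illustrative corpus (an assumption, not data from the source).
corpus = ["我是学生", "学生很多", "我很好"]
mi = mutual_information(corpus)
for pair, value in sorted(mi.items(), key=lambda item: item[1], reverse=True):
    print(pair, round(value, 3))
```

Here the real word 学生 scores higher than the accidental pair 生很, but pairs containing rare characters also score high. This reflects the limitation noted above: on its own the statistic also picks up groups that are not words, which is why practical systems combine it with a dictionary.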
Which kind of segmentation algorithm is more accurate has not been settled. No mature segmentation system can rely on a single algorithm; all need to combine different algorithms. As the author understands it, the segmentation algorithm of Hailiang Technology (海量科技) uses a "compound segmentation method". "Compound" here corresponds to the compound prescription of Chinese medicine, in which different medicines are combined to treat an illness; likewise, recognizing Chinese words needs several algorithms to handle different problems.
Difficulties in word segmentation
With mature segmentation algorithms available, is the Chinese word segmentation problem easily solved? Far from it. Chinese is a very complex language, and making a computer understand it is harder still. Two major difficulties in Chinese word segmentation have never been fully overcome.
1, Ambiguity recognition
Ambiguity means that the same sentence can be cut in two or more ways. For example, in 表面的, both "表面" and "面的" are words, so the phrase can be cut as "表面 的" or as "表 面的". This is called overlapping ambiguity. Overlapping ambiguity is very common; the "和服" (kimono) example mentioned earlier is in fact an error caused by it: "化妆和服装" can be cut as "化妆 和 服装" or as "化妆 和服 装". Without human knowledge to draw on, a computer can hardly tell which cut is correct.
Compared with combination ambiguity, overlapping ambiguity is relatively easy to handle; combination ambiguity has to be judged from the whole sentence. For example, in the sentence "这个门把手坏了" (this door handle is broken), "把手" is a word, but in "请把手拿开" (please take your hand away), "把手" is not; in "将军任命了一名中将" (the general appointed a lieutenant general), "中将" is a word, but in "产量三年中将增长两倍" (output will grow twofold within three years), "中将" no longer is. How is a computer to recognize such words?
Even if a computer could resolve both overlapping and combination ambiguity, one more difficulty remains: true ambiguity. True ambiguity means that, given a sentence, even a person cannot tell which strings should be words. For example, "乒乓球拍卖完了" can be cut as "乒乓 球拍 卖 完 了" (the table tennis bats are sold out) or as "乒乓球 拍卖 完 了" (the table tennis ball auction is over); without surrounding sentences as context, probably no one can say whether "拍卖" counts as a word here.
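One way to see why a dictionary alone cannot settle such cases is to enumerate every segmentation the dictionary allows. The Python sketch below does this recursively for the table tennis sentence; the small dictionary is an assumption chosen to cover just this example.

```python
def all_segmentations(text, dictionary):
    """Return every way to split text into words that all appear in the dictionary."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in dictionary:
            for tail in all_segmentations(text[end:], dictionary):
                results.append([head] + tail)
    return results


# Illustrative dictionary (an assumption, not from the source).
dictionary = {"乒乓", "乒乓球", "球拍", "拍卖", "卖", "完", "了"}
segmentations = all_segmentations("乒乓球拍卖完了", dictionary)
for words in segmentations:
    print(" ".join(words))
```

Both readings come back as fully valid dictionary segmentations, so choosing between them needs information from outside the dictionary, such as context or statistics.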
2, New word recognition
New words are known technically as unregistered (out-of-vocabulary) words: words not included in the dictionary that nevertheless really are words. The most typical are personal names. A person easily sees that in the sentence "王军虎去广州了" (Wang Junhu has gone to Guangzhou), "王军虎" is a word because it is someone's name, but for a computer this is hard. If "王军虎" were entered into the dictionary as a word, there are so many names in the world, with new ones appearing all the time, that collecting them would itself be a huge project. And even if that work could be completed, problems would remain: in the sentence "王军虎头虎脑的" (Wang Jun looks sturdy and simple-hearted), can "王军虎" still count as a word?
Besides personal names, new words include names of organizations, places, products and trademarks, as well as abbreviations and elliptical forms, all of which are hard to handle and are also exactly the words people use often. For a search engine, therefore, new word recognition in the segmentation system is very important, and new word recognition accuracy has become one of the key measures of a segmentation system's quality.
The application of Chinese word segmentation
In natural language processing technique, the Chinese language processing technology is than the backward very big segment distance of western language treatment technology at present, and the processing method Chinese of many western languages can not directly adopt, exactly because Chinese must have this procedure of participle.Chinese word segmentation is the basis of other Chinese information processing, and search engine is an application of Chinese word segmentation.Other such as Machine Translation (MT), phonetic synthesis, classification automatically, autoabstract, check and correction or the like automatically, all need to use participle.Because Chinese needs participle, may influence some researchs, but,, at first also be to solve the Chinese word segmentation problem because external Computer Processing technology wants to enter Chinese market simultaneously also for some enterprises bring chance.Aspect Chinese research, compare the foreigner, Chinese have obvious advantages.
Segmentation accuracy is very important to a search engine, but if segmentation is too slow, it is unusable for a search engine no matter how accurate it is. A search engine must process hundreds of millions of web pages, and excessive segmentation time seriously slows the updating of search engine content. For a search engine, therefore, both the accuracy and the speed of segmentation must meet very high requirements. At present, Chinese word segmentation is studied mostly by research institutions: Tsinghua University, Peking University, the Chinese Academy of Sciences, Beijing Language Institute, Northeastern University, IBM Research, Microsoft Research China, and others all have their own research teams, whereas almost no commercial company other than Hylanda (海量科技) truly specializes in Chinese word segmentation. Most of the technology developed by research institutions cannot be commercialized quickly, and the strength of a single specialized company is limited, so Chinese word segmentation still has a long way to go before it serves more products well.
Summary of the invention:
The object of the invention is to address the deficiencies of the above conventional technology by providing a method for implementing online user search through an instant messaging tool. The method enables intercommunication between the Web and IM tools, provides multi-channel user communication, provides interaction on MSN and the web, provides online user search, and supports instant search from both MSN and the web, thereby achieving chat interaction between the two and an online-user search engine technology.
To achieve this object, the robot mainly consists of the following modules:
1. MSN basic communication layer:
Mainly responsible for intercommunication with the MSN IM tool; supports the msn8 through msn15 protocols.
For high concurrency, the underlying communication is switched to the higher-performance Mina communication framework.
2. Multi-robot account management layer:
Account management is built in a distributed, clustered mode and is responsible for adding, deleting, enabling, and disabling accounts, for gathering online users, and for collecting user information.
3. Robot session communication layer:
Sessions between users and robots are built in a clustered mode. This layer is responsible for creating users and tracking Web chat sessions.
It mainly uses the XMPP protocol; the server side uses Openfire, and the page embeds a Flash chat client (XIFF).
Presence service
The users of the whole system are divided into MSN users and web users (XMPP). This status service records user status in real time and provides an index of user status.
Tag search engine
4. Business logic layer: processes user-related information.
5. Back-end data service layer:
Hessian is used as the carrier for remote method invocation. The services provided are: data statistics for Xiao Ke users, user RSS management and lookup, collection of MSN moods, binding between users, and caching and regular updating of user information.
Chat queue service
Service notifications within the system. For example, a user's chat request is sent from the NS server to the SB server, and user status is sent to the status server. The chat queue uses ActiveMQ (JMS), which supports message queue persistence and can provide session replication when the system crashes.
6. Website tag service: presents web pages externally.
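The presence service in the module list above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the patent's implementation: the class name `PresenceIndex`, the channel labels "msn" and "web", and the status strings are all hypothetical.

```python
from collections import defaultdict


class PresenceIndex:
    """Records each user's latest status and keeps an index from status to users."""

    def __init__(self):
        self._status = {}                    # (channel, user_id) -> status
        self._by_status = defaultdict(set)   # status -> {(channel, user_id)}

    def update(self, channel, user_id, status):
        """Record a status change and move the user to the right index bucket."""
        key = (channel, user_id)
        old = self._status.get(key)
        if old is not None:
            self._by_status[old].discard(key)
        self._status[key] = status
        self._by_status[status].add(key)

    def status_of(self, channel, user_id):
        """Return the last recorded status; unknown users count as offline."""
        return self._status.get((channel, user_id), "offline")

    def users_with(self, status, channel=None):
        """List users in a given status, optionally restricted to one channel."""
        keys = self._by_status.get(status, ())
        return sorted(k for k in keys if channel is None or k[0] == channel)


presence = PresenceIndex()
presence.update("msn", "alice@example.com", "online")
presence.update("web", "bob", "online")
presence.update("msn", "alice@example.com", "away")
print(presence.users_with("online"))
```

Keeping a second map from status to users means a search for online users does not need to scan every recorded user.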
The invention effectively enables intercommunication between the Web and IM tools, multi-channel user communication, interaction on MSN and the web, and online user search. It supports instant search from both MSN and the web, achieving chat interaction between the two and search of online users.
Description of drawings
Fig. 1 is a business logic diagram of the method for implementing online user search through an instant messaging tool;
Fig. 2 is a structural module diagram of the method for implementing online user search through an instant messaging tool.
Embodiment
A method for implementing online user search through an instant messaging tool is composed of:
1. MSN basic communication layer:
Mainly responsible for intercommunication with the MSN IM tool; supports the msn8 through msn15 protocols.
For high concurrency, the underlying communication is switched to the higher-performance Mina communication framework.
2. Multi-robot account management layer:
Account management is built in a distributed, clustered mode and is responsible for adding, deleting, enabling, and disabling accounts, for gathering online users, and for collecting user information.
3. Robot session communication layer:
Sessions between users and robots are built in a clustered mode. This layer is responsible for creating users and tracking Web chat sessions.
It mainly uses the XMPP protocol; the server side uses Openfire, and the page embeds a Flash chat client (XIFF).
Presence service
The users of the whole system are divided into MSN users and web users (XMPP). This status service records user status in real time and provides an index of user status.
Tag search engine
4. Business logic layer: processes user-related information.
5. Back-end data service layer:
Hessian is used as the carrier for remote method invocation. The services provided are: data statistics for Xiao Ke users, user RSS management and lookup, collection of MSN moods, binding between users, and caching and regular updating of user information.
Chat queue service
Service notifications within the system. For example, a user's chat request is sent from the NS server to the SB server, and user status is sent to the status server. The chat queue uses ActiveMQ (JMS), which supports message queue persistence and can provide session replication when the system crashes.
6. Website tag service: presents web pages externally.
The above forms the structure (Fig. 2).
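The persistence behavior attributed to the chat queue above can be illustrated with a small sketch. It does not use ActiveMQ or JMS; it assumes a plain append-only journal file as a stand-in, so that undelivered notifications survive a simulated crash. The class `JournaledQueue`, the file name, and the message fields are hypothetical.

```python
import json
import os
import tempfile


class JournaledQueue:
    """Queue whose put/ack operations are appended to a journal file,
    so pending messages can be rebuilt after a restart."""

    def __init__(self, path):
        self.path = path
        self.pending = []   # list of (msg_id, body) not yet acknowledged
        if os.path.exists(path):
            self._replay()

    def _replay(self):
        """Rebuild the pending list from the journal, skipping acked messages."""
        messages, acked = {}, set()
        with open(self.path, encoding="utf-8") as fh:
            for line in fh:
                rec = json.loads(line)
                if rec["op"] == "put":
                    messages[rec["id"]] = rec["body"]
                else:
                    acked.add(rec["id"])
        self.pending = [(m, b) for m, b in messages.items() if m not in acked]

    def _append(self, rec):
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(rec) + "\n")

    def put(self, msg_id, body):
        """Persist the message first, then make it available for delivery."""
        self._append({"op": "put", "id": msg_id, "body": body})
        self.pending.append((msg_id, body))

    def ack(self, msg_id):
        """Mark a message as delivered so it is not replayed after a crash."""
        self._append({"op": "ack", "id": msg_id})
        self.pending = [(m, b) for m, b in self.pending if m != msg_id]


journal = os.path.join(tempfile.mkdtemp(), "chat.journal")
queue = JournaledQueue(journal)
queue.put(1, {"type": "chat_request", "from": "NS", "to": "SB", "user": "alice"})
queue.put(2, {"type": "status", "to": "status-server", "user": "alice"})
queue.ack(1)

# Simulate a crash: discard the in-memory queue and reopen the journal.
recovered = JournaledQueue(journal)
print(recovered.pending)
```

A production message broker adds transactions, journal compaction, and consumer management; the sketch only shows why writing to durable storage before delivery makes recovery possible.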
When the MSN terminal user enters any text or sends a nudge (screen flash) (1), a menu is displayed, and the system loads the user's information and creates a session for the user (2). The flow then comprises: search (3), entering the search state (4), the user entering search keywords (5), the WEB side (6), casual chat (7), system word segmentation (8), the search engine searching for online contacts according to the segmentation result (9), the user selecting a contact (10), determining whether the selection is a special character (11), and chatting with the selected contact (12) (Fig. 1).
On the WEB side (6) or at the MSN terminal, after the user enters any text or sends a nudge (1), the interface displays a menu, and the system creates a session for the user and loads the user's information (2). The interface then shows the search page (3) and information related to casual chat, and the user chooses to enter the search state (4). After the user enters search keywords (5), the system segments the keywords (8), the search engine searches for online contacts according to the segmentation result (9), and the results are shown in the interface for the user to select a contact (10). If the contact is on the WEB side, the user proceeds to chat with the selected contact (12). If the contact is on the MSN side, the system automatically determines whether the selection is a special character (11): if so, the user chats with the selected contact (12); if not, system word segmentation is performed again and steps 8, 9, 10, and 11 are repeated (Fig. 1).
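Steps 8 through 12 of the flow above can be sketched as follows. The sketch rests on several assumptions that the text does not confirm: whitespace splitting stands in for the system's word segmentation, the tag index is a plain dictionary from tag to user names, and the "special character" of step 11 is interpreted as a numeric menu choice typed by an MSN user. All function and variable names are hypothetical.

```python
def segment_keywords(query):
    """Stand-in for step 8 (system word segmentation): split on whitespace."""
    return query.lower().split()


def search_online_contacts(query, tag_index, online_users):
    """Step 9: rank online users by how many query terms match their tags."""
    hits = {}
    for term in segment_keywords(query):
        for user in tag_index.get(term, ()):
            if user in online_users:
                hits[user] = hits.get(user, 0) + 1
    # Most matching terms first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda u: (-hits[u], u))


def handle_msn_reply(reply, candidates):
    """Steps 10 to 12 on the MSN side: a valid menu number starts a chat,
    anything else is treated as new keywords and the search repeats."""
    reply = reply.strip()
    if reply.isascii() and reply.isdigit() and 1 <= int(reply) <= len(candidates):
        return ("chat", candidates[int(reply) - 1])
    return ("search", reply)


# Hypothetical data: tags attached to users, and the set of users now online.
tag_index = {"music": {"alice", "bob"}, "shanghai": {"bob", "carol"}}
online = {"alice", "bob"}

results = search_online_contacts("Shanghai music", tag_index, online)
print(results)
print(handle_msn_reply("1", results))
print(handle_msn_reply("hiking", results))
```

On the WEB side the selection would come from a click on the results page, so only the MSN branch needs the check on the typed reply.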

Claims (7)

1. A method for implementing online user search through an instant messaging tool, comprising an MSN basic communication module, a multi-robot account management module, a robot session communication module, a business logic module, and a background database service module, characterized in that the users of the whole system are divided into MSN users and web users (XMPP); sessions between users and robots are built online in a distributed, clustered mode; the business logic module operates the background database service module with the session results; and user status is recorded and indexed in real time.
2. The method for implementing online user search through an instant messaging tool according to claim 1, characterized in that the MSN basic communication module is responsible for intercommunication with the MSN IM tool and supports one or more of the msn8 through msn15 protocols.
3. The method for implementing online user search through an instant messaging tool according to claim 1, characterized in that the multi-robot account management module adds, deletes, enables, and disables accounts, gathers online users, and collects user information.
4. The method for implementing online user search through an instant messaging tool according to claim 1, characterized in that the robot session communication module creates and tracks users.
5. The method for implementing online user search through an instant messaging tool according to claim 1, characterized in that, when the user is a WEB user, the XMPP protocol is used, the server side uses Openfire, and the page embeds a Flash chat client (XIFF).
6. The method for implementing online user search through an instant messaging tool according to claim 1, characterized in that the background database service module uses Hessian as the carrier for remote invocation by the business logic layer.
7. The method for implementing online user search through an instant messaging tool according to claim 1, characterized in that the background database service module provides user data statistics, user RSS management and lookup, collection of MSN moods, binding between users, caching of user information, and regular updating of user information.
CN2009100510963A 2009-05-13 2009-05-13 Method for implementing on-line user search through instant messenger Pending CN101888345A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009100510963A CN101888345A (en) 2009-05-13 2009-05-13 Method for implementing on-line user search through instant messenger

Publications (1)

Publication Number Publication Date
CN101888345A true CN101888345A (en) 2010-11-17

Family

ID=43074073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009100510963A Pending CN101888345A (en) 2009-05-13 2009-05-13 Method for implementing on-line user search through instant messenger

Country Status (1)

Country Link
CN (1) CN101888345A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20160406

C20 Patent right or utility model deemed to be abandoned or is abandoned