CN103092880B - Method and system for tagging raw data produced by objects in the Internet of Things - Google Patents

Info

Publication number
CN103092880B
Authority
CN
China
Prior art keywords
web message
curve
relevant
address information
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110347155.9A
Other languages
Chinese (zh)
Other versions
CN103092880A (en
Inventor
吴贤
蔡柯柯
张硕
夏立军
姚剑
张俐
苏中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to CN201110347155.9A priority Critical patent/CN103092880B/en
Priority to DE102012218966.1A priority patent/DE102012218966B4/en
Priority to GB1218783.7A priority patent/GB2496268A/en
Priority to US13/661,628 priority patent/US8983926B2/en
Publication of CN103092880A publication Critical patent/CN103092880A/en
Application granted granted Critical
Publication of CN103092880B publication Critical patent/CN103092880B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The present disclosure relates to a method and system for tagging raw data produced by objects in the Internet of Things. The method includes: performing relevance detection on acquired Web messages to obtain Web messages relevant to various events; obtaining the address information contained in the relevant Web messages; determining, based on the obtained address information, the objects close to those events; and using at least part of the content of the relevant Web messages as metadata to tag the raw data produced by the determined close objects. With the present invention, natural-language metadata can be added to raw data from various objects that is otherwise hard for humans to understand, so that the data can be retrieved and mined using natural language.

Description

Method and system for tagging raw data produced by objects in the Internet of Things
Technical field
The present disclosure relates to data processing technology, and in particular to a method and system for tagging raw data produced by objects in the Internet of Things.
Background art
The Internet of Things (IoT) is regarded as a major revolution of the Internet. The IoT equips real-world things such as streets, highways, buildings, water systems, and household appliances with objects such as sensor devices, links them through the Internet, and runs specific programs to achieve remote control or direct thing-to-thing communication. The IoT extends the scope of connected objects from electronic devices to all kinds of objects in the real world: radio-frequency identification (RFID) tags, sensors, two-dimensional codes, and the like attached to objects are connected to wireless networks through interfaces, enabling communication and dialogue between people and objects as well as among objects. For example, in the not-too-distant future, household appliances, hospital equipment, and even T-shirts may be networked and accessed over the network, just like web pages and remote servers. As a result, all real-world objects can be monitored and operated over the network, and their actions can be programmed for human convenience.
In the IoT, given an event, finding the sensors that recorded the relevant information is a problem. For example, given the query "rear-end collision", how can the cameras that recorded the event be found? Such IoT search is a very important application of the IoT. Unlike search on the current World Wide Web, building an IoT search engine faces the following challenges:
First, the number of objects in the real world is of exponential magnitude. The Internet of objects would encode 50 trillion to 100 trillion objects, and every person is surrounded by 1,000 to 5,000 objects. For current search engines, such a huge amount of data is an unbearable burden. According to statistics, in 2008 the search engine Google indexed only 1,000,000,000 web pages.
Second, the raw data obtained by the various objects in the IoT may take the form of images, video, audio, numerical data sequences, wavelets, and so on. There is essentially no metadata describing the semantics of this raw data, and computers themselves cannot understand the content of these data files. In other words, the raw data can hardly convey human viewpoints and emotions, and humans can hardly understand the raw data. Faced with abundant raw data, people find it difficult to query relevant information in natural language or to mine the associations among the raw data.
Techniques for deep processing of raw data currently exist, but because the total number of objects such as sensors in the IoT is huge, using deep processing such as computational image techniques to extract semantic annotations is computationally unaffordable. Moreover, even with deep processing, the flexibility of applications such as queries would require building a large number of models to handle the various applications, which is also impractical.
Fig. 1 is a schematic diagram of the problem in the prior art between actual applications and the raw data produced by objects. As shown in Fig. 1, a user queries sensor data on the network in human language. However, even though a large number of raw data files exist, there is a huge gap between the user's natural-language query and the raw data files of the sensors, and the raw data files have almost no metadata describing their semantics, so the user cannot obtain the desired query results. How to connect natural-language queries with raw data, so that the data can be searched and mined and data associations can be discovered, is therefore a technical problem in the prior art.
Accordingly, the prior art needs a technique that tags the raw data produced by objects in the IoT so that the data can be processed further.
Summary of the invention
The present disclosure is proposed to solve at least one of the above problems in the prior art. An embodiment according to one aspect of the present disclosure provides a technical solution that uses Web messages to tag raw data, so that the raw data has metadata describing its semantics, which helps in understanding the content of the raw data.
The inventors have observed that Web messages such as blogs and microblogs are widely used. A "Web message" as used herein refers to content transmitted over a network that has popularity and relevance. "Popularity" means that the content of Web messages is highly varied, covering all kinds of things that occur in the real world as well as human thoughts, and that users can publish Web messages on the network at any time using various devices such as mobile or fixed terminals. A Web message may include text, documents, icons, photos, audio, video, and so on. "Relevance" means that the content of a Web message is related to an event of interest: for example, if the difference between the publication time of the Web message and the occurrence time of the event of interest is within a predetermined range, and both concern similar events, the Web message is considered relevant to the event of interest. In addition, for the purposes of the present invention, a Web message is one that carries the address information of the user at the time the message was sent.
A microblog is a typical example of a Web message. A microblog is a form of blog that allows users to promptly post brief texts (usually fewer than 140 characters) that can be published openly. Microblog services include, for example, Twitter, Yahoo, Sina, Sohu, and 163.
Microblogs are flourishing and have attracted a large number of users. According to statistics from April 2010, Twitter, a representative microblog website, had more than 100 million registered users and gained more than 300,000 new users every day. On average, more than 55 million Twitter microblogs were published every day, covering all kinds of content. Of all these Twitter microblogs, more than 37% were published through mobile devices, and for most of them the actual publishing position could also be obtained.
Because Web messages are widely used (in other words, they have relevance and popularity) and are location-aware, the inventors conceived of using Web messages to enrich the semantics of sensor data. Specifically, the present invention identifies the relationship between Web messages and sensors, and then assigns at least part of the content of the relevant Web messages as tags that annotate the semantics of the sensor data. This bridges the gap between human understanding and the raw data obtained by objects, thereby solving the problem in the prior art. Furthermore, these semantic tags can support search and data-mining tasks on sensor data, as well as other applications of the raw data.
The embodiments of the present disclosure can be implemented in various ways, including as a method or a system. Several embodiments of the present disclosure are discussed below.
As the method for method of the initial data that a kind of labelling is produced by the object in Internet of Things, disclosed by the invention one Individual embodiment at least includes: the Web message obtained carries out correlation detection to obtain the Web message relevant to various events; Obtain the address information that described relevant Web message is comprised;Determine and described various events based on the address information obtained Close object;And use at least part of content of described relevant Web message as metadata, labelling by determined by connect The initial data that nearly object produces.
As the system of the initial data that a kind of labelling is produced by the object in Internet of Things, an enforcement disclosed by the invention Example at least includes: for the Web message obtained carrying out correlation detection to obtain the dress of the Web message relevant to various events Put;For obtaining the device of the address information that described relevant Web message is comprised;For true based on the address information obtained The device of the fixed object close with described various events;And for using at least part of content of described relevant Web message As metadata, labelling by determined by the device of initial data that produces close to object.
As a kind of method searching for object in Internet of Things, an embodiment disclosed by the invention at least includes: use Natural language input inquiry item;And use described query term, and metadata based on the object in Internet of Things, produce search knot Really;Wherein said metadata makes to produce in aforementioned manners.
As a kind of equipment searching for object in Internet of Things, an embodiment disclosed by the invention at least includes: be used for Use the device of natural language input inquiry item;And be used for using described query term, first number based on the object in Internet of Things According to, produce the device of Search Results;Wherein said metadata is to use said system to produce.
As the search engine used on a kind of network, an embodiment disclosed by the invention at least includes: be used for receiving The module of user's input;Said system;And for retrieving according to user's input and the information produced by described equipment Module.
Brief description of the drawings
The drawings referenced in this description only illustrate exemplary embodiments of the present invention and should not be regarded as limiting its scope.
Fig. 1 is a schematic diagram of the problem in the prior art between actual applications and the raw data produced by objects.
Fig. 2 is a flow chart of a method for tagging raw data produced by objects in the Internet of Things according to an embodiment of the present disclosure.
Fig. 3 is a schematic diagram of the curves obtained by curve fitting based on the address information of the Web messages sent by each user, according to an embodiment of the present invention.
Fig. 4 is a block diagram of a system for tagging raw data produced by objects in the Internet of Things according to an embodiment of the present disclosure.
Fig. 5 is a flow chart of an example search process implemented according to an embodiment of the present invention.
Fig. 6 is a block diagram of a search engine implemented according to an embodiment of the present invention.
Detailed description
In the following discussion, numerous specific details are provided to help in thoroughly understanding the present invention. It will be apparent to those skilled in the art, however, that the invention can be understood even without these details. It should also be appreciated that any specific terminology used below is for convenience of description only; the invention should therefore not be limited to use solely in any specific application identified and/or implied by such terminology.
According to an embodiment of the present disclosure, at least one problem in the prior art is solved by identifying the relationship between Web messages and objects in the Internet of Things, and then assigning at least part of the content of the relevant Web messages as tags that annotate the semantics of the raw data produced by the corresponding objects. Furthermore, these semantic tags can support search and data-mining tasks on sensor data and other applications of the raw data, for example querying raw data in natural language.
Note that the term "object" herein refers to any apparatus, device, equipment, or system that can produce data and transmit the produced data to other objects. For example, an object may be a sensing device such as a radio-frequency identification (RFID) tag, a reader, a two-dimensional code, a camera, or a sensor. An object may also be a standalone device equipped with an RFID tag, reader, two-dimensional code, camera, sensor, or the like, such as a notebook computer with an RFID tag, a refrigerator with a temperature sensor, or a T-shirt with a two-dimensional code.
Fig. 2 shows a process 200 for tagging raw data produced by objects in the Internet of Things according to an embodiment of the present disclosure.
In step 202, process 200 begins.
In step 204, relevance detection is performed on the received Web messages to obtain the Web messages relevant to the event of interest. Step 204 can be implemented with more than one filtering step. According to an embodiment of the present disclosure, it can include two filtering steps:
(1) Content-based filtering:
Step 204 can include a content filtering step that filters out all Web messages whose content is relevant and discards the other messages. Because objects are tagged with information related to the events the objects record, content-based filtering can find the entries whose content matches among a large number of Web messages, according to preset options (for example, a list of the most common user query options, a list of hot events, a list of traffic events, or a list of the most frequent keywords). This can be implemented using an inverted-list technique based on keyword matching.
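The content filtering step can be sketched as follows. This is a minimal illustration only, assuming a small preset keyword set and whitespace tokenization; `PRESET_KEYWORDS`, `tokenize`, and `content_filter` are hypothetical names that do not come from the patent.

```python
# Hypothetical preset option list (for example, frequent traffic-event keywords).
PRESET_KEYWORDS = {"rear-end", "collision", "congestion"}

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return [w.strip(".,!?\"'").lower() for w in text.split()]

def content_filter(messages, keywords=PRESET_KEYWORDS):
    """Keep only messages that share at least one token with the keyword set."""
    return [m for m in messages if keywords & set(tokenize(m))]

messages = [
    "Saw a four-car rear-end collision, awful!",
    "I am watching a film now.",
]
relevant = content_filter(messages)
print(relevant)
```

For Chinese text, the whitespace split would have to be replaced with a proper word segmenter, as the description notes later.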
(2) Time-based filtering:
Step 204 can include a time filtering step that filters out all Web messages that are relevant in time and discards the other messages. Time-based filtering can include the following two steps:
2.1 Filtering based on publication time: only Web messages whose publication time is related to the occurrence time of the event of interest are retained. This step keeps, from the received Web messages, those whose publication time falls within a predetermined range of the time at which the event of interest occurred, and discards the Web messages whose time does not match. For example, suppose the event of interest occurred at about 8:00 a.m. on a given day. The time filtering step then retains only the Web messages published between 7:30 and 8:30 on that day.
A time range is needed because the user publishing the Web message may be moving, so there is a time difference between when the user sees the event and when the user actually publishes the message; because the user may publish the relevant Web message only some time after seeing the event; or because of delays caused by network congestion, an unstable wireless network, and so on. The predetermined range can be a default value or can be set by the user or the system.
2.2 Timeliness filtering: on top of publication-time filtering, timeliness filtering is applied so that only Web messages published within the specified time range that describe the current situation are retained. For example, a Web message published after 8:00 a.m. on the day may contain content such as "XX happened yesterday". Such content is clearly not freshly published information but outdated information, and should be filtered out. Content such as "XX just happened" is timely information and should be retained.
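The publication-time filter of step 2.1 can be sketched as below. The message layout (a dict with `text` and `published` keys) and the 30-minute default window are assumptions made for illustration, chosen to match the 7:30 to 8:30 example above.

```python
from datetime import datetime, timedelta

def publish_time_filter(messages, event_time, window=timedelta(minutes=30)):
    """Keep messages published within +/- window of the event time."""
    return [m for m in messages if abs(m["published"] - event_time) <= window]

event_time = datetime(2011, 9, 23, 8, 0)
messages = [
    {"text": "rear-end collision just happened",
     "published": datetime(2011, 9, 23, 7, 56)},
    {"text": "rear-end collision this morning",
     "published": datetime(2011, 9, 23, 11, 0)},
]
kept = publish_time_filter(messages, event_time)
print([m["text"] for m in kept])
```

The window is a parameter because the description allows it to be a default or to be set by the user or the system.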
The timeliness filtering step can be implemented by combining existing word segmentation and classification techniques. According to one aspect of the present invention, a content filtering engine that combines existing word segmentation and classification processes is proposed. For example, 2,000 Web messages can first be selected and manually classified as present, past, future, or other. Each sentence in each Web message is first segmented into words. For example, suppose a Web message contains only the sentence "I am watching a film now." After segmentation it becomes "I / now / am / watching / film".
Each word is used as a feature, and a classifier is built with a machine learning algorithm such as SVM (Support Vector Machine) or ME (Maximum Entropy). Web messages that have not been manually labeled can then be identified automatically by this classifier and labeled as one of present, past, future, or other. Only the Web messages labeled as present are retained; the rest are deleted. Note, however, that the invention is not limited to the above processing; those skilled in the art can use other word segmentation and classification techniques according to their needs.
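The classification step might look like the sketch below. To stay self-contained it swaps in a small multinomial naive Bayes classifier with add-one smoothing in place of the SVM or ME algorithms named above, and it uses whitespace-split English tokens and a toy five-message training set; all names and data here are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (tokens, label). Returns per-label word and label counts."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def classify(model, tokens):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for t in tokens:
            score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy manually labeled messages (the description suggests about 2,000).
training = [
    ("i am watching a film now".split(), "present"),
    ("a collision just happened here".split(), "present"),
    ("the collision happened yesterday".split(), "past"),
    ("i watched a film last week".split(), "past"),
    ("i will watch a film tomorrow".split(), "future"),
]
model = train(training)
label = classify(model, "a crash just happened now".split())
print(label)
```

Only messages classified as "present" would be passed on to the next step.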
Although the filtering steps are illustrated and described above in a particular order, those skilled in the art will appreciate that the invention is not limited to this order; content-based filtering and time-based filtering can be performed in any order as required.
In step 206, address information detection is performed on the Web messages to obtain the address information of the Web messages that contain it, and the Web messages that do not contain address information are discarded.
Note that not every Web message includes address information; the user publishing a Web message can choose whether to disclose his or her current address. If the user chooses to disclose it, the published Web message includes address information; otherwise it does not.
Address information usually takes the form of GPS address data, but when a third-party service is used, the address information of a Web message may also be a textual description, such as "the intersection of XX Street and YY Street". The address information at the time the user published the message can be obtained through the API provided by the Web browser. If the obtained address information is a textual description, then according to an embodiment of the present disclosure the description needs to be converted into GPS address data. This conversion can use conversion tools from the prior art and is not described further here.
According to another embodiment of the present invention, the address information can be extracted from the content of the Web message and then converted into GPS address data. For example, a Web message might read: "There is now a traffic jam at the intersection of Chongwenmen Street and Chang'an Street; vehicles are moving slowly." The address information "the intersection of Chongwenmen Street and Chang'an Street" can be extracted from this message. Combined with existing map information, this address information can then be converted into GPS address data.
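Step 206 can be sketched as a lookup that prefers explicit GPS data and otherwise falls back to a place name found in the message text. The `GAZETTEER` table, its placeholder coordinates, and the message layout are all assumptions for illustration; a real system would rely on an existing geocoding tool, as the description says.

```python
# Hypothetical gazetteer mapping place descriptions to (lat, lon).
# The coordinates are illustrative placeholders, not surveyed values.
GAZETTEER = {
    "chongwenmen street and chang'an street": (39.90, 116.41),
}

def resolve_address(message):
    """Return (lat, lon) for a message, or None if it has no address."""
    if message.get("gps") is not None:      # explicit GPS data wins
        return message["gps"]
    text = message.get("text", "").lower()
    for place, coords in GAZETTEER.items():
        if place in text:                   # textual description found
            return coords
    return None

def attach_addresses(messages):
    """Keep only messages with a resolvable address (step 206)."""
    located = []
    for m in messages:
        coords = resolve_address(m)
        if coords is not None:
            located.append({**m, "gps": coords})
    return located

messages = [
    {"text": "Traffic jam at the intersection of Chongwenmen Street and Chang'an Street."},
    {"text": "Rear-end collision ahead.", "gps": (39.94, 116.37)},
    {"text": "Nice weather today."},
]
located = attach_addresses(messages)
print(len(located))
```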
In step 208, based on the address information of the obtained Web messages, the objects in the IoT that are close to the event of interest are detected.
To those skilled in the art, the position (for example, the GPS address data) of each object in the IoT is known. The objects related to the event of interest can be determined from the address information of the Web messages and the known position information of the objects. For example, the object with the smallest straight-line distance to the message address is determined to be the close object.
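The simple straight-line rule can be sketched as below. The sketch uses the haversine great-circle distance as a stand-in for "straight-line distance" between GPS coordinates; the object names and coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def nearest_object(message_pos, objects):
    """Return the name of the object closest to the message position."""
    return min(objects, key=lambda name: haversine_km(message_pos, objects[name]))

# Hypothetical known object positions.
objects = {"cam_a": (39.900, 116.410), "cam_b": (39.950, 116.300)}
closest = nearest_object((39.901, 116.411), objects)
print(closest)
```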
However, as noted earlier, the user may be moving, or may send the Web message only some time after seeing the event, by which time his or her position has changed. The position from which the user sent the Web message may therefore differ from the position of the object that may have recorded the event of interest. Consequently, the address information of one or a few Web messages and the known positions of the objects may not be enough to determine the objects that are closest to the event of interest.
According to an embodiment of the present disclosure, it is proposed to use existing curve fitting techniques to determine, from the huge number of objects in the IoT, the objects that are closest to the event of interest.
According to an embodiment of the present disclosure, the proximity detection step can include the following operations:
First step: from the obtained Web messages, extract the address information of the Web messages published by the same user. For example, 100 users may have published relevant messages; for each of them, the address information of the Web messages that user published in the last 6 hours is extracted.
Second step: for each user, perform curve fitting on the address information of the Web messages that user published, to obtain the user's position curve.
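The second step can be illustrated with the simplest possible fit. The sketch below assumes each message carries a timestamp `t` and planar coordinates `(x, y)`, and fits `x(t)` and `y(t)` separately by ordinary least squares with a straight line; the patent does not prescribe a particular fitting method, and higher-order polynomials or splines could be substituted.

```python
def fit_line(ts, vs):
    """Ordinary least-squares fit v = a * t + b (needs two distinct t values)."""
    n = len(ts)
    mean_t, mean_v = sum(ts) / n, sum(vs) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    a = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs)) / var_t
    return a, mean_v - a * mean_t

def fit_position_curve(points):
    """points: list of (t, x, y) for one user. Returns a function t -> (x, y)."""
    ts = [p[0] for p in points]
    ax, bx = fit_line(ts, [p[1] for p in points])
    ay, by = fit_line(ts, [p[2] for p in points])
    return lambda t: (ax * t + bx, ay * t + by)

# Hypothetical positions of one user's messages over time.
points = [(0, 0.0, 0.0), (1, 1.0, 2.1), (2, 2.0, 3.9), (3, 3.0, 6.0)]
curve = fit_position_curve(points)
print(curve(4))
```

Sampling the returned function at a few time values yields a polyline that can be fed to a point-to-curve distance computation.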
Fig. 3 is a schematic diagram of the curves obtained by curve fitting based on the address information of the Web messages sent by each user, according to an embodiment of the present invention. As shown in Fig. 3, each hollow circle represents the address information of one Web message, and each curve is fitted from the address information of the Web messages of a single user. The solid circle in Fig. 3 represents an object in the IoT. Although Fig. 3 shows only one object, the invention is not limited to this; as mentioned above, the number of objects can be much larger and can be chosen by those skilled in the art as required.
Third step: determine the close objects based on the distance relationship between the position data of the objects and each curve.
The following formula can be used to determine the distance relationship between the position data of the objects and the curves. Denote the objects by $x_1, x_2, \ldots, x_M$ and the curves by $D_1, D_2, \ldots, D_N$:
$$\arg\min_{i}\Bigl(\max_{j}\ \operatorname{distance}(x_i, D_j)\Bigr)$$
where $\operatorname{distance}(x_i, D_j)$ is the shortest distance from the $i$-th object to the $j$-th fitted curve; $i$ indexes the objects and takes integer values from 1 to $M$, with $M$ the total number of candidate objects, selected by the user as required; $j$ indexes the fitted curves and takes integer values from 1 to $N$, with $N$ the total number of curves obtained by curve fitting; $\max$ is the function that takes the maximum and $\min$ is the function that takes the minimum.
With the above formula, the largest of an object's distances to the individual curves is taken as that object's characteristic distance, and the object with the smallest characteristic distance among all objects is chosen as the object closest to the event of interest. Furthermore, the objects can be sorted by characteristic distance in ascending order to express how close each object is to the event of interest.
For example, again referring to Fig. 3, suppose the curve fitting yields two curves, 1 and 2, from the address information of user A and one curve, 3, from the address information of user B, and suppose there are multiple objects whose largest distances to the three curves are 5, 3, 5, 6, 9, 8, and so on. The object whose largest distance has the minimum value, 3, is then chosen as the closest object, as shown in Fig. 3.
The greatest advantage of this method is that the formula $\arg\min_{i}(\max_{j}(\operatorname{distance}(x_i, D_j)))$ is simple and standardized in the prior art, and tools that implement it are readily available.
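The min-max selection can be sketched as follows. The sketch assumes planar coordinates and represents each fitted curve as a polyline (a list of sampled points), so that the shortest distance from an object to a curve is the smallest point-to-segment distance; the object names and geometry are invented for illustration.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:                       # degenerate segment
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance(obj, curve):
    """distance(x_i, D_j): shortest distance from an object to a polyline."""
    return min(point_segment_distance(obj, a, b) for a, b in zip(curve, curve[1:]))

def feature_distance(obj, curves):
    """Characteristic distance: the largest distance to any curve."""
    return max(distance(obj, c) for c in curves)

def rank_objects(objects, curves):
    """Object names sorted by ascending characteristic distance."""
    return sorted(objects, key=lambda name: feature_distance(objects[name], curves))

curves = [[(0, 0), (10, 0)], [(0, 2), (10, 2)], [(5, -5), (5, 5)]]
objects = {"cam_1": (5, 1), "cam_2": (0, 6), "cam_3": (9, 1)}
ranking = rank_objects(objects, curves)
print(ranking)
```

The first entry of the ranking is the arg min of the formula; the full ranking corresponds to the ascending ordering by characteristic distance described above.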
Of course, the invention is not limited to this; those skilled in the art can use other distance formulas according to their needs. For example, the minimum average distance can be used: the mean of an object's distances to the individual curves is taken as its characteristic distance, and the object with the smallest characteristic distance is chosen as the closest object. The minimum squared maximum distance can also be used: the square of the largest of an object's distances to the individual curves is taken as its characteristic distance, and the object with the smallest characteristic distance is chosen as the closest object.
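The alternative characteristic distances can be compared with a small sketch that works on precomputed object-to-curve distances. The `AGGREGATORS` table, the function name, and the sample distances are assumptions for illustration.

```python
from statistics import mean

# Ways of turning an object's per-curve distances into one characteristic distance.
AGGREGATORS = {
    "max": max,                              # formula used in the main embodiment
    "mean": mean,                            # minimum average distance variant
    "max_squared": lambda ds: max(ds) ** 2,  # squared maximum distance variant
}

def closest(distances, how="max"):
    """distances: {object name: [distance to each curve]}. Returns the closest name."""
    agg = AGGREGATORS[how]
    return min(distances, key=lambda name: agg(distances[name]))

# Hypothetical distances of two cameras to three fitted curves.
distances = {"cam_1": [1.0, 1.0, 7.0], "cam_2": [4.0, 4.0, 4.0]}
print(closest(distances, "max"), closest(distances, "mean"))
```

Because distances are non-negative, squaring the maximum preserves the ordering given by the maximum itself, so those two variants select the same object; the mean can select a different one, as this example shows.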
In step 210, at least part of the Web message is used to tag the raw data of the determined close objects.
For example, at 7:56 on September 23, 2011, a user published the Web message "Saw a four-car rear-end collision, how awful!", and the closest camera is the camera at the west entrance of Xinjiekou. Then "rear-end collision" from the Web message and the time "2011/9/23 7:56" can be used as metadata to tag the raw data file vsd.vso captured by the camera at the west entrance of Xinjiekou.
Furthermore, the close cameras can be sorted, for example to generate a Web page with the following content:
Rear-end collision    2011/9/23 7:56    Xinjiekou west entrance                vsd.vso
                                        Xinjiekou West Street east entrance    vsf.vso
                                        Xinjiekou West Street west entrance    vsg.vso
The user can click on the corresponding video file to view it. Data can also be retrieved with natural-language terms such as "rear-end collision" or "September 23, 2011".
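Tagging and natural-language retrieval can be sketched as below. The `RawDataFile` class, the tag representation (a set of lowercase tokens plus a timestamp string), and the all-terms-must-match search rule are assumptions for illustration; only the file names and the example timestamp come from the text.

```python
from dataclasses import dataclass, field

@dataclass
class RawDataFile:
    name: str                      # raw data file, e.g. a video recording
    source: str                    # the object that produced it
    tags: set = field(default_factory=set)

def tag(raw, keywords, timestamp):
    """Attach message keywords and the message time as metadata."""
    raw.tags.update(k.lower() for k in keywords)
    raw.tags.add(timestamp)

def search(files, query):
    """Return names of files whose tags contain every query term."""
    terms = set(query.lower().split())
    return [f.name for f in files if terms <= f.tags]

files = [
    RawDataFile("vsd.vso", "camera, Xinjiekou west entrance"),
    RawDataFile("vsf.vso", "camera, Xinjiekou West Street east entrance"),
]
tag(files[0], ["rear-end", "collision"], "2011/9/23 7:56")
hits = search(files, "rear-end collision")
print(hits)
```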
In step 212, process 200 ends.
As mentioned above, the number of Web messages is of exponential magnitude. If every execution of process 200 began, at step 204, by processing all Web messages on the network, the time and computational cost required would be large.
According to an embodiment of the present invention, a preprocessing step can be included between step 202 and step 204. The preprocessing step can use existing indexing techniques to index, in real time, all Web messages published on the network; step 204 then uses the index to retrieve the Web messages most relevant to the event of interest.
For example, word segmentation can be applied to each Web message in real time. Based on a keyword library built in advance, it is determined whether at least one keyword appears in the Web message; a link is then established between each Web message in which a keyword appears and that keyword, and the link is stored in the keyword library as an index.
Taking a Web message about a car rear-end collision as an example again, the message is segmented into "car / rear-end collision /". Using "car" and "rear-end collision" as index terms, an inverted list is built, and the message can be retrieved by searching for "car" or "rear-end collision".
This link is then used in step 204 to quickly extract the Web messages related to a keyword for further processing.
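The preprocessing index can be sketched as a small inverted list. The sketch assumes the messages have already been segmented into tokens and that the keyword library is a plain set; the message ids, tokens, and function names are hypothetical.

```python
from collections import defaultdict

def build_index(messages, keywords):
    """messages: {message id: [tokens]}. Maps each keyword to the ids containing it."""
    index = defaultdict(set)
    for msg_id, tokens in messages.items():
        for tok in tokens:
            if tok in keywords:
                index[tok].add(msg_id)
    return index

def lookup(index, *terms):
    """Return the ids of messages that contain any of the given terms."""
    result = set()
    for t in terms:
        result |= index.get(t, set())
    return result

# Hypothetical keyword library and pre-segmented messages.
keywords = {"car", "rear-end collision"}
messages = {
    1: ["car", "rear-end collision"],
    2: ["film", "now"],
}
index = build_index(messages, keywords)
print(lookup(index, "car"))
```

Step 204 would then start from `lookup(...)` results instead of scanning every message on the network.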
In addition, although Fig. 2 uses arrows to indicate the order of the steps, the invention is not limited to this; the steps in Fig. 2 can be performed in other orders. For example, the execution order of steps 204 and 206 can be reversed.
Fig. 4 is a block diagram of a system 400 for tagging raw data produced by objects in the Internet of Things according to an embodiment of the present disclosure.
The system 400 according to an embodiment of the present invention includes a Web message search engine 401, a relevance detector 407, an address information detector 409, a proximity detector 411, and a marker 413. The relevance detector 407 includes a content filter 403 and a time filter 405.
The Web message search engine 401 is optional; it is not essential for implementing the present invention. The Web message search engine 401 indexes, in real time, all Web messages published on the network.
The relevance detector 407 detects the Web messages relevant to various events. The content filter 403 filters out the Web messages whose content is relevant to the events. The time filter 405 filters out the Web messages whose publication time is within a predetermined range of the occurrence time of the events, and performs timeliness filtering to obtain the Web messages published within the specified time range that describe the current situation. The other Web messages are discarded.
The address information detector 409 receives the relevant Web messages from the relevance detector 407 and extracts the address information in these Web messages. The address information can be extracted from the Web message using an API, or it can be filtered out of the content of the Web message. The address information can be in GPS data format or text format. The address information detector 409 can include a converter (not shown) for converting the format of the address information, for example from text format to GPS data format.
The proximity detector 411 determines, based on the address information from the address information detector 409, the object closest to the event that occurred. Specific embodiments have been described in detail above and are not repeated here.
The marker 413 tags the raw data from the determined closest object based on the corresponding Web messages.
According to an embodiment of the present invention, the tagging results can be published as web pages, documents, text, and the like for further processing. For example, a search engine can use the tagging results to perform searches, so as to quickly provide relevant query results to users who query in natural language.
Fig. 5 shows a flow chart of an example search process implemented according to one embodiment of the present invention. Fig. 5 illustrates an application of the invention to querying.
As shown in Fig. 5, a user can use "rear-end collision" to query for rear-end collision events that have occurred. The content filter 403 finds the web pages associated with the keyword "rear-end collision"; these web pages are relevant in content to the user's query condition. The time filter 405 filters out all Web messages that are not within the required time range and processes the remaining Web messages. The time filter 405 also performs immediacy filtering based on the content of the Web messages, so as to filter out Web messages unrelated to the current situation. For example, if the user needs today's rear-end collision events, Web messages containing "yesterday ... rear-end collision" or "long ago ... rear-end collision" are not of interest and are therefore removed.
The address information detector 409 obtains the address information from the remaining Web messages. As described above, the location information of the objects in the IoT is known and is pre-stored in a database. The proximity detector 411 detects the objects related to the event of interest. The marker 413 uses at least part of the Web messages to tag each object, so as to indicate the semantics of the original data that each object obtains. By using these tags, a natural language query can be associated with the original data, so that the user can be given a result such as: "Returned query result: a camera that is monitoring or has monitored the 'rear-end collision'; the user can connect to this camera and browse its data".
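The tagging and lookup step can be sketched as a small in-memory index. The `TagStore` class, its method names, and the all-terms keyword match are assumptions made for illustration; the source does not specify how tags are stored or how natural language queries are matched against them.

```python
from collections import defaultdict


class TagStore:
    """Hypothetical store associating IoT objects with Web-message tags."""

    def __init__(self):
        self._tags = defaultdict(list)

    def tag(self, object_id, message_text, published):
        """Attach (part of) a Web message to an object as metadata."""
        self._tags[object_id].append(
            {"text": message_text, "published": published}
        )

    def search(self, query):
        """Return ids of objects having a tag that contains every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = []
        for object_id, tags in self._tags.items():
            if any(all(t in tag["text"].lower() for t in terms) for tag in tags):
                hits.append(object_id)
        return sorted(hits)
```

With such an index, a query like "rear-end collision" resolves to the cameras whose tags mention it, and the caller can then fetch the original data of those cameras.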
Of course, the user can also mine the associations among the original data on the basis of the tags. For example, all cameras related to a given rear-end collision can be found, in order to obtain data about how that collision unfolded.
Fig. 6 shows a block diagram of a search engine implemented according to one embodiment of the present invention. Fig. 6 illustrates a specific example of an implementation of the invention.
As shown in Fig. 6, the search engine includes the system 400 described with reference to Fig. 4. In addition, the search engine includes a module 601 for receiving user input and a module 602 for performing retrieval according to the user input and the information produced by the system 400. The retrieval results obtained are then returned to the querying user.
The basic idea of the present invention has been described above. Those skilled in the art will appreciate that the invention provides one or more of the following advantages:
Web messages and the IoT can be combined to provide an understandable IoT.
- Web messages are assigned to the relevant "objects"
- Metadata is used to describe what the objects observe
-- it is in natural language rather than quantitative data, images, video, etc.;
-- it conveys emotions and opinions rather than neutral data;
-- it reflects the different viewpoints of different people.
"Objects" are enriched by Web messages
- The relationships between real-time microblog posts and "objects" are identified
- These posts are assigned to "objects" as tags
- Search and data mining tasks on objects are supported
-- users can search with natural language queries
-- the relevant microblog posts are retrieved
Those of ordinary skill in the art will appreciate that the present invention may be embodied as a system, a method, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware parts, generally referred to herein as a "circuit", "module", or "system". The invention may also take the form of a computer program product embodied in any tangible medium of expression, the medium containing computer-usable program code.
Any combination of one or more computer-usable or computer-readable media may be used. The computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium.
The present invention is described with reference to flow charts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flow charts and/or block diagrams, and combinations of blocks in the flow charts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the computer or other programmable data processing apparatus, create means for implementing the functions/operations specified in the blocks of the flow charts and/or block diagrams.
The flow charts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flow charts or block diagrams may represent a module, a program segment, or a portion of code, which comprises one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved.
The corresponding structures, materials, operations, and equivalents of all means or steps plus function elements in the claims are intended to include any structure, material, or operation for performing the function in combination with other elements specifically recited in the claims. The description of the present invention has been presented for purposes of illustration and description; it is not exhaustive and is not intended to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, and to enable others of ordinary skill in the art to understand that the invention can have various embodiments with various modifications suited to the particular use contemplated.

Claims (17)

1. A method for tagging original data generated by objects in the Internet of Things, comprising:
performing correlation detection on obtained Web messages to obtain Web messages relevant to various events, wherein users of the Web messages publish the Web messages on a network at any time using various devices;
obtaining the address information contained in the relevant Web messages;
determining, based on the obtained address information, objects close to the various events; and
tagging the original data generated by the determined close objects, using at least part of the content of the relevant Web messages as metadata.
2. The method according to claim 1, wherein the step of determining, based on the obtained address information, objects close to the various events comprises:
obtaining, from the relevant Web messages, address information related to the same user;
generating a fitted curve from the obtained address information by curve fitting; and
determining the proximity of the objects based on the location information of the objects in the Internet of Things and the fitted curve.
3. The method according to claim 2, wherein the proximity of each object to the event of interest is determined according to the minimum of the distances between the location information of each object and the fitted curve, or according to the minimum of the maximum distance between the location information of each object and the fitted curve, or according to the minimum of the average distance between the location information of each object and the fitted curve, or according to the minimum of the square of the maximum distance between the location information of each object and the fitted curve.
4. The method according to claim 1, further comprising:
indexing, in real time, the Web messages appearing on the network; and
retrieving, from the indexed Web messages, all Web messages related to an event of interest among the various events.
5. The method according to claim 1, wherein the publication time of the relevant Web messages and words related to the event of interest are used to generate the metadata for tagging the original data generated by the close objects.
6. The method according to claim 5, wherein a query made in natural language is responded to based on the metadata.
7. The method according to claim 2, further comprising:
ranking the objects according to the proximity of each object.
8. A system for tagging original data generated by objects in the Internet of Things, comprising:
means for performing correlation detection on obtained Web messages to obtain Web messages relevant to various events, wherein users of the Web messages publish the Web messages on a network at any time using various devices;
means for obtaining the address information contained in the relevant Web messages;
means for determining, based on the obtained address information, objects close to the various events; and
means for tagging the original data generated by the determined close objects, using at least part of the content of the relevant Web messages as metadata.
9. The system according to claim 8, wherein the means for determining, based on the obtained address information, objects close to the various events comprises:
means for obtaining, from the relevant Web messages, address information related to the same user;
means for generating a fitted curve from the obtained address information by curve fitting; and
means for determining the proximity of the objects based on the location information of the objects in the Internet of Things and the fitted curve.
10. The system according to claim 9, wherein the proximity of each object to the event of interest is determined according to the minimum of the distances between the location information of each object and the fitted curve, or according to the minimum of the maximum distance between the location information of each object and the fitted curve, or according to the minimum of the average distance between the location information of each object and the fitted curve, or according to the minimum of the square of the maximum distance between the location information of each object and the fitted curve.
11. The system according to claim 8, further comprising:
means for indexing, in real time, the Web messages appearing on the network; and
means for retrieving, from the indexed Web messages, all Web messages related to an event of interest among the various events.
12. The system according to claim 8, wherein the publication time of the relevant Web messages and words related to the event of interest are used to generate the metadata for tagging the original data generated by the close objects.
13. The system according to claim 12, wherein a query made in natural language is responded to based on the metadata.
14. The system according to claim 9, further comprising:
means for ranking the objects according to the proximity of each object.
15. A method for searching for objects in the Internet of Things, comprising:
inputting a query term using natural language; and
producing search results using the query term, based on metadata of the objects in the Internet of Things;
wherein the metadata is generated using the method according to any one of claims 1-7.
16. An apparatus for searching for objects in the Internet of Things, comprising:
means for inputting a query term using natural language; and
means for producing search results using the query term, based on metadata of the objects in the Internet of Things;
wherein the metadata is generated using the apparatus according to any one of claims 8-14.
17. A search engine for use on a network, comprising:
a module for receiving user input;
the apparatus according to any one of claims 8-14; and
a module for performing retrieval according to the user input and the information produced by the apparatus.
CN201110347155.9A 2011-10-31 2011-10-31 Method and system for tagging original data generated by objects in the Internet of Things Active CN103092880B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201110347155.9A CN103092880B (en) 2011-10-31 Method and system for tagging original data generated by objects in the Internet of Things
DE102012218966.1A DE102012218966B4 (en) 2011-10-31 2012-10-18 Method and system for identifying original data generated by things in the Internet of Things
GB1218783.7A GB2496268A (en) 2011-10-31 2012-10-19 Tagging original data generated in the internet of things
US13/661,628 US8983926B2 (en) 2011-10-31 2012-10-26 Method and system for tagging original data generated by things in the internet of things

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110347155.9A CN103092880B (en) 2011-10-31 Method and system for tagging original data generated by objects in the Internet of Things

Publications (2)

Publication Number Publication Date
CN103092880A CN103092880A (en) 2013-05-08
CN103092880B true CN103092880B (en) 2016-12-14

Family

ID=

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040078750A1 (en) * 2002-08-05 2004-04-22 Metacarta, Inc. Desktop client interaction with a geographical text search system
CN101675429A (en) * 2007-01-31 2010-03-17 名誉捍卫者公司 Identifying and changing personal information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant