CN103092880B - Method and system for tagging raw data produced by objects in the Internet of Things - Google Patents
- Publication number
- CN103092880B CN201110347155.9A
- Authority
- CN
- China
- Prior art keywords
- web message
- curve
- relevant
- address information
- matching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The present disclosure relates to a method and system for tagging raw data produced by objects in the Internet of Things. The method includes: performing relevance detection on obtained Web messages to obtain the Web messages relevant to various events; obtaining the address information contained in the relevant Web messages; determining, based on the obtained address information, the objects close to the various events; and using at least part of the content of the relevant Web messages as metadata to tag the raw data produced by the determined close objects. With the invention, natural-language metadata can be added to raw data from various objects that is otherwise hard for humans to understand, so that natural language can be used for retrieval and data mining.
Description
Technical field
The present disclosure relates to data processing technology, and in particular to a method and system for tagging raw data produced by objects in the Internet of Things.
Background art
The Internet of Things (IoT) is regarded as the most important revolution of the Internet. The so-called Internet of Things places objects such as sensor devices on all kinds of real-world things (streets, highways, buildings, water systems, household appliances and so on), connects them through the Internet, and then runs specific programs to achieve remote control or direct thing-to-thing communication. The IoT expands the scope of connected objects from electronic devices to all kinds of objects in the real world: radio-frequency identification (RFID) tags, sensors, two-dimensional codes and the like fitted to each type of object are connected to wireless networks through interfaces, enabling communication and dialogue between people and objects as well as among objects themselves. For example, in the not-too-distant future, household appliances, hospital equipment and even T-shirts may be networked and accessed over the network, just as web pages and remote servers are today. As a result, all objects in the real world can be monitored and operated over the network, and their actions can be programmed to provide convenience to people.
In the IoT, given an event, how to find the sensors that recorded the related information is a problem. For example, given the query "rear-end collision", how can the cameras that recorded the event be found? This kind of search is a very important application for the IoT. Unlike the current World Wide Web, building an IoT search engine faces the following challenges:
First, the number of objects in the real world is of exponential magnitude. The IoT will encode 50 trillion to 100 trillion objects, and each person is surrounded by 1,000 to 5,000 objects. Such a huge volume of data is more than current search engines can bear. By comparison, according to statistics, the search engine Google indexed only 1 billion web pages in 2008.
Second, the raw data obtained by the various objects in the IoT may take the form of images, video, audio, numeric data sequences, wavelets and so on. Essentially no metadata is available to describe the semantics of this raw data, and a computer cannot by itself understand the content of these data files. In other words, the raw data obtained can hardly convey human viewpoints and emotions, and humans in turn find the raw data hard to understand. Faced with abundant raw data, people find it difficult to query the relevant information in natural language or to mine the associations among the raw data.
Techniques for deep processing of raw data currently exist, but because the total number of objects such as sensors in the IoT is huge, using deep processing such as image-computation techniques to extract semantic annotations is computationally unaffordable. Moreover, even with deep processing, the flexibility of applications such as queries means that a large number of models would have to be built to handle the various applications, which is also impractical.
Fig. 1 is a schematic diagram of the problem that exists in the prior art between practical applications and the raw data produced by objects. As shown in Fig. 1, a user queries sensor data on the network in human language. However, even though a large number of raw data files exist, there is a huge gap between the user's natural-language query and the sensors' raw data files, and the raw data files have almost no metadata describing their semantics, so the user cannot obtain the desired query results. How to connect natural-language queries with raw data, so as to support data search and mining and the mining of data associations, is therefore a technical problem in the prior art.
Therefore, the prior art needs a technique for tagging the raw data produced by objects in the IoT so that the data can be processed further.
Summary of the invention
The present disclosure is proposed to solve at least one of the above problems in the prior art. An embodiment according to one aspect of the disclosure provides a technical solution that uses Web messages to tag raw data, so that the raw data carries metadata describing its semantics, which helps in understanding the content of the raw data.
The inventors have observed that Web messages such as blogs and microblogs are widely used. A "Web message" as referred to herein is content transmitted over a network that has popularity and relevance. So-called "popularity" means that the content of Web messages is wide-ranging, covering all kinds of events occurring in the real world as well as people's thoughts, and that users can post Web messages on the network at any time using various devices such as mobile or fixed terminals. A Web message may include text, documents, icons, photos, audio, video and so on. So-called "relevance" means that the content of a Web message is related to an event of interest: for example, if the difference between the posting time of the Web message and the time at which the event of interest occurred is within a predetermined range and both concern a similar event, the Web message is considered relevant to the event of interest. In addition, for the purposes of the invention, a Web message is one that carries the address information of the user at the time the message was sent.
A microblog is a typical example of a Web message. A microblog is a form of blog that lets users promptly update and publicly post short texts (usually fewer than 140 characters). Microblog services include, for example, Twitter, Yahoo, Sina, Sohu and 163.
Microblogging is flourishing and has attracted a large number of users. According to statistics from April 2010, Twitter, a representative microblog site, had more than 100 million registered users and gained more than 300,000 new users every day. On average more than 55 million Twitter microblogs were posted per day, covering every kind of content. More than 37% of all these Twitter microblogs were posted from mobile devices, and for most of them the actual posting location can be obtained.
Because Web messages are widely used (in other words, they have relevance and popularity) and are location-aware, the inventors conceived of using Web messages to enrich the semantics of sensor data. Specifically, the invention identifies the relation between Web messages and sensors and then assigns at least part of the content of the relevant Web messages as tags that annotate the semantics of the sensor data. This bridges the gap between human understanding and the raw data obtained by objects, and thereby solves the problem in the prior art. Furthermore, these semantic tags can be used to support search and data mining tasks on sensor data, as well as other applications of the raw data.
Embodiments of the disclosure can be implemented in various ways, including as a method or a system. Several embodiments of the disclosure are discussed below.
As a method for tagging raw data produced by objects in the IoT, an embodiment of the disclosure at least includes: performing relevance detection on obtained Web messages to obtain the Web messages relevant to various events; obtaining the address information contained in the relevant Web messages; determining, based on the obtained address information, the objects close to the various events; and using at least part of the content of the relevant Web messages as metadata to tag the raw data produced by the determined close objects.
As a system for tagging raw data produced by objects in the IoT, an embodiment of the disclosure at least includes: means for performing relevance detection on obtained Web messages to obtain the Web messages relevant to various events; means for obtaining the address information contained in the relevant Web messages; means for determining, based on the obtained address information, the objects close to the various events; and means for using at least part of the content of the relevant Web messages as metadata to tag the raw data produced by the determined close objects.
As a method for searching for objects in the IoT, an embodiment of the disclosure at least includes: inputting a query term in natural language; and using the query term to produce search results based on the metadata of the objects in the IoT, wherein the metadata is produced using the above method.
As a device for searching for objects in the IoT, an embodiment of the disclosure at least includes: means for inputting a query term in natural language; and means for using the query term to produce search results based on the metadata of the objects in the IoT, wherein the metadata is produced using the above system.
As a search engine for use on a network, an embodiment of the disclosure at least includes: a module for receiving user input; the above system; and a module for performing retrieval according to the user input and the information produced by said system.
Brief description of the drawings
The drawings referenced in this description only illustrate typical embodiments of the invention and should not be considered as limiting the scope of the invention.
Fig. 1 is a schematic diagram of the problem that exists in the prior art between practical applications and the raw data produced by objects.
Fig. 2 is a flow chart of a method for tagging raw data produced by objects in the IoT according to an embodiment of the disclosure.
Fig. 3 is a schematic diagram of the curves obtained by curve fitting on the address information of the Web messages sent by each user, according to an embodiment of the invention.
Fig. 4 is a block diagram of a system for tagging raw data produced by objects in the IoT according to an embodiment of the disclosure.
Fig. 5 is a flow chart of an example search process implemented according to an embodiment of the invention.
Fig. 6 is a block diagram of a search engine implemented according to an embodiment of the invention.
Detailed description of the invention
In the following discussion, numerous specific details are provided to help in thoroughly understanding the invention. However, it will be apparent to those skilled in the art that the invention can be understood even without these details. It should also be appreciated that any specific terms used below are only for convenience of description; the invention should therefore not be limited to use only in any specific application represented and/or implied by such terms.
According to an embodiment of the disclosure, at least one problem in the prior art is solved by identifying the relation between Web messages and objects in the IoT and then assigning at least part of the content of the relevant Web messages as tags that annotate the semantics of the raw data produced by the corresponding objects. Furthermore, these semantic tags can be used to support search and data mining tasks on sensor data and other applications of the raw data, for example querying the raw data in natural language.
It should be noted that the term "object" herein refers to any device, apparatus, equipment or system that can produce data and transmit the produced data to other objects. For example, an object can be a sensing device such as an RFID tag, a reader, a two-dimensional code, a camera or a sensor. An object can also be a stand-alone device equipped with an RFID tag, reader, two-dimensional code, camera, sensor or the like, such as a notebook computer with an RFID tag, a refrigerator with a temperature sensor, or a T-shirt with a two-dimensional code.
Fig. 2 shows a process 200 for tagging raw data produced by objects in the IoT according to an embodiment of the disclosure.
In step 202, process 200 begins.
In step 204, relevance detection is performed on the received Web messages to obtain the Web messages relevant to the event of interest. Step 204 can be implemented by more than one filtering step. According to an embodiment of the disclosure, it can include two filtering steps:
(1) Content-based filtering:
Step 204 can include a content filtering step that keeps all Web messages whose content is relevant and discards the others. Because objects are tagged with information related to the events they record, content-based filtering can find the entries with matching content among a large number of Web messages according to preset options (for example, a list of the most common user query options, a list of hot events, a list of traffic events, a list of the most frequently used keywords, and so on). This can be implemented with an inverted-list (inverted index) technique based on keyword matching.
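As a rough illustration of the content filtering step, the following minimal sketch keeps only the messages that mention a preset keyword. It uses a plain substring scan rather than the inverted list mentioned above, and the keyword list, the sample messages and the function name `content_filter` are assumptions made for this example, not part of the patent.

```python
from typing import Iterable

# Hypothetical preset option list; the source only says such lists exist.
PRESET_KEYWORDS = frozenset({"rear-end collision", "traffic jam", "fire"})


def content_filter(messages: Iterable[str],
                   keywords: frozenset = PRESET_KEYWORDS) -> list:
    """Keep only the messages whose text mentions at least one keyword."""
    kept = []
    for text in messages:
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            kept.append(text)
    return kept


messages = [
    "Saw a four-car rear-end collision, so awful!",
    "Lovely weather this morning.",
    "Traffic jam near the station again.",
]
relevant = content_filter(messages)
```

A linear scan like this is only practical for small batches; an index is the natural choice once the message volume grows.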
(2) Time-based filtering:
Step 204 can include a time filtering step that keeps all Web messages that are relevant in time and discards the others. Time-based filtering can include the following two steps:
2.1 Filtering based on posting time: only the Web messages whose posting time is related to the time at which the event of interest occurred are retained. The time filtering step selects, from the received Web messages, those whose posting time is within a predetermined time range of the occurrence time of the event of interest, and discards the Web messages that are not relevant in time. For example, if the event of interest occurred at about 8:00 in the morning, the time filtering step retains only the Web messages posted between 7:30 and 8:30 on the same day.
A time range is needed because the user posting the Web message may be moving, so there may be a time difference between seeing the event and actually posting the message; the user may also post the related Web message only some time after seeing the event; or a delay may be caused by network congestion, an unstable wireless network and the like. The predetermined time can be a default value, or it can be set by the user or the system.
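The posting-time filter can be sketched as a simple window test. The sketch below assumes a symmetric 30-minute window, matching the 7:30 to 8:30 example; the calendar date, the message identifiers and the function name `within_window` are illustrative assumptions.

```python
from datetime import datetime, timedelta


def within_window(posted_at: datetime, event_time: datetime,
                  window: timedelta = timedelta(minutes=30)) -> bool:
    """True if the message was posted within +/- window of the event time."""
    return abs(posted_at - event_time) <= window


# The event of interest occurred at about 8:00 (the date is arbitrary).
event_time = datetime(2011, 9, 23, 8, 0)
posts = {
    "a": datetime(2011, 9, 23, 7, 45),   # inside the window
    "b": datetime(2011, 9, 23, 8, 30),   # exactly on the boundary, kept
    "c": datetime(2011, 9, 23, 9, 10),   # too late, discarded
}
kept = [msg_id for msg_id, t in posts.items() if within_window(t, event_time)]
```

Making the window a parameter reflects the statement that the range may be a default or may be set by the user or the system.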
2.2 Immediacy filtering: on top of the posting-time filter, an immediacy filter is applied so that only the Web messages posted within the specified time range that describe the current situation are retained. For example, a Web message posted after 8:00 in the morning may include content such as "the XX that happened yesterday". Such content is clearly not freshly posted information but outdated information, and should be filtered out. A message such as "XX just happened" is immediate information and should be retained.
The immediacy filtering step can be implemented by combining existing word segmentation and classification techniques. According to one aspect of the invention, a content filtering engine combining existing word segmentation and classification is proposed. For example, 2,000 Web messages can first be selected and manually classified as present, past, future or other. Each sentence in each Web message is first segmented into words. For example, suppose a Web message consists of the single sentence "I am watching a film now." After word segmentation it becomes "I / now / am / watching / film".
With each segmented word as a feature, a classifier is built using a machine learning algorithm, such as the SVM (Support Vector Machine) algorithm or the ME (Maximum Entropy) algorithm. A Web message that has not been manually labeled can then be recognized automatically by this classifier and labeled as one of present, past, future or other. Only the Web messages labeled as present are retained; all the rest are deleted. It is worth noting, however, that the invention is not limited to the above procedure; those skilled in the art can use other word segmentation and classification techniques according to their needs.
Although the filtering steps are illustrated and described above in a particular order, those skilled in the art will appreciate that the invention is not limited to this order; content-based filtering and time-based filtering can be performed in any order as required.
In step 206, address information detection is performed on the Web messages to obtain the address information of those Web messages that contain it, and the Web messages that contain no address information are discarded.
It should be noted that not every Web message includes address information; the user posting a Web message can choose whether to disclose his or her current address. If the user chooses to disclose it, the posted Web message includes address information; otherwise it does not.
Address information usually takes the form of GPS address data, but when third-party services are used, the address information of a Web message may also be a textual description, such as "the intersection of XX Street and YY Street". The address information at the time the user posted the message can be obtained through the API provided by the Web browser. If the obtained address information is a textual description, then according to an embodiment of the disclosure the description needs to be converted into GPS address data. This conversion can use conversion tools from the prior art and is not described further here.
According to another embodiment of the invention, the address information can be extracted from the content of the Web message and then converted into GPS address data. For example, a Web message might read "Right now there is a traffic jam at the intersection of Chongwenmen Street and Chang'an Street; vehicles are moving slowly." The address information "the intersection of Chongwenmen Street and Chang'an Street" can be extracted from this message. Combined with existing map information, this address information can then be converted into GPS address data.
In step 208, the objects in the IoT that are close to the event of interest are detected based on the address information of the obtained Web messages.
As those skilled in the art will appreciate, the position (for example, the GPS address data) of every object in the IoT is known. The objects relevant to the event of interest can be determined from the address information of the Web messages and the known position information of the objects. For example, the object with the smallest straight-line distance to the message address is determined to be the close object.
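The nearest-object rule can be sketched as follows. This example assumes positions are (latitude, longitude) pairs and uses the haversine great-circle distance as a stand-in for the straight-line distance; the camera names and coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_object(message_pos, objects):
    """Return the name of the object closest to the message position."""
    return min(objects, key=lambda name: haversine_km(message_pos, objects[name]))


# Hypothetical object positions (the coordinates are illustrative only).
objects = {
    "camera_a": (39.940, 116.370),
    "camera_b": (39.950, 116.400),
}
closest = nearest_object((39.941, 116.371), objects)
```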
However, as described above, the user may be moving, or may send the Web message only some time after seeing the event, by which point his or her position has changed. The user's location when sending the Web message may therefore differ from the position of the object that may have recorded the event of interest. Relying only on the address information of one or a few Web messages and the known positions of the objects, it may thus be difficult to determine the objects with high proximity to the event of interest.
According to an embodiment of the disclosure, it is proposed to use existing curve fitting techniques to determine, from the huge number of objects in the IoT, the objects with high proximity to the event of interest.
According to an embodiment of the disclosure, the proximity detection step can include the following operations:
First step: from the obtained Web messages, extract the address information of the Web messages posted by the same user. For example, 100 users may have posted related messages; for each of them, the address information of the Web messages that user posted in the most recent 6 hours is extracted.
Second step: for each user, perform curve fitting on the address information of the Web messages that user posted, to obtain the user's position curve.
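The first two steps can be sketched as grouping addresses by user and fitting one curve per user. The source does not specify the fitting technique, so this sketch assumes the simplest case, a least-squares straight line over planar (x, y) coordinates; the function names and sample points are hypothetical.

```python
from collections import defaultdict


def fit_line(points):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x values are equal; y = a*x + b cannot fit this track")
    a = sxy / sxx
    return a, mean_y - a * mean_x


def fit_user_curves(messages):
    """Group (user, x, y) records by user and fit one line per user."""
    by_user = defaultdict(list)
    for user, x, y in messages:
        by_user[user].append((x, y))
    # A user with a single address gives no curve and is skipped.
    return {user: fit_line(pts) for user, pts in by_user.items() if len(pts) >= 2}


messages = [
    ("A", 0.0, 1.0), ("A", 1.0, 3.0), ("A", 2.0, 5.0),
    ("B", 0.0, 0.0), ("B", 2.0, 2.0),
    ("C", 1.0, 1.0),
]
curves = fit_user_curves(messages)
```

A real track would more likely call for a higher-order or piecewise fit; the straight line is only the smallest instance of the idea.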
Fig. 3 is a schematic diagram of the curves obtained by curve fitting on the address information of the Web messages sent by each user, according to an embodiment of the invention. As shown in Fig. 3, each open circle represents the address information of one Web message, and each curve is fitted from the address information of the Web messages from one user. The filled circle in Fig. 3 represents an object in the IoT. Although Fig. 3 shows only one object, the invention is not limited to this; as noted above, the number of objects can be much larger, and those skilled in the art can select it as required.
Third step: determine the close objects based on the distance relation between the position data of the objects and each curve.
The following formula can be used to determine the distance relation between the position data of the objects and each curve. Denote the objects as $x_1, x_2, \ldots, x_M$ and the curves as $D_1, D_2, \ldots, D_N$:
$$\arg\min_{i} \max_{j} \mathrm{distance}(x_i, D_j)$$
where $\mathrm{distance}(x_i, D_j)$ is the shortest distance from the $i$-th object to the $j$-th fitted curve; $i$ indexes the objects and is an integer from 1 to $M$, where $M$ is the total number of candidate close objects selected by the user as required; $j$ indexes the fitted curves and is an integer from 1 to $N$, where $N$ is the total number of curves obtained by curve fitting; $\max$ is the function that takes the maximum, and $\min$ is the function that takes the minimum.
Using the above formula, the largest of an object's distances to the individual curves is taken as that object's characteristic distance, and the object with the smallest characteristic distance among all objects is chosen as the object closest to the event of interest. Furthermore, the objects can be sorted by characteristic distance in ascending order to express how close each object is to the event of interest.
For example, again taking Fig. 3, suppose the curve fitting yields two curves, 1 and 2, from the address information of user A, and one curve, 3, from the address information of user B. Assume there are multiple objects, and that for each object the largest of its distances to the three curves is 5, 3, 5, 6, 9, 8, ... respectively. The object whose largest distance is the minimum value, 3, is then chosen as the closest object, as shown in Fig. 3.
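The min-max selection can be sketched as below. The sketch assumes each fitted curve is approximated by a polyline in planar coordinates and measures the shortest distance from a point to that polyline; the curve shapes, object names and positions are invented for illustration and do not reproduce Fig. 3.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_curve(p, curve):
    """Shortest distance from p to a polyline given as a list of points."""
    return min(point_segment_distance(p, a, b) for a, b in zip(curve, curve[1:]))


def rank_objects(objects, curves):
    """Sort objects by characteristic distance (max over curves), ascending."""
    feature = {name: max(distance_to_curve(pos, c) for c in curves)
               for name, pos in objects.items()}
    return sorted(feature.items(), key=lambda item: item[1])


curves = [
    [(0, 0), (10, 0)],    # curve 1
    [(0, 4), (10, 4)],    # curve 2
    [(5, -5), (5, 10)],   # curve 3
]
objects = {"x1": (5, 2), "x2": (0, 9), "x3": (9, 0)}
ranking = rank_objects(objects, curves)
closest = ranking[0][0]
```

Here `x1` has characteristic distance 2, `x3` has 4 and `x2` has 9, so `x1` is selected and the full ranking expresses the relative proximity of the objects.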
The greatest benefit of this method is that the formula $\arg\min_{i} \max_{j} \mathrm{distance}(x_i, D_j)$ is simple and standardized in the prior art, and tools that implement it are readily available.
Of course, the invention is not limited to this, and those skilled in the art can use other distance formulas according to their needs. For example, the minimum average distance can be used: the mean of an object's distances to the individual curves is taken as its characteristic distance, and the object with the smallest characteristic distance is chosen as the closest object. The minimum squared maximum distance can also be used: the square of the largest of an object's distances to the individual curves is taken as its characteristic distance, and the object with the smallest characteristic distance is chosen as the closest object.
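The alternative rules differ only in how an object's per-curve distances are aggregated into a characteristic distance. The sketch below assumes the distances have already been computed; the distance values and the rule names are hypothetical.

```python
from statistics import mean

# Hypothetical distances from each object to curves 1..3.
distances = {
    "x1": [5.0, 1.0, 1.0],
    "x2": [3.0, 3.0, 3.0],
    "x3": [4.0, 2.0, 0.0],
}

# Each rule maps an object's list of distances to one characteristic distance.
AGGREGATORS = {
    "max": max,
    "mean": mean,
    "max_squared": lambda ds: max(ds) ** 2,
}


def closest(distances, rule):
    """Return the object with the smallest characteristic distance."""
    aggregate = AGGREGATORS[rule]
    return min(distances, key=lambda name: aggregate(distances[name]))


by_max = closest(distances, "max")
by_mean = closest(distances, "mean")
by_max_squared = closest(distances, "max_squared")
```

With these values the maximum rule picks `x2` while the mean rule picks `x3`, so the choice of formula can change the result. Squaring is monotonic for non-negative distances, so the squared-maximum rule selects the same object as the maximum rule and differs only in how the characteristic distances are scaled.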
In step 210, at least part of the Web message is used to tag the raw data of the determined close object.
For example, at 7:56 on September 23, 2011, a user posts the Web message "Saw a four-car rear-end collision, so awful!", and the closest camera is the camera at the west entrance of Xinjiekou. The keyword "rear-end collision" from the Web message and the time "2011/9/23 7:56" can then be used as metadata to tag the raw data file vsd.vso obtained by the camera at the west entrance of Xinjiekou.
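The tagging step can be sketched as attaching a small metadata record to the raw data file. The `RawDataFile` class, its fields and the `tag` function are assumptions made for this illustration; the source does not prescribe a metadata format.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class RawDataFile:
    """A raw data file together with the tags attached to it."""
    name: str
    source_object: str
    metadata: list = field(default_factory=list)


def tag(raw: RawDataFile, keyword: str, posted_at: datetime) -> RawDataFile:
    """Attach part of a Web message (keyword and posting time) as metadata."""
    raw.metadata.append({"keyword": keyword, "time": posted_at})
    return raw


video = RawDataFile("vsd.vso", "camera at the west entrance of Xinjiekou")
tag(video, "rear-end collision", datetime(2011, 9, 23, 7, 56))
```

Keeping the metadata as a list lets several Web messages tag the same file.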
Furthermore, the close cameras can be ranked, for example by generating a Web page with the following content:
Rear-end collision   2011/9/23 7:56   West entrance of Xinjiekou                  vsd.vso
                                      East entrance of Xinjiekou West Street      vsf.vso
                                      West entrance of Xinjiekou West Street      vsg.vso
The user can click to view the corresponding video file. Data retrieval can also be performed with natural-language terms such as "rear-end collision" or "September 23, 2011".
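Retrieval over the tagged files can be sketched as a match between the query text and the tags. The record layout, the untagged file `abc.vso` and the `search` function are hypothetical, and the substring match stands in for whatever retrieval technique a real search engine would use.

```python
# Tagged records following the example listing; "abc.vso" is an invented
# untagged file added to show that unrelated data is not returned.
records = [
    {"file": "vsd.vso", "tags": ["rear-end collision", "2011/9/23 7:56"]},
    {"file": "vsf.vso", "tags": ["rear-end collision", "2011/9/23 7:56"]},
    {"file": "vsg.vso", "tags": ["rear-end collision", "2011/9/23 7:56"]},
    {"file": "abc.vso", "tags": []},
]


def search(query, records):
    """Return the files having a tag that contains the query text."""
    needle = query.lower()
    return [r["file"] for r in records
            if any(needle in t.lower() for t in r["tags"])]


hits = search("rear-end collision", records)
by_date = search("2011/9/23", records)
```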
In step 212, process 200 ends.
As noted above, the number of Web messages is of exponential magnitude. If every execution of process 200 started processing all the Web messages on the network at step 204, the time and computational cost required would be large.
According to an embodiment of the invention, a preprocessing step can be included between step 202 and step 204. The preprocessing step can use existing indexing techniques to index in real time all the Web messages posted on the network, so that in step 204 the Web messages relevant to the event of interest are retrieved through the index.
For example, word segmentation can be applied to each Web message in real time, and a pre-built keyword library used to determine whether at least one keyword occurs in the Web message. A link is then established between each Web message in which a keyword occurs and that keyword, and the link is indexed in the keyword library.
Again taking the Web message "car rear-end collision" as an example, word segmentation yields "car / rear-end collision /". Using "car" and "rear-end collision" as index terms, an inverted list is built, so that searching for "car" or "rear-end collision" returns this message.
These links are then used in step 204 to quickly extract the Web messages relevant to a keyword for further processing.
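The preprocessing index can be sketched as an inverted list that maps each keyword to the identifiers of the messages containing it. The keyword library, the message identifiers and the sample texts are assumptions for this example, and substring matching stands in for the word segmentation described above.

```python
from collections import defaultdict

# Hypothetical pre-built keyword library.
KEYWORDS = ("car", "rear-end collision")


def build_index(messages):
    """Map each keyword to the set of ids of the messages that contain it."""
    index = defaultdict(set)
    for msg_id, text in messages.items():
        lowered = text.lower()
        for keyword in KEYWORDS:
            if keyword in lowered:
                index[keyword].add(msg_id)
    return index


messages = {
    1: "Car rear-end collision on the ring road",
    2: "Nice film tonight",
    3: "My car broke down",
}
index = build_index(messages)
collision_ids = index["rear-end collision"]
car_ids = index["car"]
```

Looking up a keyword is then a dictionary access instead of a scan over every message, which is the point of indexing before step 204. Substring matching would also match words such as "carpet", which is one reason the text relies on proper word segmentation.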
In addition, although Fig. 2 uses arrows to indicate the order of the steps, the invention is not limited to this order, and the steps in Fig. 2 can be performed in other orders. For example, the order of steps 204 and 206 can be reversed.
Fig. 4 is a block diagram of a system 400 for tagging raw data produced by objects in the IoT according to an embodiment of the disclosure.
The system 400 according to an embodiment of the invention includes a Web message search engine 401, a relevance detector 407, an address information detector 409, a proximity detector 411 and a marker 413. The relevance detector 407 includes a content filter 403 and a time filter 405.
The Web message search engine 401 is optional and is not essential to implementing the invention. The Web message search engine 401 indexes in real time all the Web messages posted on the network.
The correlation detector 407 detects the Web messages relevant to various events. The content filter 403 selects the Web messages whose content is relevant to the events. The time filter 405 selects the Web messages whose publication time lies within a predetermined range of the time at which the event occurred, and performs timeliness filtering so as to obtain the Web messages that were published within the specified time range and describe the current situation. Other Web messages are discarded.
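The two filters inside the correlation detector can be sketched as below. This is only an illustrative sketch: the `WebMessage` class, the substring-based keyword match, and the one-hour window are assumptions made for the example and are not specified by the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class WebMessage:
    text: str
    published: datetime


def content_filter(messages, keywords):
    """Keep messages whose text mentions at least one event keyword."""
    return [m for m in messages
            if any(k in m.text.lower() for k in keywords)]


def time_filter(messages, event_time, window):
    """Keep messages published within `window` of the event time."""
    return [m for m in messages
            if abs(m.published - event_time) <= window]


event_time = datetime(2011, 10, 31, 9, 0)
msgs = [
    WebMessage("Rear-end collision near the bridge", datetime(2011, 10, 31, 9, 10)),
    WebMessage("Rear-end collision last week", datetime(2011, 10, 24, 9, 0)),
    WebMessage("Lunch was great", datetime(2011, 10, 31, 9, 5)),
]
relevant = time_filter(content_filter(msgs, ["rear-end"]),
                       event_time, timedelta(hours=1))
# Only the first message passes both filters; the others are discarded.
```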
The address information detector 409 receives the relevant Web messages from the correlation detector 407 and extracts the address information in these Web messages. The address information may be extracted from a Web message using an API, or filtered out of the content of the Web message. The address information may be in GPS data format or in text format. The address information detector 409 may include a converter (not shown) for converting the format of the address information, for example from text format to GPS data format.
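A converter of this kind could look like the sketch below. It assumes a hypothetical lookup table `GAZETTEER` from place names to coordinates; the place names and coordinates are invented for illustration, and a real system would more likely call a geocoding service.

```python
# Hypothetical gazetteer: place name -> (latitude, longitude).
# The entries are made up purely for illustration.
GAZETTEER = {
    "example bridge": (10.0, 20.0),
    "example junction": (10.5, 20.5),
}


def to_gps(address):
    """Return (lat, lon) for an address given as a GPS tuple or as text.

    Tuples are passed through unchanged; text is normalised and looked up
    in the gazetteer. Unknown text yields None.
    """
    if isinstance(address, tuple):
        return address
    return GAZETTEER.get(address.strip().lower())


print(to_gps("  Example Bridge "))
print(to_gps((1.0, 2.0)))
print(to_gps("unknown place"))
```

Returning `None` for unknown text lets the caller decide whether to drop the message or fall back to another source of address information.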
The proximity detector 411 determines, based on the address information from the address information detector 409, the objects closest to the event that occurred. Specific embodiments have been described in detail above and are not repeated here.
The marker 413 tags the original data from the determined closest objects based on the corresponding Web messages.
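The interplay of the proximity detector and the marker can be sketched as follows. Note the simplification: this sketch uses a plain great-circle (haversine) distance from a single event position to each object, not the curve-fitting variant recited in the claims. The object names, coordinates, and the `tags` dictionary are hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def closest_object(event_pos, objects):
    """Return the ID of the object whose stored position is nearest."""
    return min(objects, key=lambda oid: haversine_km(event_pos, objects[oid]))


def tag(tags, object_id, message_text):
    """Attach (part of) a Web message to an object as metadata."""
    tags.setdefault(object_id, []).append(message_text)


# Hypothetical objects with known, pre-stored positions.
objects = {"camera_A": (10.000, 20.000), "camera_B": (10.500, 20.500)}
tags = {}
nearest = closest_object((10.001, 20.001), objects)
tag(tags, nearest, "rear-end collision near the bridge")
```

After this runs, the data stream of the nearest camera carries a human-readable description of what it most likely recorded.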
According to one embodiment of the invention, the tagging result may be published as a web page, document, text, or the like for further processing. For example, a search engine may search using the tagging result, so as to quickly provide relevant query results to users who query in natural language.
Fig. 5 shows a flowchart of an example search process implemented according to one embodiment of the invention. Fig. 5 illustrates an application of the invention to querying.
As shown in Fig. 5, a user may use "rear-end collision" to query rear-end collision events that have occurred. The content filter 403 finds the web pages linked to the keyword "rear-end collision" and provides those web pages whose content is relevant to the user's query condition. The time filter 405 filters out all Web messages that are not within the required time range and processes the remaining Web messages. The time filter 405 also performs timeliness filtering based on the content of the Web messages, to filter out Web messages that are irrelevant to the current situation. For example, the user wants today's rear-end collision events; therefore Web messages containing "yesterday .... rear-end collision" or "long ago .... rear-end collision" are not of interest and are removed.
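The content-based timeliness filtering in this example can be sketched as below. The marker list `STALE_MARKERS` is taken from the two phrases in the example and is otherwise a hypothetical, deliberately naive heuristic; a real system would need proper temporal expression analysis.

```python
# Phrases that, in the example, indicate a message is about a past event.
# The list is illustrative and far from complete.
STALE_MARKERS = ("yesterday", "long ago")


def is_current(text):
    """Return True if the text contains none of the stale-time markers."""
    lowered = text.lower()
    return not any(marker in lowered for marker in STALE_MARKERS)


messages = [
    "Rear-end collision at the junction right now",
    "Yesterday I saw a rear-end collision",
    "Long ago there was a rear-end collision here",
]
current = [m for m in messages if is_current(m)]
# Only the first message describes the current situation.
```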
The address information detector 409 obtains the address information from the remaining Web messages. As described above, the position information of the objects in the IoT is known and is pre-stored in a database. The proximity detector 411 detects the objects related to the event of interest. The marker 413 uses at least part of the Web messages to tag each object, so as to indicate the semantics of the original data obtained by each object. By using these tags, a natural-language query can be associated with the original data, thereby providing the user with, for example: "Query result returned: a camera that is monitoring or has monitored the 'rear-end collision'; the user can connect to this camera and browse its data".
Of course, the user may also mine the associations among the original data based on the tags. For example, all cameras related to a given rear-end collision can be found, so as to obtain data on how that collision unfolded.
Fig. 6 shows a block diagram of a search engine implemented according to one embodiment of the invention. Fig. 6 illustrates a specific example of an implementation of the invention.
As shown in Fig. 6, the search engine includes the system 400 described with reference to Fig. 4. In addition, the search engine includes a module 601 for receiving user input and a module 602 for retrieving according to the user input and the information produced by the system 400. The retrieval result obtained is then returned to the querying user.
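The two modules of the search engine can be sketched as follows. This is a hedged illustration only: the function names `receive_input` and `retrieve`, the term-counting score, and the sample `tags` (standing in for the information produced by system 400) are assumptions made for the example.

```python
def receive_input(raw_query):
    """Module 601 (sketch): turn the user's query into lowercase terms."""
    return raw_query.lower().split()


def retrieve(terms, tags):
    """Module 602 (sketch): rank objects by how many query terms their tags contain."""
    scores = {}
    for object_id, texts in tags.items():
        joined = " ".join(texts).lower()
        score = sum(1 for term in terms if term in joined)
        if score:
            scores[object_id] = score
    # Highest score first; ties broken alphabetically by object ID.
    return sorted(scores, key=lambda oid: (-scores[oid], oid))


# Hypothetical tags as produced by the tagging system.
tags = {
    "camera_A": ["rear-end collision near the bridge"],
    "sensor_B": ["heavy rain downtown"],
}
results = retrieve(receive_input("Rear-end collision"), tags)
```

The user never queries raw sensor values directly; the query is matched against the text attached to each object.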
The basic idea of the invention has been described above. Those skilled in the art will appreciate that the invention provides one or more of the following advantages:
Web messages and the IoT can be combined to provide an intelligible IoT.
- Assign Web messages to the relevant "objects"
- Use observations of the objects as metadata
-- they are in natural language rather than quantitative data, images, video, etc.;
-- they convey emotional viewpoints rather than neutral data;
-- they reflect the different viewpoints of different people.
"Objects" are enriched through Web messages.
- Identify the relations between instant microblog notices and "objects"
- Assign these notices to the "objects" as tags
- Support search and data mining tasks over objects
-- users can search with natural-language queries
-- relevant microblog notices are retrieved
Those of ordinary skill in the art will appreciate that the invention may be embodied as a system, a method, or a computer program product. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware parts, generally referred to herein as a "circuit", "module", or "system". The invention may also take the form of a computer program product embodied in any tangible medium of expression, the medium containing computer-usable program code.
Any combination of one or more computer-usable or computer-readable media may be used. The computer-usable or computer-readable medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium.
The invention is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the computer or other programmable data processing apparatus, create means for implementing the functions/operations specified in the blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the invention. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims are intended to include any structure, material, or act for performing the function in combination with other elements specifically recited in the claims. The description of the invention is given for purposes of illustration and description; it is not exhaustive, nor is it intended to limit the invention to the form stated. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, and to enable those of ordinary skill in the art to understand that the invention may have various embodiments with various modifications suited to the particular use contemplated.
Claims (17)
1. A method for tagging original data generated by objects in the Internet of Things, comprising:
performing correlation detection on obtained Web messages to obtain Web messages relevant to various events, the Web messages being published on the network at any time by users of the Web messages using various devices;
obtaining address information contained in the relevant Web messages;
determining objects close to the various events based on the obtained address information; and
tagging the original data generated by the determined close objects, using at least part of the content of the relevant Web messages as metadata.
2. The method according to claim 1, wherein the step of determining objects close to the various events based on the obtained address information comprises:
obtaining, from the relevant Web messages, address information related to the same user;
generating a fitted curve by applying curve fitting to the obtained address information; and
determining the proximity of the objects based on the position information of the objects in the Internet of Things and the fitted curve.
3. The method according to claim 2, wherein the proximity of each object to the event of interest is determined according to the minimum of the distances between the position information of each object and the fitted curve, or according to the minimum of the maximum distances between the position information of each object and the fitted curve, or according to the minimum of the average distances between the position information of each object and the fitted curve, or according to the minimum of the squares of the maximum distances between the position information of each object and the fitted curve.
4. The method according to claim 1, further comprising:
indexing, in real time, the Web messages appearing on the network; and
retrieving, from the indexed Web messages, all Web messages related to an event of interest among the various events.
5. The method according to claim 1, wherein the publication time of the relevant Web messages and words related to the event of interest are used to generate the metadata for tagging the original data generated by the close objects.
6. The method according to claim 5, wherein queries made in natural language are responded to based on the metadata.
7. The method according to claim 2, further comprising:
sorting the objects according to the proximity of each object.
8. A system for tagging original data generated by objects in the Internet of Things, comprising:
means for performing correlation detection on obtained Web messages to obtain Web messages relevant to various events, the Web messages being published on the network at any time by users of the Web messages using various devices;
means for obtaining address information contained in the relevant Web messages;
means for determining objects close to the various events based on the obtained address information; and
means for tagging the original data generated by the determined close objects, using at least part of the content of the relevant Web messages as metadata.
9. The system according to claim 8, wherein the means for determining objects close to the various events based on the obtained address information comprises:
means for obtaining, from the relevant Web messages, address information related to the same user;
means for generating a fitted curve by applying curve fitting to the obtained address information; and
means for determining the proximity of the objects based on the position information of the objects in the Internet of Things and the fitted curve.
10. The system according to claim 9, wherein the proximity of each object to the event of interest is determined according to the minimum of the distances between the position information of each object and the fitted curve, or according to the minimum of the maximum distances between the position information of each object and the fitted curve, or according to the minimum of the average distances between the position information of each object and the fitted curve, or according to the minimum of the squares of the maximum distances between the position information of each object and the fitted curve.
11. The system according to claim 8, further comprising:
means for indexing, in real time, the Web messages appearing on the network; and
means for retrieving, from the indexed Web messages, all Web messages related to an event of interest among the various events.
12. The system according to claim 8, wherein the publication time of the relevant Web messages and words related to the event of interest are used to generate the metadata for tagging the original data generated by the close objects.
13. The system according to claim 12, wherein queries made in natural language are responded to based on the metadata.
14. The system according to claim 9, further comprising:
means for sorting the objects according to the proximity of each object.
15. A method for searching for objects in the Internet of Things, comprising:
inputting a query term in natural language; and
producing search results using the query term, based on metadata of the objects in the Internet of Things;
wherein the metadata is generated using the method according to any one of claims 1-7.
16. An apparatus for searching for objects in the Internet of Things, comprising:
means for inputting a query term in natural language; and
means for producing search results using the query term, based on metadata of the objects in the Internet of Things;
wherein the metadata is generated using the apparatus according to any one of claims 8-14.
17. A search engine for use on a network, comprising:
a module for receiving user input;
the apparatus according to any one of claims 8-14; and
a module for retrieving according to the user input and the information produced by the apparatus.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110347155.9A CN103092880B (en) | 2011-10-31 | Method and system for tagging original data generated by objects in the Internet of Things | |
DE102012218966.1A DE102012218966B4 (en) | 2011-10-31 | 2012-10-18 | Method and system for identifying original data generated by things in the Internet of Things |
GB1218783.7A GB2496268A (en) | 2011-10-31 | 2012-10-19 | Tagging original data generated in the internet of things |
US13/661,628 US8983926B2 (en) | 2011-10-31 | 2012-10-26 | Method and system for tagging original data generated by things in the internet of things |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110347155.9A CN103092880B (en) | 2011-10-31 | Method and system for tagging original data generated by objects in the Internet of Things |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103092880A CN103092880A (en) | 2013-05-08 |
CN103092880B true CN103092880B (en) | 2016-12-14 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040078750A1 (en) * | 2002-08-05 | 2004-04-22 | Metacarta, Inc. | Desktop client interaction with a geographical text search system |
CN101675429A (en) * | 2007-01-31 | 2010-03-17 | 名誉捍卫者公司 | Identifying and changing personal information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |