CN110334133A - Rule mining method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN110334133A
Authority
CN
China
Prior art keywords
space
time
granularity
correlation rule
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910627281.6A
Other languages
Chinese (zh)
Other versions
CN110334133B (en)
Inventor
阮思捷
鲍捷
俞自生
郑宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong intelligent city big data research institute
Original Assignee
Jingdong City Beijing Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong City Beijing Digital Technology Co Ltd
Priority to CN201910627281.6A
Publication of CN110334133A
Application granted
Publication of CN110334133B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a method for obtaining spatio-temporal association rules, comprising: obtaining a plurality of features and an initial value of each feature at its initial spatio-temporal granularity; based on the initial values, obtaining the value of each feature at a plurality of higher-level spatio-temporal granularities, wherein a spatio-temporal granularity comprises a time granularity and a spatial granularity, and the time granularity of a higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity and/or the spatial granularity of the higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity; generating a plurality of spatio-temporal transactions based on the values of each feature at the plurality of higher-level spatio-temporal granularities; obtaining condition features and target features from among the plurality of features, and obtaining transaction information about the condition features and target features from the plurality of spatio-temporal transactions; and creating spatio-temporal association rules based on the transaction information of the condition features and target features. The present disclosure also provides a device, an electronic apparatus and a computer-readable storage medium for obtaining spatio-temporal association rules.

Description

Rule mining method and device, electronic equipment and computer readable storage medium
Technical field
The present disclosure relates to the field of computer technology, and more particularly to a method and device for obtaining spatio-temporal association rules, an electronic apparatus, and a computer-readable storage medium.
Background
With the rapid development of computing, storage and high-speed data acquisition technologies, large amounts of data associated with time and space have accumulated in fields such as environmental monitoring, transportation, logistics and industrial production. Discovering association rules among spatio-temporal data has important application value for planning and construction, information services and the like. For example, massive multi-source data from different domains, such as data from air quality monitoring stations, data from meteorological stations, traffic trajectory data and factory emission data, arise in the same space and time and usually contain many potential associations. Association rule mining is a method for discovering interesting relationships between variables in large amounts of data; its purpose is to analyze the potential associations between data items in a data set and to reveal the patterns contained therein that are valuable to users.
In the course of arriving at the concept of the present disclosure, the inventors found that the prior art has at least the following problems:
Existing association rule mining is based on explicit transactions; for example, all items appearing on the same shopping list constitute one transaction. Multi-source heterogeneous spatio-temporal data, however, has no such explicit relationship. Existing spatio-temporal data analysis techniques can only mine fuzzy relationships between extreme values, such as "larger" or "smaller", and cannot yield quantified relationships between more specific value intervals. How to obtain quantified associations between spatio-temporal data has therefore become a technical problem in urgent need of a solution.
Summary of the invention
In view of this, the present disclosure provides a method, implemented by an electronic apparatus, for obtaining spatio-temporal association rules, comprising: obtaining a plurality of features and an initial value of each feature at its initial spatio-temporal granularity; based on the initial values, obtaining the value of each feature at a plurality of higher-level spatio-temporal granularities, wherein a spatio-temporal granularity comprises a time granularity and a spatial granularity, and the time granularity of a higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity and/or the spatial granularity of the higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity; generating a plurality of spatio-temporal transactions based on the values of each feature at the plurality of higher-level spatio-temporal granularities; obtaining condition features and target features from among the plurality of features, and obtaining transaction information about the condition features and target features from the plurality of spatio-temporal transactions; and creating spatio-temporal association rules based on the transaction information of the condition features and target features.
According to an embodiment of the present disclosure, each spatio-temporal granularity comprises a plurality of spatio-temporal ranges; the initial value of a feature comprises the values of the feature in a plurality of spatio-temporal ranges at its initial spatio-temporal granularity; the values of the feature at the plurality of higher-level spatio-temporal granularities comprise the values of the feature in a plurality of spatio-temporal ranges at each higher-level spatio-temporal granularity; and each spatio-temporal transaction comprises the features in one spatio-temporal range and the values of those features in that spatio-temporal range.
According to an embodiment of the present disclosure, obtaining the transaction information about the condition features and target features from the plurality of spatio-temporal transactions comprises: determining the minimum common spatio-temporal granularity of the condition features and the target features; and obtaining, from the plurality of spatio-temporal transactions, the transaction information about the condition features and the target features at the minimum common spatio-temporal granularity.
According to an embodiment of the present disclosure, creating spatio-temporal association rules based on the transaction information of the condition features and target features comprises: where there is one target feature, discretizing the target feature based on a hierarchical clustering algorithm to obtain a plurality of discretization intervals of the target feature, each discretization interval serving as one target item; where there are a plurality of condition features, discretizing the plurality of condition features simultaneously using a multi-dimensional tree structure to obtain the discretization intervals of each condition feature, each discretization interval serving as one condition item; and constructing spatio-temporal association rules based on the condition items and the target items.
According to an embodiment of the present disclosure, constructing spatio-temporal association rules based on the condition items and the target items comprises: obtaining a plurality of condition item sets from combinations of the condition items; under a plurality of spatio-temporal patterns, constructing spatio-temporal association rules for each condition item set and the target items; determining the support and confidence of each spatio-temporal association rule under each spatio-temporal pattern; and selecting the spatio-temporal association rules whose support and confidence satisfy a preset condition as interesting spatio-temporal association rules.
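The support and confidence step above can be illustrated with a minimal Python sketch. It assumes the standard association-rule definitions (support is the fraction of all transactions that match both the condition items and the target item; confidence is that count divided by the number of transactions matching the condition items), since the disclosure does not spell out its formulas at this point. The function name `support_and_confidence`, the dictionary layout of a transaction, and the sample values are hypothetical.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Compute (support, confidence) of the rule antecedent -> consequent.

    transactions: list of dicts mapping feature name -> value (assumed layout).
    antecedent, consequent: dicts mapping feature name -> (low, high),
    interpreted as closed discretization intervals.
    """
    def matches(tx, items):
        return all(f in tx and lo <= tx[f] <= hi for f, (lo, hi) in items.items())

    n_ante = sum(matches(tx, antecedent) for tx in transactions)
    n_both = sum(
        matches(tx, antecedent) and matches(tx, consequent) for tx in transactions
    )
    support = n_both / len(transactions) if transactions else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical spatio-temporal transactions (one per spatio-temporal range).
transactions = [
    {"traffic": 2500, "wind": 12, "pm25": 30},
    {"traffic": 2800, "wind": 11, "pm25": 40},
    {"traffic": 2200, "wind": 14, "pm25": 90},
    {"traffic": 5000, "wind": 3, "pm25": 150},
]
antecedent = {"traffic": (2000, 3000), "wind": (10, 15)}
consequent = {"pm25": (0, 50)}

support, confidence = support_and_confidence(transactions, antecedent, consequent)
# Three transactions match the antecedent and two of them also match the
# consequent, so support = 2/4 and confidence = 2/3.
```

Rules would then be kept only if both numbers pass the preset thresholds.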
According to an embodiment of the present disclosure, constructing spatio-temporal association rules based on the condition items and the target items further comprises: determining the dominance relations among the interesting spatio-temporal association rules; and selecting, from the interesting spatio-temporal association rules, those that are not dominated by any other spatio-temporal association rule. Here, the interesting spatio-temporal association rules include a first association rule and a second association rule; if the spatial pattern and temporal pattern of the first association rule are larger than those of the second, the antecedent constraint of the first is looser than that of the second, the consequent constraint of the first is more compact than that of the second, and the support and confidence of the first are higher than those of the second, the first association rule is considered to dominate the second association rule.
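The dominance filter described above might be sketched as follows. This is only an illustration under several assumptions that the text does not fix: patterns are encoded as integer levels (larger means coarser), "looser" and "more compact" are read as interval containment over the same features, and all comparisons are non-strict except that a rule never dominates an identical rule. The `Rule` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    space_level: int    # assumed encoding: larger = coarser spatial pattern
    time_level: int     # assumed encoding: larger = coarser temporal pattern
    antecedent: tuple   # ((feature, low, high), ...)
    consequent: tuple   # (feature, low, high)
    support: float
    confidence: float


def contains(outer, inner):
    """True if interval outer = (low, high) contains interval inner."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def dominates(a, b):
    """True if rule a dominates rule b under the assumptions stated above."""
    if a == b:
        return False
    a_ante = {f: (lo, hi) for f, lo, hi in a.antecedent}
    b_ante = {f: (lo, hi) for f, lo, hi in b.antecedent}
    looser = a_ante.keys() == b_ante.keys() and all(
        contains(a_ante[f], b_ante[f]) for f in a_ante
    )
    tighter = a.consequent[0] == b.consequent[0] and contains(
        b.consequent[1:], a.consequent[1:]
    )
    return (
        a.space_level >= b.space_level
        and a.time_level >= b.time_level
        and looser
        and tighter
        and a.support >= b.support
        and a.confidence >= b.confidence
    )


def non_dominated(rules):
    """Keep only the rules that no other rule dominates."""
    return [r for r in rules if not any(dominates(o, r) for o in rules)]


r1 = Rule(1, 3, (("wind", 10, 15),), ("pm25", 0, 50), 0.30, 0.90)
r2 = Rule(0, 3, (("wind", 11, 14),), ("pm25", 0, 60), 0.20, 0.80)
r3 = Rule(1, 3, (("wind", 0, 5),), ("pm25", 100, 200), 0.10, 0.70)
kept = non_dominated([r1, r2, r3])  # r2 is dominated by r1 and is dropped
```

The pairwise scan is quadratic in the number of rules, which is acceptable for a sketch but would likely need a smarter strategy for large rule sets.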
According to an embodiment of the present disclosure, for each spatio-temporal pattern, determining the support and confidence of each spatio-temporal association rule comprises: forming the spatio-temporal association rules into a resilient distributed dataset; and broadcasting the plurality of spatio-temporal transactions to the slave (worker) nodes of a distributed system, and computing the support and confidence of each spatio-temporal association rule in a distributed, parallel manner.
In another aspect, the present disclosure provides a device for obtaining spatio-temporal association rules, comprising: an initial module, configured to obtain a plurality of features and an initial value of each feature at its initial spatio-temporal granularity; a layer module, configured to obtain, based on the initial values, the value of each feature at a plurality of higher-level spatio-temporal granularities, wherein the time granularity of a higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity and/or the spatial granularity of the higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity; a record module, configured to generate a plurality of spatio-temporal transactions based on the values of each feature at the plurality of higher-level spatio-temporal granularities; a feature module, configured to obtain condition features and target features from among the plurality of features, and to obtain transaction information about the condition features and target features from the plurality of spatio-temporal transactions; and a rule module, configured to create spatio-temporal association rules based on the transaction information of the condition features and target features.
In another aspect, the present disclosure provides an electronic apparatus, comprising: one or more processors; and a storage device for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method according to any one of the above.
The present disclosure further provides a computer-readable medium storing executable instructions which, when executed by a processor, cause the processor to perform the method according to any one of the above.
According to embodiments of the present disclosure, the problem in the prior art that multi-source heterogeneous spatio-temporal data has no explicit transaction relationship, so that existing spatio-temporal data analysis techniques can only give qualitative indications, can be at least partially solved. Raw spatio-temporal data sets with different time and spatial granularities can thus be processed into features for mining spatio-temporal association rules, achieving the technical effect of obtaining quantified associations between spatio-temporal data.
Brief description of the drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
Fig. 1 schematically illustrates an exemplary application scenario of the method for obtaining spatio-temporal association rules according to an embodiment of the present disclosure;
Fig. 2 schematically illustrates a flowchart of the method for obtaining spatio-temporal association rules according to an embodiment of the present disclosure;
Fig. 3 schematically illustrates aggregation from the initial time granularity to higher-level time granularities according to an embodiment of the present disclosure;
Fig. 4 schematically illustrates aggregation from the initial spatial granularity to higher-level spatial granularities according to an embodiment of the present disclosure;
Fig. 5 schematically illustrates discretization of the target feature based on a hierarchical clustering algorithm according to an embodiment of the present disclosure;
Fig. 6 schematically illustrates simultaneous discretization of a plurality of condition features using a multi-dimensional tree structure according to an embodiment of the present disclosure;
Fig. 7 schematically illustrates a flow of the method for obtaining spatio-temporal association rules according to another embodiment of the present disclosure;
Fig. 8 schematically illustrates a block diagram of the device for obtaining spatio-temporal association rules according to an embodiment of the present disclosure; and
Fig. 9 schematically illustrates a block diagram of a computer system suitable for implementing the method for obtaining spatio-temporal association rules according to an embodiment of the present disclosure.
Detailed description
Hereinafter, embodiments of the present disclosure are described with reference to the accompanying drawings. It should be understood, however, that these descriptions are merely exemplary and are not intended to limit the scope of the present disclosure. In the following detailed description, numerous specific details are set forth for ease of explanation in order to provide a thorough understanding of the embodiments of the present disclosure. It will be apparent, however, that one or more embodiments may also be practiced without these specific details. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. The terms "include", "comprise" and the like used herein indicate the presence of the stated features, steps, operations and/or components, but do not exclude the presence or addition of one or more other features, steps, operations or components.
All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein should be interpreted as having meanings consistent with the context of this specification, and should not be interpreted in an idealized or overly rigid manner.
Where an expression such as "at least one of A, B and C" is used, it should generally be interpreted in the sense commonly understood by those skilled in the art (for example, "a system having at least one of A, B and C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B and C). Where an expression such as "at least one of A, B or C" is used, it should likewise be interpreted in the sense commonly understood by those skilled in the art (for example, "a system having at least one of A, B or C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B and C).
Embodiments of the present disclosure provide a method, implemented by an electronic apparatus, for obtaining spatio-temporal association rules, comprising: obtaining a plurality of features and an initial value of each feature at its initial spatio-temporal granularity; based on the initial values, obtaining the value of each feature at a plurality of higher-level spatio-temporal granularities, wherein a spatio-temporal granularity comprises a time granularity and a spatial granularity, and the time granularity of a higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity and/or the spatial granularity of the higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity; generating a plurality of spatio-temporal transactions based on the values of each feature at the plurality of higher-level spatio-temporal granularities; obtaining condition features and target features from among the plurality of features, and obtaining transaction information about the condition features and target features from the plurality of spatio-temporal transactions; and creating spatio-temporal association rules based on the transaction information of the condition features and target features.
According to an embodiment of the present disclosure, features are extracted from spatio-temporal data and their values are computed at multiple spatio-temporal granularities; the data at the multiple granularities are joined spatio-temporally to generate transactions; based on the condition features and target features specified by the user, the corresponding transactions are queried and the feature data of interest to the user are extracted; and spatio-temporal association rules are created from the extracted feature data. This addresses the problem that multi-source heterogeneous spatio-temporal data has no transactions and therefore cannot be used directly for association rule mining, and it quantifies the associations among spatio-temporal data to form precise association rules.
Fig. 1 schematically illustrates an exemplary application scenario of the method for obtaining spatio-temporal association rules according to an embodiment of the present disclosure.
It should be noted that Fig. 1 shows only one example of a scenario in which the method of the embodiments of the present disclosure can be applied, in order to help those skilled in the art understand the technical content of the present disclosure; it does not mean that the embodiments of the present disclosure cannot be used in other devices, systems, environments or scenarios.
As shown in Fig. 1, the method for obtaining spatio-temporal association rules of the embodiments of the present disclosure can be used to mine quantified associations between the various factors of air pollution and air quality.
Air pollution is becoming increasingly severe and seriously affects people's travel and health. Its causes, however, are complex, and many factors contribute to it: for example, the exhaust emitted by vehicles 110, the flue gas discharged by factory chimneys 120, and the weather 130 all affect air quality. In the prior art, the quantified relationship between the numerical range of an influencing factor and air quality cannot be determined, for example how high the traffic flow must be for air quality to become poor, or within what range of wind speed the air quality is good.
The method of the embodiments of the present disclosure can be used to obtain quantified associations between the various factors of air pollution and air quality, yielding more precise conclusions such as "no wind, temperature in [0, 8] degrees Celsius, flue gas emission of a certain coal-fired power plant in [1000, 1200] milligrams per cubic metre → air quality is poor" or "on weekends in Haidian District, Beijing, traffic flow in [2, 3] thousand vehicles per hour, wind speed in [10, 15] metres per second → air quality is excellent". With quantified associations between influencing factors and air quality, decision-making bodies such as governments can clearly identify the main range of influence of each factor and take appropriate measures to improve air quality, such as controlling traffic flow or controlling factory flue gas emissions.
Fig. 2 schematically illustrates a flowchart of the method for obtaining spatio-temporal association rules according to an embodiment of the present disclosure.
As shown in Fig. 2, the method for obtaining spatio-temporal association rules of the embodiments of the present disclosure includes operations S210 to S250.
In operation S210, a plurality of features are obtained, together with the initial value of each feature at its initial spatio-temporal granularity.
For example, the features may be PM2.5 pollution values, traffic flow, wind speed, factory flue gas emissions, the number of factories, road segment information, vehicle trajectories and the like.
Features can be divided into three classes: spatio-temporally static data, spatially static and temporally dynamic data, and spatio-temporally dynamic data. Data such as points of interest (POI, for example factories) and road segment information do not change with time or space and are spatio-temporally static. The locations of air quality monitoring stations, meteorological stations and factories are fixed, but their readings change over time, so data such as PM2.5 pollution values, wind speed and factory flue gas emissions are spatially static and temporally dynamic. For data such as vehicle trajectories, both time and space change, so they are spatio-temporally dynamic. Space is generally represented as a point (such as a POI), a line segment (such as a road network) or a closed polygon (such as an administrative boundary), and time is usually represented as a timestamp or a time period.
According to an embodiment of the present disclosure, a spatio-temporal granularity comprises a time granularity and a spatial granularity. From fine to coarse, the spatial granularity may be divided into grid, administrative district and city. A grid may be, for example, a geographic area of 1 km x 1 km; each administrative district contains a specific set of grids (for example, Haidian District contains grid 1, grid 2, grid 3 and so on), and a city contains a specific set of administrative districts (for example, Beijing contains Haidian District, Chaoyang District, Xicheng District and so on). From fine to coarse, the time granularity may be divided into 5 minutes, 30 minutes, hour, day, week and month.
According to an embodiment of the present disclosure, multi-source data that occur together in the same spatio-temporal range are regarded as one transaction. The features therefore need to be preprocessed in order to establish the associations between the spatio-temporal features. The purpose is to process raw spatio-temporal data with different time and spatial granularities into data suitable for mining spatio-temporal association rules, to perform multi-level spatio-temporal aggregation according to different needs, and to accelerate mining by building a hybrid index.
First, the initial granularity and initial value of each feature are extracted.
For data that carry both spatial and temporal information and can be monitored by sensors, such as air quality data and meteorological data, the sampling period may be taken as the initial time granularity and the monitoring range as the initial spatial granularity. Different features may have different initial spatio-temporal granularities. For example, if air quality monitoring stations take a reading every hour and monitor the air quality of each administrative district, the initial time granularity of air quality data (such as PM2.5) is 1 hour, the initial spatial granularity is the administrative district, and the initial spatio-temporal granularity is (administrative district, 1 hour). As another example, if meteorological data are monitored every half hour and the monitoring range is each administrative district, the initial spatio-temporal granularity of meteorological data is (administrative district, 30 minutes). For statistical data such as traffic flow, the spatio-temporal granularity at which the statistics are collected may be taken as the initial spatio-temporal granularity; for example, if traffic flow is counted every 5 minutes for each grid, the initial spatio-temporal granularity of traffic flow is (grid, 5 minutes). If the granularity at which a feature is monitored or counted is finer than the minimum time granularity or minimum spatial granularity above, its initial spatio-temporal granularity may be treated as the minimum time granularity or minimum spatial granularity. For example, if traffic flow is counted per road segment, the average traffic flow of the road segments within each grid is calculated, and the grid is taken as the initial spatial granularity of traffic flow.
Each spatio-temporal granularity comprises a plurality of spatio-temporal ranges. For example, the spatio-temporal granularity (administrative district, hour) includes spatio-temporal ranges such as (Haidian District, 8:00~9:00), (Chaoyang District, 12:00~13:00) and (Xicheng District, 15:00~16:00); the spatio-temporal granularity (grid, day) includes spatio-temporal ranges such as (grid 1, January 1, 2019) and (grid 2, February 5, 2019); and the spatio-temporal granularity (administrative district, month) includes spatio-temporal ranges such as (Haidian District, January 2019) and (Dongcheng District, February 2019).
For example, obtaining the initial value of a feature at its initial spatio-temporal granularity includes obtaining the values of the feature in a plurality of spatio-temporal ranges at that granularity. If the initial spatio-temporal granularity of PM2.5 is (administrative district, 1 hour), the values of PM2.5 in each spatio-temporal range at the granularity (administrative district, 1 hour) are collected as the initial values of PM2.5 at its initial spatio-temporal granularity. An example of the result is shown in Table 1.
Table 1
Spatial range        Time range    Initial value
Haidian District     0:00~1:00     64
Haidian District     1:00~2:00     70
Chaoyang District    0:00~1:00     80
Chaoyang District    1:00~2:00     60
Xicheng District     0:00~1:00     75
For non-temporal data, that is, data without temporal information such as the number of thermal power plants or road segment information, the data remain fixed over long periods, so the time granularity has little influence and may be treated as the minimum time granularity (for example 5 minutes). The spatial granularity of such data may be the grid: the number of thermal power plants in each grid can be counted as the initial value of the thermal power plant count, and the numbers of first-class roads, second-class roads and intersections in each grid can be counted as the initial values of the road segment information.
For non-geographic data, that is, data without spatial information, the spatial granularity may be treated as the minimum spatial granularity (for example a 1 km x 1 km grid).
According to an embodiment of the present disclosure, after the initial spatio-temporal granularity and initial value of each feature have been extracted, each feature is aggregated at each higher-level spatio-temporal granularity. Because the sources of spatio-temporal data differ, some data sources have a coarse spatial or time granularity; for example, the spatial granularity of meteorological data is the administrative district, and the time granularity of air quality data is half an hour.
Because different data sources have different spatio-temporal granularities, only transactions generated by joining at or above the minimum common spatio-temporal granularity are meaningful. The features to be mined are specified by the user, but processing the features of the different data sets only after the user's input would incur a large amount of computation. Operation S220 can therefore be used to precompute the aggregation results of each feature at each higher-level spatio-temporal granularity, that is, the values of each feature at each spatio-temporal granularity.
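One way to read "minimum common spatio-temporal granularity" is sketched below in Python: because a feature can only be aggregated upward, the finest level shared by all features in each dimension is the coarsest of their initial levels. The level names follow the examples in this description, while the list constants and the function name `min_common_granularity` are hypothetical.

```python
# Level orderings from fine to coarse, taken from the examples in the text.
SPACE_LEVELS = ["grid", "district", "city"]
TIME_LEVELS = ["5min", "30min", "hour", "day", "week", "month"]


def min_common_granularity(initial_granularities):
    """Return the finest (space, time) granularity shared by all features.

    initial_granularities: list of (space_level, time_level) pairs, one per
    feature. The shared level in each dimension is the coarsest initial level.
    """
    space = max((s for s, _ in initial_granularities), key=SPACE_LEVELS.index)
    time = max((t for _, t in initial_granularities), key=TIME_LEVELS.index)
    return space, time


# PM2.5 starts at (district, hour); traffic flow starts at (grid, 5min).
common = min_common_granularity([("district", "hour"), ("grid", "5min")])
print(common)  # ('district', 'hour')
```

Transactions for this pair of features would then be joined at (district, hour) or any coarser granularity.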
In operation S220, the value of each feature at a plurality of higher-level spatio-temporal granularities is obtained based on the initial values.
Here, the time granularity of a higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity and/or the spatial granularity of the higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity. That is, at least one of the time granularity and the spatial granularity of a higher-level spatio-temporal granularity is coarser than that of the initial spatio-temporal granularity.
For example, if the initial spatio-temporal granularity is (grid, day), the higher-level spatio-temporal granularities may be, for example, (grid, week), (grid, month), (administrative district, day), (administrative district, week) and (administrative district, month).
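The enumeration of higher-level granularities can be sketched in a few lines of Python. The level lists reuse the hierarchy given earlier in this description; the function name `higher_granularities` is hypothetical, and the sketch also yields the city-level combinations, which the example above does not list.

```python
# Level orderings from fine to coarse, taken from the examples in the text.
SPACE_LEVELS = ["grid", "district", "city"]
TIME_LEVELS = ["5min", "30min", "hour", "day", "week", "month"]


def higher_granularities(initial):
    """List every granularity that is coarser than `initial` in at least one
    dimension and not finer in the other."""
    space, time = initial
    si, ti = SPACE_LEVELS.index(space), TIME_LEVELS.index(time)
    return [
        (s, t)
        for s in SPACE_LEVELS[si:]
        for t in TIME_LEVELS[ti:]
        if (s, t) != initial
    ]


levels = higher_granularities(("grid", "day"))
# The first five entries are (grid, week), (grid, month), (district, day),
# (district, week) and (district, month); three city-level entries follow.
```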
Obtaining value of the feature in multiple high-level space-time granularities may include: to obtain feature in each high-level space-time The value in multiple space-time uniques under granularity.It is taken for example, can be polymerize from initial time granularity on high-level time granularity Value, and it polymerize value in high-level spatial granularity from initial space granularity.
Fig. 3 schematically illustrates aggregation from the initial time granularity to a high-level time granularity according to an embodiment of the present disclosure.
As shown in Fig. 3, taking PM2.5 as an example, suppose the preceding steps have produced the value of PM2.5 in Haidian District at the space-time granularity (administrative district, hour); that is, the initial values of PM2.5 are known in space-time ranges such as (Haidian District, 0:00 to 1:00 on 1 January 2019), (Haidian District, 1:00 to 2:00 on 1 January 2019), ..., (Haidian District, 23:00 to 24:00 on 1 January 2019). Averaging the 24 hourly values of PM2.5 in Haidian District for one day gives the value of PM2.5 in Haidian District on 1 January 2019; repeating this gives the value for each day of January 2019. Averaging the 31 daily values then gives the value of PM2.5 in Haidian District for January 2019. Widening the time range level by level in this way yields the result of aggregating each feature at the high-level time granularities.
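The level-by-level time aggregation described for Fig. 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `aggregate_time`, the dictionary layout, and the sample readings are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_time(values, coarser_key):
    """Average values whose time keys map to the same coarser time range."""
    buckets = defaultdict(list)
    for (district, t), value in values.items():
        buckets[(district, coarser_key(t))].append(value)
    return {key: mean(vals) for key, vals in buckets.items()}


# Made-up hourly PM2.5 readings for Haidian District on two days.
hourly = {("Haidian", datetime(2019, 1, 1, h)): 40.0 + h for h in range(24)}
hourly.update({("Haidian", datetime(2019, 1, 2, h)): 60.0 for h in range(24)})

# hour -> day, then day -> month, widening the time range step by step.
daily = aggregate_time(hourly, lambda ts: ts.date())
monthly = aggregate_time(daily, lambda day: (day.year, day.month))
```

As in the text, the monthly value here is the mean of the daily means rather than the mean of all hourly readings; the two coincide only when every day contributes the same number of readings.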
Fig. 4 schematically illustrates aggregation from the initial spatial granularity to a high-level spatial granularity according to an embodiment of the present disclosure.
As shown in Fig. 4, suppose the preceding steps have produced the initial value of a feature (for example PM2.5) in each grid during the period 0:00 to 1:00 on 1 January 2019. Averaging the values of the feature over the grids of Haidian District (grid 1, grid 2, ..., grid 10) gives the value of the feature in Haidian District for that period; the values for Chaoyang District and Xicheng District in that period are obtained likewise. Widening the spatial range level by level in this way yields the result of aggregating each feature at the high-level spatial granularities.
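The spatial counterpart can be sketched the same way. The grid-to-district mapping, the function name `aggregate_space`, and the values below are hypothetical and serve only to illustrate the averaging step.

```python
from statistics import mean

# Hypothetical mapping from grid cells to the district that contains them.
GRID_TO_DISTRICT = {"grid1": "Haidian", "grid2": "Haidian", "grid3": "Chaoyang"}


def aggregate_space(grid_values, grid_to_district):
    """Average a feature over the grid cells of each district for one time range."""
    per_district = {}
    for grid, value in grid_values.items():
        per_district.setdefault(grid_to_district[grid], []).append(value)
    return {district: mean(vals) for district, vals in per_district.items()}


# Made-up values of one feature in each grid for a single hourly time range.
grid_values = {"grid1": 30.0, "grid2": 50.0, "grid3": 80.0}
district_values = aggregate_space(grid_values, GRID_TO_DISTRICT)
```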
With the above aggregation, the value of each feature at each high-level space-time granularity can be obtained. Operation S230 then uses these values to generate space-time transactions, joining the cross-domain features of the same space-time granularity in space and time.
In operation S230, multiple space-time transactions are generated based on the value of each feature at the multiple high-level space-time granularities.
Generating a space-time transaction includes collecting the features in a space-time range, together with their values in that range, into one space-time transaction. Each space-time transaction corresponds to one space-time range: the features whose spatial range and time range are identical form one space-time transaction, and a separate data table is built for the transactions of each space-time granularity.
For example, if the features involved in a space-time range at the granularity (administrative district, 1 hour) include PM2.5, wind speed, traffic flow, flue gas emissions, and the number of factories, then these features and their values in that space-time range form one space-time transaction.
A space-time transaction d is a triple $\langle A, s_{range}, t_{range} \rangle$. A is a set of pairs $\langle a, q \rangle$, where a is a feature and q is the value of that feature within the space-time range of the transaction. $s_{range}$ is the spatial range of the transaction and $t_{range}$ is its time range. Hereinafter, the set of all space-time transactions d is called the global space-time transaction set D.
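A possible in-memory representation of the triple and of the space-time join is sketched below. The class name `SpaceTimeTransaction`, the helper `build_transactions`, and the sample tables are assumptions for illustration; the disclosure does not prescribe a data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class SpaceTimeTransaction:
    s_range: str                      # spatial range, e.g. a district name
    t_range: Tuple[str, str]          # time range as (start, end)
    A: Dict[str, float] = field(default_factory=dict)  # feature a -> value q


def build_transactions(feature_tables):
    """Join features that share the same (s_range, t_range) into one transaction."""
    transactions = {}
    for feature, table in feature_tables.items():
        for (s_range, t_range), q in table.items():
            d = transactions.setdefault(
                (s_range, t_range), SpaceTimeTransaction(s_range, t_range))
            d.A[feature] = q
    return list(transactions.values())


# Made-up per-feature tables at the granularity (administrative district, 1 hour).
tables = {
    "PM2.5": {("Haidian", ("0:00", "1:00")): 85.0, ("Haidian", ("1:00", "2:00")): 90.0},
    "wind speed": {("Haidian", ("0:00", "1:00")): 1.2, ("Haidian", ("1:00", "2:00")): 0.8},
}
D = build_transactions(tables)
```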
After the space-time transactions have been compiled, the user can enter a target feature of interest, $I_y$ (also called the consequent feature), and one or more condition features that may be related to the target feature (also called antecedent features), in order to mine the association between the condition features and the target feature.
In operation S240, the condition features and the target feature are obtained from among the multiple features, and transaction information about the condition features and the target feature is obtained from the multiple space-time transactions.
For example, if the target feature entered by the user is PM2.5 and the condition features are wind speed and traffic flow, the space-time transactions concerning the condition features and the target feature are taken from the global space-time transaction set, and the association between PM2.5 and wind speed and traffic flow is mined from the transactions taken out.
According to an embodiment of the present disclosure, obtaining the transaction information about the condition features and the target feature from the multiple space-time transactions in operation S240 includes operations S241 and S242.
In operation S241, the minimum common space-time granularity of the condition features and the target feature is determined.
The minimum common time granularity of features A, B, and C may be the coarsest of their initial time granularities, and their minimum common spatial granularity may be the coarsest of their initial spatial granularities. For example, if the initial space-time granularity of feature A is (administrative district, hour), that of feature B is (grid, minute), and that of feature C is (grid, day), the minimum common space-time granularity of A, B, and C is (administrative district, day).
As another example, if the initial space-time granularity of PM2.5 is (administrative district, 1 hour), that of wind speed is (administrative district, 30 minutes), and that of traffic flow is (grid, 5 minutes), the minimum common space-time granularity of PM2.5, wind speed, and traffic flow is (administrative district, 1 hour).
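The rule "take the coarsest spatial and the coarsest time granularity" can be sketched as below. The level lists and their ordering from fine to coarse are assumptions for the example, as is the function name `min_common_granularity`.

```python
# Assumed orderings from fine to coarse; the level names are illustrative only.
SPATIAL_LEVELS = ["grid", "district"]
TEMPORAL_LEVELS = ["minute", "5 minutes", "30 minutes", "hour", "day"]


def min_common_granularity(initial):
    """Return the coarsest spatial and coarsest time granularity among the features."""
    spatial = max((s for s, _ in initial.values()), key=SPATIAL_LEVELS.index)
    temporal = max((t for _, t in initial.values()), key=TEMPORAL_LEVELS.index)
    return spatial, temporal


abc = min_common_granularity({
    "A": ("district", "hour"), "B": ("grid", "minute"), "C": ("grid", "day"),
})
pollution = min_common_granularity({
    "PM2.5": ("district", "hour"),
    "wind speed": ("district", "30 minutes"),
    "traffic flow": ("grid", "5 minutes"),
})
```

Both calls reproduce the two examples given in the text.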
In operation S242, the transaction information about the condition features and the target feature at the minimum common space-time granularity is obtained from the multiple space-time transactions.
For example, the minimum common space-time granularity (administrative district, 1 hour) includes space-time ranges such as (Haidian District, 0:00 to 1:00), (Haidian District, 1:00 to 2:00), ..., (Chaoyang District, 0:00 to 1:00), and (Chaoyang District, 1:00 to 2:00), and each space-time range corresponds to one space-time transaction. From these transactions in the global space-time transaction set D, those containing the three features PM2.5, wind speed, and traffic flow are taken out as the space-time transaction subset $D_1$.
In operation S250, space-time association rules are created based on the transaction information of the condition features and the target feature; that is, space-time association rules are mined from the space-time transaction subset $D_1$.
In traditional Boolean association rule mining, an item (for example a commodity) either occurs or does not occur. Boolean association rules are mined as follows: first construct the sets containing one item, traverse all transactions, and compute their support. The 1-itemsets that meet the support threshold are combined to generate candidate 2-itemsets. All transactions are traversed again to obtain the frequent 2-itemsets, and so on until no further frequent itemset is produced. Association rules are then derived from all the frequent itemsets.
Because the association rules mined here are quantitative, the direct way to apply Boolean association rule mining would be to discretize each feature in advance (for example by equal-width or equal-frequency binning), treat each discretized interval as an item, and then apply the traditional mining method. Such pre-partitioning, however, is unguided and may well fail to yield useful information. Since the way one feature should be discretized is affected by the other features, the embodiment of the present disclosure takes the relationships among features into account and discretizes all features simultaneously. At the same time, intervals mined for the same feature should not overlap, and the generated association rules should not be overly fine. To meet these needs, the embodiment of the present disclosure proposes tree-structured feature discretization: for the target feature, hierarchical clustering is used to build a hierarchy of intervals; for the condition features, the distribution of all features is considered jointly and an n-dimensional R-tree is used to build hierarchical intervals.
According to an embodiment of the present disclosure, operation S250 includes operations S251 to S253.
In operation S251, when there is one target feature, the target feature is discretized based on a hierarchical clustering algorithm to obtain multiple discretized intervals of the target feature, and each discretized interval is taken as a target item (also called a consequent item).
Since the target feature usually comprises only one feature, hierarchical clustering can be applied to the values of the space-time objects on the target feature in order to discretize the consequent, and only intervals that meet the support threshold are retained.
Fig. 5 schematically illustrates discretizing the target feature based on a hierarchical clustering algorithm according to an embodiment of the present disclosure.
As shown in Fig. 5, the target feature is, for example, PM2.5. All values of PM2.5 in the space-time transaction subset $D_1$ are listed on the Y axis in ascending order. According to the distribution of these values, values that are close to each other are merged into intervals first, and hierarchical consequent intervals are then formed upward level by level, for example [30, 50], [50, 80], ..., [182, 261], [30, 105], [130, 261], and [30, 261].
Each discretized interval of the target feature forms a target item $I_y$. Each item $I_y$ is a triple $\langle a, l, u \rangle$, where a is a target feature and l and u are the minimum and maximum values of the feature, i.e. the two endpoints of the interval. For uniformity, if the value of a feature is categorical (for example, weather taking the values sunny or rainy), l = u is used; l and u may, for example, take numbers such as 0, 1, and 2 to represent different weather types.
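One way to obtain such a hierarchy of intervals is sketched below, assuming single-linkage agglomerative clustering on the sorted values (the source does not say which linkage is used). The function name `hierarchical_intervals`, the `min_count` parameter standing in for the support threshold, and the sample values are hypothetical.

```python
def hierarchical_intervals(values, min_count):
    """Single-linkage agglomerative clustering of values on a line.

    Returns every interval (l, u) created while merging that covers at
    least min_count of the input values, in the order it was created.
    """
    # Each cluster is (low, high, count); start from the sorted raw values.
    clusters = [(v, v, 1) for v in sorted(values)]
    kept = []
    while len(clusters) > 1:
        # Pick the adjacent pair of clusters separated by the smallest gap.
        i = min(range(len(clusters) - 1),
                key=lambda k: clusters[k + 1][0] - clusters[k][1])
        low, _, count_left = clusters[i]
        _, high, count_right = clusters[i + 1]
        merged = (low, high, count_left + count_right)
        clusters[i:i + 2] = [merged]
        if merged[2] >= min_count:
            kept.append((low, high))
    return kept


# Made-up PM2.5 values taken from a transaction subset.
intervals = hierarchical_intervals([30, 35, 50, 130, 140, 261], min_count=2)
target_items = [("PM2.5", l, u) for l, u in intervals]
```

Unlike the intervals in Fig. 5, neighbouring intervals produced by this sketch do not share endpoints, because each interval spans only observed values.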
In operation S252, when there are multiple condition features, the condition features are discretized simultaneously using a multidimensional tree structure to obtain the discretized intervals of each condition feature, and each discretized interval is taken as a condition item (also called an antecedent item).
Fig. 6 schematically illustrates discretizing multiple condition features simultaneously using a multidimensional tree structure according to an embodiment of the present disclosure.
As shown in Fig. 6, in the n-dimensional space formed by the condition features, each space-time transaction is a point. An n-dimensional R-tree is built from these points.
The condition features are, for example, wind speed and traffic flow. Since there are two condition features, a 2-dimensional R-tree can be created. All values of wind speed in the space-time transaction subset $D_1$ are listed in ascending order on the X1 axis, and all values of traffic flow in $D_1$ are listed in ascending order on the X2 axis. Each transaction corresponds to a point in this 2-dimensional plane, and each bounding rectangle in the figure forms a 2-dimensional interval. The number of points contained in a leaf node of the n-dimensional R-tree must meet the minimum support threshold. Leaf nodes and internal nodes each represent an n-dimensional hyperrectangle; with three condition features, for example, each leaf node or internal node is a 3-dimensional cuboid. The R-tree organizes the points hierarchically into bounding rectangles, giving hierarchical discrete intervals.
The ranges on the X1 axis delimited by the bounding rectangles are taken as the discrete intervals of wind speed, and the ranges on the X2 axis delimited by the bounding rectangles are taken as the discrete intervals of traffic flow. The discretized intervals of each condition feature are obtained in this way, and each discretized interval is taken as a condition item. For example, the discretized intervals of wind speed (in m/s, for example) include [0.3, 1.0], [1.0, 2.8], [2.8, 3.1], ..., and the discretized intervals of traffic flow (in thousand vehicles per hour, for example) include [0.8, 1.9], [1.9, 2.6], [2.6, 4.2], ....
Each discretized interval of a condition feature forms an item. Each item is a triple $\langle a, l, u \rangle$, where a is a condition feature and l and u are the minimum and maximum values of the feature, i.e. the two endpoints of the interval. For uniformity, if the value of a feature is categorical (for example, weather taking the values sunny or rainy), l = u is used; l and u may, for example, take numbers such as 0, 1, and 2 to represent different weather types.
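The idea of reading per-feature intervals off hierarchical bounding rectangles can be illustrated with a simplified stand-in. The sketch below is not an R-tree: it uses a recursive median split along the widest dimension, which merely mimics the nested bounding boxes an R-tree would provide. The names `bounding_box` and `hierarchical_boxes`, the `min_points` parameter (standing in for the minimum support threshold, assumed to be at least 1), and the sample points are all assumptions.

```python
def bounding_box(points):
    """Smallest axis-aligned box containing the points, as one (low, high) per dimension."""
    dims = range(len(points[0]))
    return tuple((min(p[d] for p in points), max(p[d] for p in points)) for d in dims)


def hierarchical_boxes(points, min_points):
    """Return the bounding boxes of all nodes, root first.

    A node is split only if both halves keep at least min_points points,
    so every leaf satisfies the minimum count.
    """
    box = bounding_box(points)
    boxes = [box]
    if len(points) >= 2 * min_points:
        # Split along the dimension with the largest extent, at the median.
        d = max(range(len(box)), key=lambda k: box[k][1] - box[k][0])
        ordered = sorted(points, key=lambda p: p[d])
        mid = len(ordered) // 2
        boxes += hierarchical_boxes(ordered[:mid], min_points)
        boxes += hierarchical_boxes(ordered[mid:], min_points)
    return boxes


# Made-up transactions as points (wind speed in m/s, traffic flow in thousand vehicles/hour).
points = [(0.3, 0.8), (1.0, 1.9), (2.8, 2.6), (3.1, 4.2)]
boxes = hierarchical_boxes(points, min_points=2)

# Project each box onto each axis to obtain condition items <a, l, u>.
wind_items = [("wind speed", b[0][0], b[0][1]) for b in boxes]
traffic_items = [("traffic flow", b[1][0], b[1][1]) for b in boxes]
```

A production system would more likely rely on an existing R-tree implementation; the point of the sketch is only that each node's box yields one interval per condition feature.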
In operation S253, space-time association rules are constructed based on the condition items and target items obtained above.
A space-time itemset $C_{ST}$ can be formed from the above condition items and target items. $C_{ST}$ is a triple $\langle I, s_{pattern}, t_{pattern} \rangle$, where $I$ is a set of condition items and target items (each feature occurring at most once), $s_{pattern}$ is the spatial pattern of the itemset, and $t_{pattern}$ is its time pattern. $s_{pattern}$ may be, for example, a spatial pattern such as Haidian District or Beijing, and $t_{pattern}$ may be a time pattern such as 8:00 every day, every morning, or daytime every day.
A space-time transaction d supports the itemset $C_{ST}$ if and only if three conditions hold: the spatial range of d falls within the spatial pattern of $C_{ST}$, i.e. $d.s_{range} \subseteq C_{ST}.s_{pattern}$; the time range $d.t_{range}$ matches the time pattern $C_{ST}.t_{pattern}$; and $\forall \langle a, l, u \rangle \in C_{ST}.I,\ \exists \langle a, q \rangle \in d.A$ such that $l \le q \le u$. Here $d.s_{range}$ denotes the spatial range of d and $C_{ST}.s_{pattern}$ the spatial pattern of $C_{ST}$; for example, if the spatial range of d is a grid and the spatial pattern of $C_{ST}$ is the administrative district containing it, the spatial range of d is considered to fall within the spatial pattern of $C_{ST}$. $d.t_{range}$ denotes the time range of d and $C_{ST}.t_{pattern}$ the time pattern of $C_{ST}$; for example, if the time range of d is 6:00 to 7:00 and the time pattern of $C_{ST}$ is 6:30 every day, the time range of d is considered to cover the time pattern of $C_{ST}$. The third condition states that every feature involved in the itemset appears in d and that the value q of that feature in d lies within the interval [l, u] of the corresponding item in the itemset.
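The support test can be sketched as a predicate. Because the source leaves the exact matching semantics of patterns open, this sketch assumes a spatial pattern is given as a set of the spatial ranges it covers and a time pattern as a caller-supplied function on time ranges; the dictionary keys and the name `supports` are likewise hypothetical.

```python
def supports(d, itemset):
    """Return True if transaction d supports the space-time itemset.

    d:       {"A": {feature: value}, "s_range": ..., "t_range": ...}
    itemset: {"items": [(a, l, u), ...], "s_pattern": set of covered ranges,
              "t_pattern": function(t_range) -> bool}
    """
    if d["s_range"] not in itemset["s_pattern"]:
        return False
    if not itemset["t_pattern"](d["t_range"]):
        return False
    # Every item's feature must be present in d with a value inside [l, u].
    return all(a in d["A"] and l <= d["A"][a] <= u for a, l, u in itemset["items"])


itemset = {
    "items": [("wind speed", 1.0, 2.8), ("PM2.5", 30, 50)],
    "s_pattern": {"Haidian"},
    # Assumed reading of "6:30 every day": the hour range must contain 6.5.
    "t_pattern": lambda t_range: t_range[0] <= 6.5 <= t_range[1],
}
d_ok = {"A": {"wind speed": 2.0, "PM2.5": 45.0}, "s_range": "Haidian", "t_range": (6, 7)}
d_bad = {"A": {"wind speed": 3.0, "PM2.5": 45.0}, "s_range": "Haidian", "t_range": (6, 7)}
```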
According to an embodiment of the present disclosure, constructing space-time association rules based on the condition items and target items in operation S253 includes:
(1) Obtaining multiple condition item sets from combinations of the condition items.
If the antecedent involves n features, $2^n - 1$ combinations of antecedent features can be produced. For example, if the condition items are $I_A$, $I_B$, and $I_C$, the sets containing one item are constructed first: $I_A$, $I_B$, and $I_C$; then the sets containing two items: $I_A I_B$, $I_A I_C$, and $I_B I_C$; and finally the set containing three items: $I_A I_B I_C$.
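The enumeration of the $2^n - 1$ non-empty combinations can be sketched with the standard library. The function name `condition_item_sets` is an assumption; the item labels follow the example in the text.

```python
from itertools import combinations


def condition_item_sets(items):
    """All non-empty subsets of the condition items, smallest subsets first."""
    return [combo
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]


item_sets = condition_item_sets(["I_A", "I_B", "I_C"])
```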
(2) Under multiple space-time patterns, constructing space-time association rules about each condition item set and the target item.
A space-time association rule r is a triple $\langle X \rightarrow I_y, s_{pattern}, t_{pattern} \rangle$, where X is a condition item set and $I_y$ is a target item; the feature in the consequent does not appear in the antecedent. $s_{pattern}$ is the spatial pattern of the space-time association rule and $t_{pattern}$ is its time pattern.
For example, if $s_{pattern}$ is Haidian District and $t_{pattern}$ is 8:00 every day, space-time association rules about each condition item set and the target item are constructed under this space-time pattern, such as $\langle I_A \rightarrow I_y$, Haidian District, 8:00 every day$\rangle$ and $\langle I_A I_B \rightarrow I_y$, Haidian District, 8:00 every day$\rangle$.
In this way, the space-time association rules about each condition item set and the target item can be obtained under each space-time pattern.
(3) Determining the support and confidence of each space-time association rule under each space-time pattern.
The support of a space-time association rule r is defined as the proportion of space-time transactions in the subset $D_1$ that support the space-time itemset $\langle X \cup \{I_y\}, s_{pattern}, t_{pattern} \rangle$. As before, a space-time transaction d is considered to support an itemset $C_{ST}$ if and only if its spatial range and time range match the spatial pattern and time pattern of $C_{ST}$ and, for every item $\langle a, l, u \rangle$ of $C_{ST}$, d contains a pair $\langle a, q \rangle$ with $l \le q \le u$.
The confidence of a space-time association rule r is defined as the proportion, among the transactions that support the space-time itemset $\langle X, s_{pattern}, t_{pattern} \rangle$, of those that also support the space-time itemset $\langle X \cup \{I_y\}, s_{pattern}, t_{pattern} \rangle$.
In this way, the support and confidence of each space-time association rule under each space-time pattern can be computed from the space-time transactions d in the subset $D_1$.
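The two definitions can be sketched as follows. To keep the example short, matching against $s_{pattern}$ and $t_{pattern}$ is omitted: the transaction list is assumed to be the whole subset $D_1$ and every transaction in it is assumed to match the rule's space-time pattern. The names `satisfies` and `support_and_confidence` and the sample data are hypothetical.

```python
def satisfies(transaction, items):
    """True if every item's feature is present with a value inside [l, u]."""
    return all(a in transaction and l <= transaction[a] <= u for a, l, u in items)


def support_and_confidence(transactions, antecedent, consequent):
    """Support = share of transactions matching X and I_y; confidence = share among X."""
    n_x = sum(satisfies(t, antecedent) for t in transactions)
    n_xy = sum(satisfies(t, antecedent + [consequent]) for t in transactions)
    support = n_xy / len(transactions) if transactions else 0.0
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence


# Made-up transactions (feature -> value), all assumed to match the pattern.
D1 = [
    {"wind speed": 0.5, "PM2.5": 150},
    {"wind speed": 0.8, "PM2.5": 140},
    {"wind speed": 0.9, "PM2.5": 40},
    {"wind speed": 3.0, "PM2.5": 35},
]
support, confidence = support_and_confidence(
    D1, antecedent=[("wind speed", 0.3, 1.0)], consequent=("PM2.5", 130, 261))
```

Here three transactions satisfy the antecedent and two of them also satisfy the consequent, so the support is 2/4 and the confidence is 2/3.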
(4) Selecting the space-time association rules whose support and confidence meet preset conditions as the interesting space-time association rules.
If the support of a space-time association rule r meets the minimum support threshold minsup preset by the user and its confidence meets the user-defined minimum confidence threshold minconf, the rule r is considered valid, or in other words interesting.
A mined space-time association rule must meet the minimum support threshold. If the space-time itemsets under a larger $s_{pattern}$ or a larger $t_{pattern}$ cannot meet the minimum support threshold, then the support under every smaller $s_{pattern}$ or $t_{pattern}$ derived from it cannot meet the threshold either, and those patterns need not be checked. This avoids unnecessary computation, so when the $s_{pattern}$ and $t_{pattern}$ of space-time association rules are enumerated, they are enumerated from coarse to fine.
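The coarse-to-fine enumeration with pruning can be sketched on a small pattern hierarchy. The hierarchy `FINER`, the support table, and the function names are invented for the example; only the pruning idea comes from the text.

```python
def frequent_patterns(pattern, finer, support_of, minsup):
    """Return patterns meeting minsup, visiting coarse patterns before fine ones."""
    if support_of(pattern) < minsup:
        return []  # prune: finer patterns under this one are never evaluated
    found = [pattern]
    for child in finer.get(pattern, []):
        found += frequent_patterns(child, finer, support_of, minsup)
    return found


# Hypothetical spatial hierarchy and made-up supports of one itemset per pattern.
FINER = {"Beijing": ["Haidian", "Chaoyang"],
         "Haidian": ["grid1", "grid2"],
         "Chaoyang": ["grid3"]}
SUPPORT = {"Beijing": 0.6, "Haidian": 0.4, "Chaoyang": 0.1,
           "grid1": 0.3, "grid2": 0.05, "grid3": 0.08}
evaluated = []


def support_of(pattern):
    evaluated.append(pattern)  # record which patterns were actually checked
    return SUPPORT[pattern]


result = frequent_patterns("Beijing", FINER, support_of, minsup=0.2)
```

Because Chaoyang already fails the threshold, grid3 is never evaluated.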
According to an embodiment of the present disclosure, determining the support and confidence of each space-time association rule for each space-time pattern includes: forming the space-time association rules into a resilient distributed dataset; and broadcasting the space-time transactions in the subset $D_1$ to the worker nodes of a distributed system, where the support and confidence of each space-time association rule are computed in parallel.
For each combination of $s_{pattern}$ and $t_{pattern}$, if the antecedent involves n features, $2^n - 1$ combinations of antecedent features can be produced, so the amount of computation is large. Distributed computing can therefore be used to verify the space-time association rules in parallel. Using the distributed computing platform Spark, all space-time association rules under each space-time pattern are formed into a resilient distributed dataset (RDD), the space-time transactions under the $s_{pattern}$ and $t_{pattern}$ combination are broadcast to the worker nodes of the distributed system, and the support and confidence of each space-time association rule are computed in parallel. Finally, all interesting space-time association rules that are not dominated can be selected according to the dominance relation. In this way, the discretized space-time association rules are built into an RDD, the space-time transactions are stored on the worker nodes as a broadcast variable, and the support and confidence of each space-time association rule are computed in a distributed manner, which speeds up the computation.
According to an embodiment of the present disclosure, operation S253 may further include:
(5) Determining the dominance relations among the interesting space-time association rules.
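Dominance relations exist among space-time association rules. For a first space-time association rule $r_1$ and a second space-time association rule $r_2$, suppose the following conditions are met:
a) $r_2.s_{pattern} \subseteq r_1.s_{pattern}$;
b) $r_2.t_{pattern} \subseteq r_1.t_{pattern}$;
c) $\forall \langle a_1, l_1, u_1 \rangle \in r_1.X,\ \exists \langle a_2, l_2, u_2 \rangle \in r_2.X$ such that $a_1 = a_2$, $l_1 \le l_2$ and $u_2 \le u_1$;
d) for the consequent $\langle a_1, l_1, u_1 \rangle$ of $r_1$ and the consequent $\langle a_2, l_2, u_2 \rangle$ of $r_2$: $a_1 = a_2$, $l_2 \le l_1$ and $u_1 \le u_2$;
e) $support(r_1) \ge support(r_2)$;
f) $confidence(r_1) \ge confidence(r_2)$.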
That is r1Room and time mode ratio r2It is bigger, r1Former piece about beam ratio r2It is looser, r1Consequent about beam ratio r2More It is compact, and r1Support and confidence level ratio r2It is high, then it is assumed that space time correlation rule r1Dominate space time correlation rule r2
For example, if r1Spatial model be Haidian District, r2Spatial model be grid, then r1Spatial model ratio r2Greatly.If r2Time mode be daily 8 points, r2Time mode be daily daytime, then r1Time mode ratio r2Greatly;If r1Former piece take Being worth range is [2,15], r2Former piece value range be [5,10], r1Former piece value range ratio r2Greatly, then r1Former piece constraint Compare r2It is looser.If r1Consequent value range be [6,7], r2Value range be [2,20], r1Consequent value range ratio r2 Value range it is small, then r1Consequent about beam ratio r2It is more compact.
(6) it filters out from interesting space time correlation rule and is not advised by the space time correlation that other space time correlations rule is dominated Then.
The definition of domination can help to reduce the generation of redundancy rule, the abstract ability of finally obtained space time correlation rule It is stronger, conclusion is more acurrate, more accurate objective result can be obtained by wider range of early stage condition.
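The dominance test and the final filtering step can be sketched as below. The sketch assumes that spatial and time patterns are encoded as sets of the elementary ranges they cover, so that "larger" becomes set inclusion; this encoding, the `Rule` class, and the sample rules (modelled on the example above) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    s_pattern: frozenset   # elementary spatial ranges covered (assumed encoding)
    t_pattern: frozenset   # elementary time ranges covered (assumed encoding)
    antecedent: tuple      # ((feature, l, u), ...)
    consequent: tuple      # (feature, l, u)
    support: float
    confidence: float


def dominates(r1, r2):
    """True if r1 dominates r2 under conditions a) to f)."""
    ante2 = {a: (l, u) for a, l, u in r2.antecedent}
    # c) every antecedent interval of r1 contains the matching interval of r2.
    looser = all(a in ante2 and l <= ante2[a][0] and ante2[a][1] <= u
                 for a, l, u in r1.antecedent)
    a1, l1, u1 = r1.consequent
    a2, l2, u2 = r2.consequent
    # d) the consequent interval of r1 lies inside that of r2.
    tighter = a1 == a2 and l2 <= l1 and u1 <= u2
    return (r2.s_pattern <= r1.s_pattern and r2.t_pattern <= r1.t_pattern
            and looser and tighter
            and r1.support >= r2.support and r1.confidence >= r2.confidence)


def non_dominated(rules):
    """Keep the rules that no other rule dominates."""
    return [r for r in rules
            if not any(dominates(other, r) for other in rules if other != r)]


r1 = Rule(frozenset({"grid1", "grid2"}), frozenset(range(6, 18)),
          (("wind speed", 2, 15),), ("PM2.5", 6, 7), 0.3, 0.9)
r2 = Rule(frozenset({"grid1"}), frozenset({8}),
          (("wind speed", 5, 10),), ("PM2.5", 2, 20), 0.2, 0.8)
kept = non_dominated([r1, r2])
```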
Fig. 7 schematically illustrates a flowchart of a method for obtaining space-time association rules according to another embodiment of the present disclosure.
As shown in Fig. 7, after space-time data enter the system, the system automatically extracts the features of the space-time data and aggregates them at multiple space-time levels. The data of different domains and multiple space-time granularities are joined in space and time to generate transactions, ready to serve subsequent queries. The user specifies the factors of potential interest that may cause air pollution. Based on the features entered by the user, the system computes the minimum common space-time granularity of these features, queries the transactions of that granularity, and takes out the data related to the features the user is interested in. From the returned transactions, the system can mine quantitative conclusions about the causes of air pollution at each space-time granularity at or above the minimum common space-time granularity, and present to the user the conclusions that meet certain conditions.
An embodiment of the present disclosure further provides a device for obtaining space-time association rules.
Fig. 8 schematically illustrates a block diagram of the device for obtaining space-time association rules according to an embodiment of the present disclosure.
As shown in Fig. 8, the device 800 for obtaining space-time association rules includes an initial module 810, a layer module 820, a recording module 830, a feature module 840, and a rule module 850.
The initial module 810 is configured to obtain multiple features and the initial value of each feature at an initial space-time granularity.
The layer module 820 is configured to obtain, based on the initial values, the value of each feature at multiple high-level space-time granularities, where the time granularity of a high-level space-time granularity is coarser than the time granularity of the initial space-time granularity and/or the spatial granularity of the high-level space-time granularity is coarser than the spatial granularity of the initial space-time granularity.
The recording module 830 is configured to generate multiple space-time transactions based on the value of each feature at the multiple high-level space-time granularities.
The feature module 840 is configured to obtain the condition features and the target feature from among the multiple features, and to obtain transaction information about the condition features and the target feature from the multiple space-time transactions.
The rule module 850 is configured to create space-time association rules based on the transaction information of the condition features and the target feature.
Specifically, the initial module 810 may, for example, perform operation S210 described above, the layer module 820 may perform operation S220, the recording module 830 may perform operation S230, the feature module 840 may perform operation S240, and the rule module 850 may perform operation S250; the details are not repeated here.
According to an embodiment of the present disclosure, obtaining the initial value of a feature at the initial space-time granularity includes obtaining the value of the feature in multiple space-time ranges under the initial space-time granularity; obtaining the value of a feature at multiple high-level space-time granularities includes obtaining the value of the feature in multiple space-time ranges under each high-level space-time granularity; and generating a space-time transaction includes collecting the features in each space-time range, together with their values in that range, into one space-time transaction.
According to an embodiment of the present disclosure, obtaining transaction information about the condition features and the target feature from the multiple space-time transactions includes: determining the minimum common space-time granularity of the condition features and the target feature; and obtaining, from the multiple space-time transactions, the transaction information about the condition features and the target feature at the minimum common space-time granularity.
According to an embodiment of the present disclosure, creating space-time association rules based on the transaction information of the condition features and the target feature includes: when there is one target feature, discretizing the target feature based on a hierarchical clustering algorithm to obtain multiple discretized intervals of the target feature, and taking each discretized interval as a target item; when there are multiple condition features, discretizing the condition features simultaneously using a multidimensional tree structure to obtain the discretized intervals of each condition feature, and taking each discretized interval as a condition item; and constructing space-time association rules based on the condition items and the target items.
According to an embodiment of the present disclosure, constructing space-time association rules based on the condition items and the target items includes: obtaining multiple condition item sets from combinations of the condition items; under multiple space-time patterns, constructing space-time association rules about each condition item set and the target item; determining the support and confidence of each space-time association rule under each space-time pattern; and selecting the space-time association rules whose support and confidence meet preset conditions as the interesting space-time association rules.
According to an embodiment of the present disclosure, constructing space-time association rules based on the condition items and the target items further includes: determining the dominance relations among the interesting space-time association rules; and selecting, from the interesting space-time association rules, those not dominated by any other space-time association rule. The interesting space-time association rules include a first association rule and a second association rule. If the spatial pattern and time pattern of the first association rule are larger than those of the second, the antecedent constraints of the first association rule are looser than those of the second, the consequent constraint of the first association rule is tighter than that of the second, and the support and confidence of the first association rule are higher than those of the second, the first association rule is considered to dominate the second association rule.
According to an embodiment of the present disclosure, determining the support and confidence of each space-time association rule for each space-time pattern includes: forming the space-time association rules into a resilient distributed dataset; and broadcasting the multiple space-time transactions to the worker nodes of a distributed system, where the support and confidence of each space-time association rule are computed in parallel.
Any number of the modules, submodules, units, and subunits according to embodiments of the present disclosure, or at least part of the functions of any number of them, may be implemented in one module. Any one or more of the modules, submodules, units, and subunits according to embodiments of the present disclosure may be split into multiple modules. Any one or more of them may be implemented at least partly as a hardware circuit, such as a field-programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on a substrate, a system in a package, or an application-specific integrated circuit (ASIC); or may be implemented in hardware or firmware by any other reasonable way of integrating or packaging circuits; or may be implemented in any one of software, hardware, and firmware, or in an appropriate combination of them. Alternatively, one or more of the modules, submodules, units, and subunits according to embodiments of the present disclosure may be implemented at least partly as computer program modules which, when run, perform the corresponding functions.
For example, any number of the initial module 810, the layer module 820, the recording module 830, the feature module 840, and the rule module 850 may be combined and implemented in one module, or any one of them may be split into multiple modules. Alternatively, at least part of the functions of one or more of these modules may be combined with at least part of the functions of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the initial module 810, the layer module 820, the recording module 830, the feature module 840, and the rule module 850 may be implemented at least partly as a hardware circuit, such as a field-programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on a substrate, a system in a package, or an application-specific integrated circuit (ASIC); or may be implemented in hardware or firmware by any other reasonable way of integrating or packaging circuits; or may be implemented in any one of software, hardware, and firmware, or in an appropriate combination of them. Alternatively, at least one of these modules may be implemented at least partly as a computer program module which, when run, performs the corresponding function.
Fig. 9 schematically shows a block diagram of a computer system suitable for implementing the method described above, according to an embodiment of the present disclosure. The computer system shown in Fig. 9 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in Fig. 9, a computer system 900 according to an embodiment of the present disclosure includes a processor 901, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage section 908 into a random access memory (RAM) 903. The processor 901 may include, for example, a general-purpose microprocessor (such as a CPU), an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (for example, an application-specific integrated circuit (ASIC)). The processor 901 may also include on-board memory for caching purposes. The processor 901 may include a single processing unit or multiple processing units for performing the different actions of the method flow according to embodiments of the present disclosure.
The RAM 903 stores the various programs and data required for the operation of the system 900. The processor 901, the ROM 902, and the RAM 903 are connected to one another via a bus 904. The processor 901 performs the various operations of the method flow according to embodiments of the present disclosure by executing programs in the ROM 902 and/or the RAM 903. Note that the programs may also be stored in one or more memories other than the ROM 902 and the RAM 903. The processor 901 may also perform the various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to embodiments of the present disclosure, the system 900 may further include an input/output (I/O) interface 905, which is also connected to the bus 904. The system 900 may further include one or more of the following components connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a cathode-ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card or a modem. The communication section 909 performs communication processing via a network such as the Internet. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 910 as needed, so that a computer program read from it can be installed into the storage section 908 as needed.
According to embodiments of the present disclosure, the method flow may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 909 and/or installed from the removable medium 911. When the computer program is executed by the processor 901, the above-described functions defined in the system of the embodiments of the present disclosure are performed. According to embodiments of the present disclosure, the systems, devices, apparatuses, modules, units, and the like described above may be implemented by computer program modules.
The present disclosure further provides a computer-readable medium, which may be included in the device/apparatus/system described in the above embodiments, or may exist separately without being assembled into that device/apparatus/system. The computer-readable medium carries one or more programs which, when executed, implement the method for obtaining spatio-temporal association rules described above.
According to embodiments of the present disclosure, the computer-readable medium may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of a computer-readable storage medium include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus, or device. Also in the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, radio frequency signals, or any suitable combination of the above.
For example, according to embodiments of the present disclosure, the computer-readable medium may include the ROM 902 and/or the RAM 903 described above, and/or one or more memories other than the ROM 902 and the RAM 903.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and any combination of blocks in the block diagrams or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Those skilled in the art will understand that the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or integrated in many ways, even if such combinations or integrations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or integrated in many ways without departing from the spirit and teaching of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.
Embodiments of the present disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the individual embodiments cannot be used together to advantage. The scope of the present disclosure is defined by the appended claims and their equivalents. Those skilled in the art can make various substitutions and modifications without departing from the scope of the present disclosure, and all such substitutions and modifications shall fall within the scope of the present disclosure.

Claims (10)

1. A method for obtaining spatio-temporal association rules, implemented by an electronic device, comprising:
obtaining a plurality of features and an initial value of each feature at an initial spatio-temporal granularity;
obtaining, based on the initial values, values of each feature at a plurality of higher-level spatio-temporal granularities, wherein a spatio-temporal granularity comprises a time granularity and a spatial granularity, and wherein the time granularity of each higher-level spatio-temporal granularity is larger than the time granularity of the initial spatio-temporal granularity and/or the spatial granularity of each higher-level spatio-temporal granularity is larger than the spatial granularity of the initial spatio-temporal granularity;
generating a plurality of spatio-temporal transactions based on the values of each feature at the plurality of higher-level spatio-temporal granularities;
obtaining a condition feature and a target feature from among the plurality of features, and obtaining, from the plurality of spatio-temporal transactions, transaction information about the condition feature and the target feature; and
creating spatio-temporal association rules based on the transaction information of the condition feature and the target feature.
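
As an illustration of the roll-up step in claim 1, the following is a minimal sketch, not the patented implementation. It assumes that values are indexed by a time slot and a grid cell, that a coarser granularity merges a fixed number of slots or cells, and that values are aggregated by averaging; the function name `roll_up`, the key layout, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def roll_up(values, time_factor, space_factor):
    """Aggregate values from the initial granularity to a coarser one.

    values: {(feature, time_idx, x_idx, y_idx): value} at the initial granularity.
    time_factor / space_factor: how many initial time slots / grid cells (per
    axis) are merged into one coarser slot / cell. Averaging is an assumption.
    """
    buckets = defaultdict(list)
    for (feature, t, x, y), v in values.items():
        key = (feature, t // time_factor, x // space_factor, y // space_factor)
        buckets[key].append(v)
    return {key: mean(vs) for key, vs in buckets.items()}


# Hypothetical hourly readings on a fine grid.
initial = {
    ("pm25", 0, 0, 0): 10.0,
    ("pm25", 1, 0, 0): 20.0,
    ("pm25", 0, 1, 0): 30.0,
    ("pm25", 2, 0, 0): 40.0,
}

# Coarser in time only, in space only, and in both ("and/or" in the claim).
levels = {
    (2, 1): roll_up(initial, 2, 1),
    (1, 2): roll_up(initial, 1, 2),
    (2, 2): roll_up(initial, 2, 2),
}
```

Each entry of `levels` then holds the feature values at one higher-level granularity, which is the input the later steps work from.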
2. The method according to claim 1, wherein:
obtaining the initial value of a feature at the initial spatio-temporal granularity comprises: obtaining values of the feature in a plurality of spatio-temporal ranges at the initial spatio-temporal granularity;
obtaining the values of a feature at the plurality of higher-level spatio-temporal granularities comprises: obtaining values of the feature in a plurality of spatio-temporal ranges at each higher-level spatio-temporal granularity; and
generating the spatio-temporal transactions comprises: recording the features in each spatio-temporal range, together with the values of those features in that range, as one spatio-temporal transaction.
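
The transaction-generation step of claim 2 can be sketched as a simple grouping. This is an assumed data layout for illustration only: the granularity labels, the `(feature, time_idx, cell_idx)` keys, and the function name `build_transactions` do not come from the patent.

```python
from collections import defaultdict


def build_transactions(level_values):
    """Group feature values by spatio-temporal range.

    level_values: {granularity: {(feature, time_idx, cell_idx): value}}.
    Returns one transaction per (granularity, time_idx, cell_idx), holding
    every feature observed in that range together with its value.
    """
    transactions = defaultdict(dict)
    for granularity, values in level_values.items():
        for (feature, t, cell), v in values.items():
            transactions[(granularity, t, cell)][feature] = v
    return dict(transactions)


# Hypothetical values at a single higher-level granularity.
level_values = {
    ("day", "district"): {
        ("pm25", 0, "A"): 35.0,
        ("traffic", 0, "A"): 1200.0,
        ("pm25", 1, "A"): 50.0,
    },
}
txns = build_transactions(level_values)
```

Here the range (day 0, district A) yields one transaction containing both features, while (day 1, district A) yields a transaction with only the feature that was observed there.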
3. The method according to claim 1, wherein obtaining, from the plurality of spatio-temporal transactions, the transaction information about the condition feature and the target feature comprises:
determining a minimum common spatio-temporal granularity of the condition feature and the target feature; and
obtaining, from the plurality of spatio-temporal transactions, the transaction information about the condition feature and the target feature at the minimum common spatio-temporal granularity.
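
One possible reading of claim 3 is sketched below. The claim does not define how granularities are encoded or ordered, so the sketch assumes each granularity is a `(time_hours, space_meters)` pair and that "minimum" means the finest pair available to every feature, compared by time first and space second. Both helper names are hypothetical.

```python
def minimum_common_granularity(available, features):
    """Return the finest granularity shared by all the given features.

    available: {feature: set of (time_hours, space_meters)} (assumed encoding).
    "Finest" is read here as smallest time span, then smallest spatial extent.
    """
    common = set.intersection(*(available[f] for f in features))
    if not common:
        raise ValueError("features share no spatio-temporal granularity")
    return min(common)


def select_transactions(transactions, granularity, features):
    """Keep transactions at the given granularity, projected onto the features."""
    selected = []
    for (g, _t, _cell), items in transactions.items():
        if g == granularity and all(f in items for f in features):
            selected.append({f: items[f] for f in features})
    return selected


# Hypothetical availability: traffic exists hourly, pm25 only daily.
available = {
    "traffic": {(1, 1000), (24, 1000), (24, 5000)},
    "pm25": {(24, 1000), (24, 5000)},
}
transactions = {
    ((1, 1000), 0, "c1"): {"traffic": 50.0},
    ((24, 1000), 0, "c1"): {"traffic": 900.0, "pm25": 40.0},
    ((24, 5000), 0, "d1"): {"traffic": 4000.0, "pm25": 38.0},
}
g = minimum_common_granularity(available, ["traffic", "pm25"])
info = select_transactions(transactions, g, ["traffic", "pm25"])
```

Working at the finest shared granularity keeps as much detail as both features can support without comparing values recorded at mismatched scales.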
4. The method according to claim 1, wherein creating the spatio-temporal association rules based on the transaction information of the condition feature and the target feature comprises:
in a case where there is one target feature, discretizing the target feature based on a hierarchical clustering algorithm to obtain a plurality of discretization intervals of the target feature, and taking each discretization interval as one target item;
in a case where there are a plurality of condition features, discretizing the plurality of condition features simultaneously using a multidimensional tree structure to obtain discretization intervals of each condition feature, and taking each discretization interval as one condition item; and
constructing the spatio-temporal association rules based on the condition items and the target items.
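
The two discretization steps of claim 4 can be illustrated with small stand-ins. The claim does not fix the linkage criterion or the tree type, so this sketch assumes agglomerative clustering that merges the adjacent clusters with the closest means, and a k-d-tree-style partition that cuts at the median of alternating dimensions. The function names and sample values are hypothetical.

```python
def hierarchical_intervals(values, k):
    """Bottom-up (agglomerative) clustering of 1-D values into k intervals."""
    clusters = [[v] for v in sorted(values)]
    while len(clusters) > k:
        means = [sum(c) / len(c) for c in clusters]
        gaps = [means[i + 1] - means[i] for i in range(len(means) - 1)]
        i = gaps.index(min(gaps))  # closest pair of adjacent clusters
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return [(c[0], c[-1]) for c in clusters]


def kd_partition(points, max_depth, depth=0, bounds=None):
    """Split points with alternating median cuts; return the leaf boxes.

    Each box is a tuple of (low, high) intervals, one per condition feature,
    so all condition features are discretized at the same time.
    """
    dims = len(points[0])
    if bounds is None:
        bounds = [(min(p[d] for p in points), max(p[d] for p in points))
                  for d in range(dims)]
    if depth == max_depth or len(points) < 2:
        return [tuple(bounds)]
    d = depth % dims
    ordered = sorted(points, key=lambda p: p[d])
    mid = len(ordered) // 2
    cut = (ordered[mid - 1][d] + ordered[mid][d]) / 2
    left = [p for p in ordered if p[d] < cut]
    right = [p for p in ordered if p[d] >= cut]
    if not left:  # all points tie on this dimension; stop splitting
        return [tuple(bounds)]
    lo, hi = bounds[d]
    left_bounds = bounds[:d] + [(lo, cut)] + bounds[d + 1:]
    right_bounds = bounds[:d] + [(cut, hi)] + bounds[d + 1:]
    return (kd_partition(left, max_depth, depth + 1, left_bounds)
            + kd_partition(right, max_depth, depth + 1, right_bounds))


# Hypothetical target values and two-feature condition points.
target_items = hierarchical_intervals([1, 2, 3, 10, 11, 12, 30, 31], k=3)
boxes = kd_partition([(1, 10), (2, 20), (3, 30), (4, 40)], max_depth=2)
```

Each interval in `target_items` plays the role of a target item, and each axis interval of a box in `boxes` plays the role of a condition item.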
5. The method according to claim 4, wherein constructing the spatio-temporal association rules based on the condition items and the target items comprises:
obtaining a plurality of condition item sets from combinations of the condition items;
constructing, under a plurality of spatio-temporal patterns, spatio-temporal association rules relating each condition item set to the target items;
determining the support and the confidence of each spatio-temporal association rule under each spatio-temporal pattern; and
selecting the spatio-temporal association rules whose support and confidence satisfy a preset condition as interesting spatio-temporal association rules.
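
The rule-construction and filtering steps of claim 5 can be sketched with the standard definitions of support (fraction of transactions matching both sides) and confidence (fraction of antecedent matches that also match the consequent). The item encoding `(feature, low, high)`, the threshold values, and the modelling of a spatio-temporal pattern as a pre-selected list of transactions are assumptions made for illustration.

```python
from itertools import combinations


def matches(txn, items):
    """True if the transaction falls inside every (feature, low, high) item."""
    return all(f in txn and lo <= txn[f] < hi for f, lo, hi in items)


def mine_rules(txns, condition_items, target_items, min_support, min_confidence):
    """Enumerate condition item sets and keep rules passing both thresholds."""
    rules = []
    n = len(txns)
    for size in range(1, len(condition_items) + 1):
        for itemset in combinations(condition_items, size):
            # Skip item sets that put two intervals on the same feature.
            if len({f for f, _, _ in itemset}) < size:
                continue
            covered = [t for t in txns if matches(t, itemset)]
            if not covered:
                continue
            for target in target_items:
                both = sum(matches(t, [target]) for t in covered)
                support = both / n
                confidence = both / len(covered)
                if support >= min_support and confidence >= min_confidence:
                    rules.append((itemset, target, support, confidence))
    return rules


# Hypothetical transactions grouped by spatio-temporal pattern.
txns_by_pattern = {
    ("weekday", "downtown"): [
        {"traffic": 900, "pm25": 80},
        {"traffic": 950, "pm25": 90},
        {"traffic": 300, "pm25": 30},
        {"traffic": 920, "pm25": 40},
    ],
}
condition_items = [("traffic", 800, 1000), ("traffic", 0, 800)]
target_items = [("pm25", 75, 500), ("pm25", 0, 75)]
interesting = {
    pattern: mine_rules(txns, condition_items, target_items, 0.5, 0.6)
    for pattern, txns in txns_by_pattern.items()
}
```

Exhaustive enumeration grows exponentially with the number of condition items, so this form is only practical for small item counts.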
6. The method according to claim 5, wherein constructing the spatio-temporal association rules based on the condition items and the target items further comprises:
determining dominance relations among the interesting spatio-temporal association rules; and
selecting, from the interesting spatio-temporal association rules, the spatio-temporal association rules that are not dominated by any other spatio-temporal association rule;
wherein the interesting spatio-temporal association rules include a first association rule and a second association rule, and the first association rule is considered to dominate the second association rule if the spatial pattern and the temporal pattern of the first association rule are larger than those of the second association rule, the antecedent constraint of the first association rule is looser than that of the second association rule, the consequent constraint of the first association rule is tighter than that of the second association rule, and the support and the confidence of the first association rule are higher than those of the second association rule.
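
The dominance filter of claim 6 amounts to keeping the non-dominated rules. In the sketch below, every quantity in the dominance definition is reduced to a single number, which is an assumption: pattern size as an extent, antecedent looseness and consequent tightness as interval widths. The comparisons are strict, following the literal wording of the claim; the `Rule` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    space_extent: float       # size of the spatial pattern
    time_extent: float        # size of the temporal pattern
    antecedent_width: float   # larger means a looser antecedent constraint
    consequent_width: float   # smaller means a tighter consequent constraint
    support: float
    confidence: float


def dominates(a, b):
    """True if rule a dominates rule b under the claim 6 conditions."""
    return (a.space_extent > b.space_extent
            and a.time_extent > b.time_extent
            and a.antecedent_width > b.antecedent_width
            and a.consequent_width < b.consequent_width
            and a.support > b.support
            and a.confidence > b.confidence)


def non_dominated(rules):
    """Keep the rules that no other rule dominates."""
    return [r for r in rules
            if not any(dominates(o, r) for o in rules if o is not r)]


# Hypothetical interesting rules.
r1 = Rule("r1", 5000, 24, 200, 50, 0.4, 0.8)
r2 = Rule("r2", 1000, 1, 100, 80, 0.3, 0.7)
r3 = Rule("r3", 1000, 24, 300, 40, 0.2, 0.9)
kept = non_dominated([r1, r2, r3])
```

Here `r1` beats `r2` on every criterion, so `r2` is dropped, while `r3` survives because `r1` does not cover a larger temporal pattern.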
7. The method according to claim 5, wherein determining the support and the confidence of each spatio-temporal association rule for each spatio-temporal pattern comprises:
forming the spatio-temporal association rules into a resilient distributed dataset; and
broadcasting the plurality of spatio-temporal transactions to the slave nodes of a distributed system, and computing the support and the confidence of each spatio-temporal association rule in a distributed, parallel manner.
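
The computation pattern of claim 7 (rules partitioned across workers, transactions shared with every worker) can be mimicked on a single machine. The sketch below is only a stand-in: it uses a thread pool from the Python standard library instead of a distributed engine, the shared read-only tuple plays the role of the broadcast transactions, and the mapped rule list plays the role of the distributed dataset. All names and data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def score_rule(rule, txns):
    """Compute (support, confidence) of one rule over all transactions."""
    antecedent, target = rule

    def holds(txn, item):
        f, lo, hi = item
        return f in txn and lo <= txn[f] < hi

    covered = [t for t in txns if all(holds(t, i) for i in antecedent)]
    both = sum(holds(t, target) for t in covered)
    support = both / len(txns)
    confidence = both / len(covered) if covered else 0.0
    return rule, support, confidence


def score_rules_parallel(rules, txns, workers=4):
    """Score rules in parallel; every worker reads the same transaction set."""
    shared_txns = tuple(txns)  # stand-in for a broadcast variable
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the input order of the rules.
        return list(pool.map(lambda r: score_rule(r, shared_txns), rules))


# Hypothetical transactions and candidate rules.
txns = [{"traffic": 900, "pm25": 80}, {"traffic": 300, "pm25": 30}]
rules = [
    ((("traffic", 800, 1000),), ("pm25", 75, 500)),
    ((("traffic", 0, 800),), ("pm25", 75, 500)),
]
scored = score_rules_parallel(rules, txns)
```

Distributing the rules rather than the transactions suits the case where the candidate rules far outnumber the transactions, since each rule can then be scored independently against one shared copy of the data.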
8. A device for obtaining spatio-temporal association rules, comprising:
an initial module, configured to obtain a plurality of features and an initial value of each feature at an initial spatio-temporal granularity;
a layer module, configured to obtain, based on the initial values, values of each feature at a plurality of higher-level spatio-temporal granularities, wherein the time granularity of each higher-level spatio-temporal granularity is coarser than the time granularity of the initial spatio-temporal granularity and/or the spatial granularity of each higher-level spatio-temporal granularity is coarser than the spatial granularity of the initial spatio-temporal granularity;
a logging module, configured to generate a plurality of spatio-temporal transactions based on the values of each feature at the plurality of higher-level spatio-temporal granularities;
a characteristic module, configured to obtain a condition feature and a target feature from among the plurality of features, and to obtain, from the plurality of spatio-temporal transactions, transaction information about the condition feature and the target feature; and
a rule module, configured to create spatio-temporal association rules based on the transaction information of the condition feature and the target feature.
9. An electronic device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing executable instructions which, when executed by a processor, cause the processor to implement the method according to any one of claims 1 to 7.
CN201910627281.6A 2019-07-11 2019-07-11 Rule mining method and device, electronic equipment and computer-readable storage medium Active CN110334133B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910627281.6A CN110334133B (en) 2019-07-11 2019-07-11 Rule mining method and device, electronic equipment and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910627281.6A CN110334133B (en) 2019-07-11 2019-07-11 Rule mining method and device, electronic equipment and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN110334133A true CN110334133A (en) 2019-10-15
CN110334133B CN110334133B (en) 2020-11-20

Family

ID=68146578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910627281.6A Active CN110334133B (en) 2019-07-11 2019-07-11 Rule mining method and device, electronic equipment and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN110334133B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027644A (en) * 2019-12-26 2020-04-17 湖南大学 Travel mode classification method and device, computer equipment and storage medium
CN111274282A (en) * 2020-01-07 2020-06-12 北京科技大学 Air quality mining system and method and data acquisition monitoring device
CN111695000A (en) * 2020-06-16 2020-09-22 山东蓝海领航大数据发展有限公司 Multi-source big data loading method and system
WO2021121349A1 (en) * 2019-12-19 2021-06-24 Beijing Didi Infinity Technology And Development Co., Ltd. System and method for testing multiple variants
CN113077089A (en) * 2021-04-08 2021-07-06 中山大学 Method and device for evaluating influence of multiple factors on air quality

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100332430A1 (en) * 2009-06-30 2010-12-30 Dow Agrosciences Llc Application of machine learning methods for mining association rules in plant and animal data sets containing molecular genetic markers, followed by classification or prediction utilizing features created from these association rules
CN103700005A (en) * 2013-12-17 2014-04-02 南京信息工程大学 Association-rule recommending method based on self-adaptive multiple minimum supports
US20160092514A1 (en) * 2014-09-29 2016-03-31 International Business Machines Corporation Mining association rules in the map-reduce framework
CN106779926A (en) * 2016-12-02 2017-05-31 乐视控股(北京)有限公司 Correlation rule generation method, device and terminal
CN108460121A (en) * 2018-01-22 2018-08-28 重庆邮电大学 Space-time data small documents merging method in smart city

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021121349A1 (en) * 2019-12-19 2021-06-24 Beijing Didi Infinity Technology And Development Co., Ltd. System and method for testing multiple variants
US11715123B2 (en) 2019-12-19 2023-08-01 Beijing Didi Infinity Technology And Development Co., Ltd. System and method for testing multiple variants
CN111027644A (en) * 2019-12-26 2020-04-17 湖南大学 Travel mode classification method and device, computer equipment and storage medium
CN111027644B (en) * 2019-12-26 2023-12-26 湖南大学 Travel mode classification method, device, computer equipment and storage medium
CN111274282A (en) * 2020-01-07 2020-06-12 北京科技大学 Air quality mining system and method and data acquisition monitoring device
CN111274282B (en) * 2020-01-07 2023-06-23 北京科技大学 Air quality mining system, method and data acquisition monitoring device
CN111695000A (en) * 2020-06-16 2020-09-22 山东蓝海领航大数据发展有限公司 Multi-source big data loading method and system
CN113077089A (en) * 2021-04-08 2021-07-06 中山大学 Method and device for evaluating influence of multiple factors on air quality

Also Published As

Publication number Publication date
CN110334133B (en) 2020-11-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200728

Address after: Room 806, 8 / F, Zhongguancun International Innovation Building, Haidian District, Beijing 100080

Applicant after: Beijing Jingdong intelligent city big data research institute

Address before: 100086 No.76 Zhichun Road, Haidian District, Beijing, Building No.1, Building No.9, Floor 1-7-5

Applicant before: Jingdong City (Beijing) Digital Technology Co.,Ltd.

GR01 Patent grant