CN109033204A - Hierarchical integral histogram visual query method based on the World Wide Web - Google Patents

Hierarchical integral histogram visual query method based on the World Wide Web

Info

Publication number
CN109033204A
Authority
CN
China
Prior art keywords
data
histogram
integral histogram
tree
statistical feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810698579.1A
Other languages
Chinese (zh)
Other versions
CN109033204B (en)
Inventor
陈为
梅鸿辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201810698579.1A
Publication of CN109033204A
Application granted
Publication of CN109033204B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a visual query method based on hierarchical integral histograms, comprising the following steps. Step 1: configure the raw data set, including the number of discretization intervals, the data filtering conditions, and the dimensions on which aggregate statistics are to be computed. Step 2: build and store a hierarchical partition tree by offline preprocessing, where the partition tree divides the data into multiple data subsets and the statistical features of each data subset are represented by an integral histogram. Step 3: uniformly discretize the visualization space into small regions and submit the coordinates of each small region to the hierarchical partition tree of step 2 as a range query; a range query finds the data subsets that intersect the target region and estimates the statistical features of the target region from the integral histograms of those intersections; once every small region has been queried, a matrix of statistical features is obtained. Step 4: bind visual elements to the matrix of statistical features and carry out the visualization request.

Description

Hierarchical integral histogram visual query method based on the World Wide Web
Technical field
The present invention relates to the field of fast visual query, and in particular to a fast query method based on hierarchical integral histograms.
Background art
In visual analysis of large-scale structured data, people need to understand and study the distribution of the data through its statistical features, summarize regularities from the characteristics of that distribution, and make decisions. The most common aggregation operations (operations that compute a single value from a set of values) are usually presented visually as histograms, discretized scatter plots and the like. When the data volume is large enough, computing statistical features by directly traversing the data items cannot meet the real-time requirements of interactive visual exploration. How to quickly query the data within a specified range in large-scale structured data, for example for real-time management and scheduling of traffic resources or real-time monitoring of financial transactions, has become a hot topic in fields such as the Internet, transportation, aerospace and business.
Real-world large-scale structured data has high dimensionality and many data items, its modalities and formats are diverse, and its distribution is distinctive. Visual queries executed on such large and complex data sets may fail to respond in time or may take far too long. Many existing methods optimize queries at the database level; to obtain exact results, they need to be set up on the data set and also configured with an external representation that helps users understand it. In addition, some work targets approximate results and adopts a series of approximate query strategies (an approximate query is one that queries the data with an approximate strategy in order to reduce the query response time), such as techniques based on sampling algorithms, on histogram representations, and on wavelet transforms.
Some of the above approximation techniques use a fixed precomputation scheme, are limited to particular statistical features, and cannot be applied to many types of data, such as dynamic data and streaming data; others are limited to low-dimensional cases, because computation on high-dimensional data sets requires too much memory.
Summary of the invention
The present invention provides a visual query method based on hierarchical integral histograms that reduces query time to within 500 milliseconds, reaching interactive speed while greatly reducing storage requirements.
A visual query method based on hierarchical integral histograms, comprising the following steps:
Step 1: configure the raw data set, including the number of discretization intervals, the data filtering conditions, and the dimensions on which aggregate statistics are to be computed;
Step 2: on the data processed according to the configuration in step 1, build and store a hierarchical partition tree by offline preprocessing, where the partition tree divides the data into multiple data subsets and the statistical features of each data subset are represented by an integral histogram;
Step 3: according to the configuration in step 1, uniformly discretize the visualization space into small regions; for each small region, submit its coordinates to the hierarchical partition tree of step 2 as a range query, where a range query finds the data subsets that intersect the target region and estimates the statistical features of the target region from the integral histograms of those intersections; once every small region has been queried, a matrix of statistical features is obtained;
Step 4: bind visual elements to the matrix of statistical features from step 3 and carry out the visualization request.
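The discretization in step 3 can be illustrated with a minimal Python sketch. It assumes a two-dimensional visualization space and a caller-supplied `query` function that returns one statistical feature for a rectangle; the names `feature_matrix` and `query`, and the half-open cell convention, are hypothetical and not taken from the patent.

```python
def feature_matrix(x_range, y_range, n_cols, n_rows, query):
    """Split the view into n_rows x n_cols equal cells and query each one.

    query(lower, upper) is a stand-in for the range query against the
    hierarchical partition tree; it receives the two opposite corners of
    a cell and returns one statistical feature for that cell.
    """
    x0, x1 = x_range
    y0, y1 = y_range
    dx = (x1 - x0) / n_cols
    dy = (y1 - y0) / n_rows
    matrix = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            lower = (x0 + c * dx, y0 + r * dy)
            upper = (x0 + (c + 1) * dx, y0 + (r + 1) * dy)
            row.append(query(lower, upper))
        matrix.append(row)
    return matrix
```

In this sketch the cost of filling the matrix is the number of cells times the cost of one range query, which is why the method relies on each range query being answered from precomputed structures.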
The method shifts the time cost to the preprocessing stage and computes query results approximately within an allowable error range. Compared with conventional methods, this query method significantly reduces storage cost, and its time complexity is independent of the number of data points, so efficient online visual queries can be performed.
Based on the user's configuration parameters, the invention preprocesses the raw data set and the target visualization space, and partitions the data set hierarchically with a hierarchical partitioning algorithm, so that regions with different distributions are represented at different precisions and scales. For each subregion, an integral histogram approximates the statistical features of that region. During a visual query, the system uses the hierarchical partition tree to quickly and efficiently traverse and find the set of target regions, merges them and returns the approximate value, thereby obtaining the approximate statistical features of the target region.
Compared with existing methods, this method shifts the time cost to the data preprocessing stage: data sets that may need visual queries are preprocessed offline in advance to obtain an efficient approximate representation, which can then serve subsequent online visual queries. The method follows the idea of approximating first and refining progressively, and only needs to store the integral histograms computed from the data. Many other visual query methods need to store the raw data, incur larger time and space costs, and capture the data distribution less well, so this method is more widely applicable.
To improve the applicability and intelligence of the invention, preferably, step 4 further includes interactively adjusting the visualization parameters and obtaining immediate visual feedback during the visualization process.
To further improve computational efficiency, preferably, in step 2 the raw data set is a high-dimensional data set V with n dimensions D = {D1, …, Dn}, and the domains of the dimensions are expressed as {[a1, b1], …, [an, bn]}.
To further improve computational efficiency, preferably, in step 2 the hierarchical partition tree divides the data into multiple data subsets as follows: the entire data space is partitioned recursively to generate a layered tree structure, and the data space is reconstructed as V′ = {v′1, …, v′i, …, v′p}, where each v′i ∈ V′ corresponds to a leaf node of the tree.
To further improve computational efficiency, preferably, in step 2 the integral histogram is an extension of the summed-area table. A summed-area table is a two-dimensional table in which the value of each cell equals the sum of all values to its upper left, so that the sum over any rectangular block of cells, including a single cell, can be obtained by adding and subtracting four table values.
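The four-value lookup of a summed-area table can be shown with a short Python sketch. It uses plain nested lists and inclusive cell indices; the function names `summed_area_table` and `rect_sum` are illustrative choices, not names from the patent.

```python
def summed_area_table(grid):
    """Return sat where sat[r][c] is the sum of grid[0..r][0..c]."""
    rows, cols = len(grid), len(grid[0])
    sat = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += grid[r][c]
            # Sum of this row up to column c, plus everything above it.
            sat[r][c] = row_sum + (sat[r - 1][c] if r > 0 else 0)
    return sat


def rect_sum(sat, r0, c0, r1, c1):
    """Sum of grid[r0..r1][c0..c1] (inclusive) from at most four lookups."""
    total = sat[r1][c1]
    if r0 > 0:
        total -= sat[r0 - 1][c1]
    if c0 > 0:
        total -= sat[r1][c0 - 1]
    if r0 > 0 and c0 > 0:
        total += sat[r0 - 1][c0 - 1]  # added back: it was subtracted twice
    return total
```

Building the table takes one pass over the grid; after that, every rectangle sum costs the same small number of lookups regardless of the rectangle's size.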
To further improve computational efficiency, preferably, in step 2 the integral histogram is computed as follows: for a d-dimensional data set binned by an N1 × … × Nd grid and summarized by histograms with b bins, the integral histogram of a leaf node is defined as:
where x1, …, xd are the bin indices in the d dimensions, b is the index of a bin in the histogram, and h(x1, …, xd) denotes the histogram of the values in each grid cell;
The integral histogram of any rectangular region in the data space can be computed as follows:
where x^p is a corner point of the rectangular region, and p ∈ {0,1}^d.
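The two formulas referred to above do not appear in this text. As a hedged reconstruction, assuming the integral histogram follows the standard d-dimensional generalization of the summed-area table, they may be written as below. The bracket notation [b] for the b-th histogram bin, the summation indices i_1, …, i_d, and the symbol H(R, b) for the histogram of a rectangular region R are notational assumptions, not taken from the source. Here x^p is taken to be the corner whose i-th coordinate is the upper bin index of R when p_i = 1 and the index just below the lower bin index of R when p_i = 0, and H is taken to be zero for corners outside the grid.

```latex
% Integral histogram of a leaf node (assumed standard form)
H(x_1,\dots,x_d,\,b) \;=\; \sum_{i_1 \le x_1} \cdots \sum_{i_d \le x_d} h(i_1,\dots,i_d)[b]

% Histogram of a rectangular region R by inclusion-exclusion over its 2^d corners
H(R,\,b) \;=\; \sum_{p \in \{0,1\}^d} (-1)^{\,d-\lVert p \rVert_1}\; H(x^{p},\,b)
```

For d = 2 the second expression reduces to the familiar four-term add and subtract of the summed-area table, applied to each histogram bin independently.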
Beneficial effects of the present invention:
The visual query method based on hierarchical integral histograms represents regions with different distributions at different precisions and scales and obtains approximate statistical features of the target region. The time cost is shifted to the data preprocessing stage: data sets that may need visual queries are preprocessed offline in advance to obtain an efficient approximate representation. Following the idea of approximating first and refining progressively, the method only needs to store the integral histograms computed from the data, which reduces time and space costs while capturing the data distribution well, so it is widely applicable.
Brief description of the drawings
Fig. 1 is a flow diagram of the visual query method based on hierarchical integral histograms of the present invention.
Fig. 2 is a schematic diagram of the result after a POI data set on a map is divided into multiple subsets by the hierarchical partition tree.
Fig. 3 is a schematic diagram of the enlarged result for a dense region of Fig. 2.
Detailed description of the embodiments
As shown in Fig. 1, the visual query method based on hierarchical integral histograms of this embodiment comprises the following steps:
Step 1: Given a high-dimensional data set V with n dimensions D = {D1, …, Dn}, whose dimension domains are expressed as {[a1, b1], …, [an, bn]}, the user specifies the dimensions of the data set to be used for binning, filtering and aggregation, as shown in box a of Fig. 1. Here the term binning refers to a user-defined scale for aggregating the data space.
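The configuration in step 1 can be pictured with a small Python sketch. The patent only names the three kinds of setting (number of bins, filter conditions, aggregated dimension); the class `QueryConfig`, its field names, and the equal-width binning rule below are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class QueryConfig:
    # All field names are hypothetical.
    bins: dict       # dimension name -> number of discretization intervals
    domains: dict    # dimension name -> (a_i, b_i), the domain of that dimension
    filters: list = field(default_factory=list)  # predicates applied to each record
    aggregate_dim: str = ""                      # dimension to be summarized


def bin_index(value, domain, n_bins):
    """Equal-width bin index of value inside the closed domain [lo, hi]."""
    lo, hi = domain
    if not lo <= value <= hi:
        raise ValueError("value outside the configured domain")
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the upper bound falls into the last bin


def preprocess(records, cfg):
    """Drop records rejected by any filter, then bin the configured dimensions."""
    kept = [r for r in records if all(f(r) for f in cfg.filters)]
    return [
        {dim: bin_index(r[dim], cfg.domains[dim], n) for dim, n in cfg.bins.items()}
        for r in kept
    ]
```

Equal-width bins are only one possible scale; the source describes binning as user-defined, so other schemes would fit the same interface.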
Step 2: On the data processed according to the configuration in step 1, the system first applies the space partitioning algorithm of the R-tree; this embodiment uses the R*-tree, a variant of the R-tree. The entire data space is partitioned recursively to generate a layered tree structure, as shown in box b of Fig. 1. As Figs. 2 and 3 show, dense regions are divided into more subspaces (see Fig. 3), and the partitioning result is good. The data space is reconstructed as V′ = {v′1, …, v′p}, where each v′i ∈ V′ corresponds to a leaf node of the R-tree. Integral histograms are then computed on all leaf nodes, as shown in box c of Fig. 1. The integral histogram is an extension of the summed-area table, a two-dimensional table in which the value of each cell equals the sum of all values to its upper left, so that the sum over any rectangular block of cells, including a single cell, can be obtained by adding and subtracting four table values.
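The recursive partitioning can be sketched in Python as follows. This is not an R*-tree: for brevity the sketch substitutes a simple median split along the widest axis, which shares only the property that dense regions end up with more leaves. The names `build_partition` and `leaves`, the dictionary layout, and the `max_leaf_size` parameter are all hypothetical.

```python
def build_partition(points, max_leaf_size=4):
    """Recursively split points into a tree of bounding boxes.

    Stand-in for the R*-tree of the embodiment: splits at the median of
    the widest axis until a node holds at most max_leaf_size points.
    """
    dims = len(points[0])
    lo = tuple(min(p[i] for p in points) for i in range(dims))
    hi = tuple(max(p[i] for p in points) for i in range(dims))
    node = {"bbox": (lo, hi), "children": [], "points": None}
    if len(points) <= max_leaf_size or lo == hi:
        node["points"] = points  # leaf node: one data subset v'_i
        return node
    axis = max(range(dims), key=lambda i: hi[i] - lo[i])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    node["children"] = [
        build_partition(ordered[:mid], max_leaf_size),
        build_partition(ordered[mid:], max_leaf_size),
    ]
    return node


def leaves(node):
    """Collect the leaf nodes, i.e. the data subsets of the partition."""
    if node["points"] is not None:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]
```

In a full implementation each leaf would additionally carry the integral histogram of its subset, so that raw points need not be kept for querying.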
Unlike the original summed-area table, which stores a single scalar value in each cell, the integral histogram summarizes the distribution of the data points that fall in each cell: the histogram of all data points within each cell of a leaf node is computed, and query results are returned in a manner similar to computing the value of a rectangular region with a summed-area table.
For a d-dimensional data set binned by an N1 × … × Nd grid and summarized by histograms with b bins, the integral histogram of a leaf node is defined as:
where x1, …, xd are the bin indices in the d dimensions, b is the index of a bin in the histogram, and h(x1, …, xd) denotes the histogram of the values in each grid cell. The integral histogram of any rectangular region in the data space can then be computed as follows:
where x^p is a corner point of the rectangular region, and p ∈ {0,1}^d.
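A minimal Python sketch of an integral histogram and its corner-based region lookup is given below. It assumes the standard inclusion-exclusion form over the 2^d corners of the region, with inclusive bin indices; the dictionary layout and the names `build_integral_histogram` and `region_histogram` are illustrative assumptions rather than the patent's own definitions.

```python
from itertools import product


def build_integral_histogram(cell_hists, shape, n_bins):
    """cell_hists maps a grid index tuple to a list of n_bins counts.

    Returns H where H[idx] is the bin-wise sum of the histograms of all
    cells whose index is <= idx in every dimension.
    """
    H = {
        idx: list(cell_hists.get(idx, [0] * n_bins))
        for idx in product(*map(range, shape))
    }
    for axis in range(len(shape)):  # prefix-sum along one axis at a time
        for idx in product(*map(range, shape)):
            if idx[axis] > 0:
                prev = idx[:axis] + (idx[axis] - 1,) + idx[axis + 1:]
                H[idx] = [a + b for a, b in zip(H[idx], H[prev])]
    return H


def region_histogram(H, lower, upper, n_bins):
    """Histogram of the cells with lower[i] <= index[i] <= upper[i]."""
    d = len(lower)
    result = [0] * n_bins
    for p in product((0, 1), repeat=d):
        corner = tuple(upper[i] if p[i] else lower[i] - 1 for i in range(d))
        if min(corner) < 0:
            continue  # corner lies outside the grid and contributes zero
        sign = (-1) ** (d - sum(p))
        for b in range(n_bins):
            result[b] += sign * H[corner][b]
    return result
```

The lookup touches 2^d table entries per query regardless of how many data points the region contains, which is consistent with the constant-time claim for a fixed number of dimensions.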
Step 3: The user defines a query range R and an aggregate function A, which together form an aggregate query, denoted Q(R, A); the query aggregates the data points located within the range R.
Once the range of each query region is obtained, the value of each bin region can be queried in constant time through the integral histograms of step 2, as shown in box e of Fig. 1, and a result histogram is returned, from which the approximate aggregation result can be estimated.
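The query Q(R, A) can be sketched in Python as a merge over the leaf regions that overlap R, followed by an aggregate estimated from the merged histogram. The leaf layout (a `box` of inclusive bin indices plus a `lookup` callable standing in for the per-leaf integral-histogram lookup), the linear scan over leaves instead of a tree traversal, and the use of bin centers for the mean are all simplifying assumptions of this sketch.

```python
def intersect(box_a, box_b):
    """Overlap of two inclusive index boxes ((lo...), (hi...)), or None."""
    lo = tuple(max(a, b) for a, b in zip(box_a[0], box_b[0]))
    hi = tuple(min(a, b) for a, b in zip(box_a[1], box_b[1]))
    return (lo, hi) if all(l <= h for l, h in zip(lo, hi)) else None


def range_query(leaves, query_box, n_bins):
    """Merge the histograms of every leaf region that overlaps query_box."""
    result = [0] * n_bins
    for leaf in leaves:
        overlap = intersect(leaf["box"], query_box)
        if overlap is None:
            continue
        # leaf["lookup"] is a hypothetical callable that returns the
        # histogram of the overlapping part from the leaf's integral histogram.
        part = leaf["lookup"](overlap)
        result = [r + p for r, p in zip(result, part)]
    return result


def approx_mean(hist, bin_edges):
    """Approximate mean of the aggregated dimension from a result histogram."""
    total = sum(hist)
    if total == 0:
        return None
    centers = [(bin_edges[i] + bin_edges[i + 1]) / 2 for i in range(len(hist))]
    return sum(c * n for c, n in zip(centers, hist)) / total
```

Other aggregate functions A, such as count or an approximate quantile, could be estimated from the same result histogram; the mean is shown only as one example.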
Step 4: After the approximate aggregation result is obtained, the user can issue visualization requests on it, as shown in box d of Fig. 1. Because the precomputed hierarchical integral histograms are stored in memory, both the visualization and the construction of aggregate queries are executed online; the user can interactively adjust the visualization parameters and obtain immediate visual feedback during the visualization process.
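The source does not detail how visual elements are bound to the statistical matrix. As one possible illustration, the Python sketch below maps each matrix value linearly to a gray level, as might be done for a heat map; the function name `to_gray_levels` and the linear scale are assumptions.

```python
def to_gray_levels(matrix, levels=256):
    """Linearly rescale matrix values to integers in [0, levels - 1]."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant matrix carries no contrast; map everything to zero.
        return [[0] * len(row) for row in matrix]
    return [
        [round((v - lo) / span * (levels - 1)) for v in row]
        for row in matrix
    ]
```

Adjusting a visualization parameter such as the number of levels only re-runs this binding step, not the preprocessing, which is what makes interactive feedback feasible.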

Claims (6)

1. A visual query method based on hierarchical integral histograms, characterized by comprising the following steps:
Step 1: configure the raw data set, including the number of discretization intervals, the data filtering conditions, and the dimensions on which aggregate statistics are computed;
Step 2: on the data processed according to the configuration in step 1, build and store a hierarchical partition tree by offline preprocessing, where the partition tree divides the data into multiple data subsets and the statistical features of each data subset are represented by an integral histogram;
Step 3: according to the configuration in step 1, uniformly discretize the visualization space into small regions; for each small region, submit its coordinates to the hierarchical partition tree of step 2 as a range query, where a range query finds the data subsets that intersect the target region and estimates the statistical features of the target region from the integral histograms of those intersections; once every small region has been queried, a matrix of statistical features is obtained;
Step 4: bind visual elements to the matrix of statistical features from step 3 and carry out the visualization request.
2. The visual query method based on hierarchical integral histograms according to claim 1, characterized in that step 4 further includes interactively adjusting the visualization parameters and obtaining immediate visual feedback during the visualization process.
3. The visual query method based on hierarchical integral histograms according to claim 1, characterized in that in step 2 the raw data set is a high-dimensional data set V with n dimensions D = {D1, …, Dn}, and the domains of the dimensions are expressed as {[a1, b1], …, [an, bn]}.
4. The visual query method based on hierarchical integral histograms according to claim 1, characterized in that in step 2 the hierarchical partition tree divides the data into multiple data subsets as follows: the entire data space is partitioned recursively to generate a layered tree structure, and the data space is reconstructed as V′ = {v′1, …, v′i, …, v′p}, where each v′i ∈ V′ corresponds to a leaf node of the tree.
5. The visual query method based on hierarchical integral histograms according to claim 4, characterized in that in step 2 the integral histogram is an extension of the summed-area table, in which the value of each cell equals the sum of all values to its upper left, so that the sum over any rectangular block of cells, including a single cell, can be obtained by adding and subtracting four table values.
6. The visual query method based on hierarchical integral histograms according to claim 5, characterized in that in step 2 the integral histogram is computed as follows: for a d-dimensional data set binned by an N1 × … × Nd grid and summarized by histograms with b bins, the integral histogram of a leaf node is defined as:
where x1, …, xd are the bin indices in the d dimensions, b is the index of a bin in the histogram, and h(x1, …, xd) denotes the histogram of the values in each grid cell;
The integral histogram of any rectangular region in the data space can be computed as follows:
where x^p is a corner point of the rectangular region, and p ∈ {0,1}^d.
CN201810698579.1A 2018-06-29 2018-06-29 Hierarchical integral histogram visual query method based on world wide web Active CN109033204B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810698579.1A CN109033204B (en) 2018-06-29 2018-06-29 Hierarchical integral histogram visual query method based on world wide web

Publications (2)

Publication Number Publication Date
CN109033204A (en) 2018-12-18
CN109033204B (en) 2021-10-08

Family

ID=65522033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810698579.1A Active CN109033204B (en) 2018-06-29 2018-06-29 Hierarchical integral histogram visual query method based on world wide web

Country Status (1)

Country Link
CN (1) CN109033204B (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101038672A (en) * 2007-04-30 2007-09-19 北京中星微电子有限公司 Image tracking method and system thereof
CN101308544A (en) * 2008-07-11 2008-11-19 中国科学院地理科学与资源研究所 Spatial heterogeneity mode recognition method and layering method based on grids
CN101329767A (en) * 2008-07-11 2008-12-24 西安交通大学 Method for automatically detecting obvious object sequence in video based on learning
CN102324102A (en) * 2011-10-08 2012-01-18 北京航空航天大学 Method for automatically filling structure information and texture information of hole area of image scene
CN102855317A (en) * 2012-08-31 2013-01-02 王晖 Multimode indexing method and system based on demonstration video
WO2014149115A2 (en) * 2013-02-25 2014-09-25 Raytheon Company Reduction of cfar false alarms via classification and segmentation of sar image clutter
CN103793467A (en) * 2013-09-10 2014-05-14 浙江鸿程计算机系统有限公司 Method for optimizing real-time query on big data on basis of hyper-graphs and dynamic programming
CN104167013A (en) * 2014-08-04 2014-11-26 清华大学 Volume rendering method for highlighting target area in volume data
CN104361357A (en) * 2014-11-07 2015-02-18 北京途迹科技有限公司 Photo set classification system and method based on picture content analysis
CN106780544A (en) * 2015-11-18 2017-05-31 深圳中兴力维技术有限公司 The method and apparatus that display foreground is extracted
CN107507222A (en) * 2016-06-13 2017-12-22 浙江工业大学 A kind of anti-particle filter method for tracking target based on integration histogram blocked
CN106127808A (en) * 2016-06-20 2016-11-16 浙江工业大学 A kind of block particle filter method for tracking target based on color and the anti-of local binary patterns Feature Fusion
CN106815320A (en) * 2016-12-27 2017-06-09 华南师范大学 Based on the investigation big data visual modeling method and system of expanding stereogram
CN107240118A (en) * 2017-05-19 2017-10-10 成都信息工程大学 One kind is based on the histogrammic discriminate tracking of RGB color

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
周芳芳等: "基于维度扩展的Radviz可视化聚类分析方法", 《软件学报》 *

Also Published As

Publication number Publication date
CN109033204B (en) 2021-10-08

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant