CN110175191A - Data filtering rule modeling method in data analysis - Google Patents

Info

Publication number
CN110175191A
Authority
CN
China
Prior art keywords
data
column
analysis
type
cnt
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910401717.XA
Other languages
Chinese (zh)
Other versions
CN110175191B (en)
Inventor
周鹏程
荆一楠
何震瀛
王晓阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN201910401717.XA
Publication of CN110175191A
Application granted
Publication of CN110175191B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of data analysis and specifically relates to a data filtering rule modeling method in data analysis. The method comprises three parts: (1) data column analysis and filtering; (2) data range analysis and filtering; (3) automatic visualization of the result set. By setting appropriate rules, the invention addresses how to build an analysis-and-filtering model from data filtering rules in data analysis, and uses the model to analyze, filter, and intuitively display data. The invention helps users screen data quickly, find data subsets of interest, and analyze and mine the relationships between data items.

Description

Data filtering rule modeling method in data analysis
Technical field
The invention belongs to the technical field of data analysis and specifically relates to a data filtering rule modeling method in data analysis.
Background
In an era in which data is ubiquitous, users' decisions are increasingly driven by data. Differences in the results of data analysis often significantly affect the decision process. Selecting inappropriate data, whether intentionally or not, may lead to wrong, misleading, or "fragile" decisions. For users with no data analysis experience in particular, the results of such poor analysis may cause serious economic losses. Guiding users toward good data selection therefore gives them a higher-quality exploratory data analysis experience.
Users without data analysis experience should be able to avoid, as far as possible, an error-prone data exploration process and the tedious configuration of analysis filter conditions, and obtain good analysis and filtering results directly. This requires a standardized process that determines how the filtering analysis of the data is selected and how data filtering rules are modeled automatically from the features of the data.
Summary of the invention
The purpose of the invention is to provide a data filtering rule modeling method for interactive data exploration scenarios, so that the data in a data set can be analyzed and mined quickly and users can explore and analyze the data conveniently.
For recommendation rule modeling on a data set, the desired characteristics are as follows:
1. Interpretability: how to generate recommendations appropriately within a visualization system;
2. Feasibility: the generated recommendations should have sufficient analytical significance and be able to uncover potential associations between data;
3. Qualitative: given the exploratory nature of user interaction, model construction is efficient and robust.
The data filtering rule modeling method provided by the invention has the following specific steps:
(1) Given a data set $D$ consisting of a large amount of data, use random forest feature selection to calculate the importance of the data columns, depending on whether the user has designated key data. The detailed process is as follows:
(1.1) The variable importance measure is denoted VIM and is expressed through the Gini index GI. Assume there are $m$ data columns $X_1, X_2, X_3, \ldots, X_m$. The Gini index score $VIM_j^{(Gini)}$ of each column $X_j$ is to be calculated, that is, the average change in node-split impurity attributable to the $j$-th column over all decision trees of the random forest (RF). The Gini index is:
$$GI_m = \sum_{k=1}^{K} p_{mk}\, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^2$$
where $K$ is the number of classes in node $m$ across the decision trees of the RF, $p_{mk}$ is the proportion of class $k$ in node $m$, and $p_{mk'} = 1 - p_{mk}$ is its complement. Intuitively, $GI_m$ is the probability that two samples drawn at random from node $m$ have different class labels.
(1.2) The importance of data column $X_j$ at node $m$, that is, the change in the Gini index before and after node $m$ is split, is:
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r$$
where $GI_l$ and $GI_r$ are the Gini indices of the two new nodes created by the split.
(1.3) If the nodes at which data column $X_j$ appears in decision tree $i$ form the set $M$, the importance of $X_j$ in the $i$-th tree is:
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)}$$
(1.4) If the random forest contains $n$ trees, the importance of data column $X_j$ is:
$$VIM_{j}^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)}$$
(1.5) The columns are sorted by the calculated importance, and the two most important data columns are returned to the user as the analysis and filtering result. They are denoted A and B, where A ranks higher than B.
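Steps (1.1) to (1.5) can be illustrated with a small pure-Python sketch. It assumes each tree is already available as a list of splits of the form (column, parent labels, left labels, right labels); that representation, the function names, and the sample data are hypothetical and serve only to make the arithmetic concrete. The split importance follows the unweighted form $GI_m - GI_l - GI_r$ given above, whereas many library implementations weight the child nodes by their share of samples.

```python
from collections import Counter


def gini(labels):
    """Gini index of a node: 1 - sum_k p_k^2 (step 1.1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_importance(parent, left, right):
    """Change in Gini index across one split: GI_m - GI_l - GI_r (step 1.2)."""
    return gini(parent) - gini(left) - gini(right)


def column_importance(forest_splits):
    """Sum split importances per column over all nodes and trees (steps 1.3, 1.4)."""
    totals = {}
    for tree in forest_splits:
        for column, parent, left, right in tree:
            totals[column] = totals.get(column, 0.0) + split_importance(parent, left, right)
    return totals


def top_two(totals):
    """Return the two most important columns, A then B (step 1.5)."""
    return sorted(totals, key=totals.get, reverse=True)[:2]


# Hypothetical forest with two trees; labels are the classes of the key column.
FOREST = [
    [("date", [1, 1, 0, 0], [1, 1], [0, 0]),
     ("price", [1, 1, 1, 0], [1, 1, 1], [0])],
    [("date", [1, 1, 0, 0], [1, 1], [0, 0]),
     ("region", [1, 1, 1, 0], [1, 1, 1], [0]),
     ("price", [1, 0], [1], [0])],
]

totals = column_importance(FOREST)
print(top_two(totals))
```

In this toy forest the `date` column accumulates the largest total, so it plays the role of column A and `price` the role of column B.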
(2) Data range analysis and filtering. The invention illustrates data range analysis and filtering for the case of the two columns A and B. The detailed process is as follows:
(2.1) The data types of columns A and B are first divided into three classes: numeric type N, discrete value type X, and time-series type T. A numeric column N is first discretized: the data are binned, each bin is denoted $n'$, and the count of each bin is denoted $CNT(n')$. For a discrete value column X, the count of each discrete value is denoted $CNT(x)$.
Because time-series data are often seasonal, the invention automatically divides a time-series column T into time bins according to its data range; each resulting time bin is denoted $t'$. For example, if the data of T range from 2017 to 2019, the time bins $t'$ are divided by year; if T contains only data from 2019, the bins are divided by month; similarly, if T contains only data from January 2019, the bins are divided by day.
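The binning in (2.1) might be sketched as follows. The text does not say how numeric data are binned, so equal-width bins are an assumption here, as are the function names; the choice of time unit follows the year, month, and day examples given above.

```python
from collections import Counter
from datetime import date


def equal_width_bins(values, k):
    """Count numeric values per bin, CNT(n'), using k equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # avoid a zero width when all values are equal
    # The maximum value would fall into bin k, so clip it into the last bin.
    return Counter(min(int((v - lo) / width), k - 1) for v in values)


def time_unit(dates):
    """Pick the time-bin unit from the range of the column."""
    lo, hi = min(dates), max(dates)
    if lo.year != hi.year:
        return "year"
    if lo.month != hi.month:
        return "month"
    return "day"


def time_bins(dates):
    """Return the chosen unit and the count CNT(t') of each time bin."""
    unit = time_unit(dates)
    keys = {
        "year": lambda d: (d.year,),
        "month": lambda d: (d.year, d.month),
        "day": lambda d: (d.year, d.month, d.day),
    }
    return unit, Counter(keys[unit](d) for d in dates)


print(equal_width_bins([0, 10, 20, 57, 100], 4))
print(time_bins([date(2019, 1, 5), date(2019, 1, 5), date(2019, 1, 20)]))
```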
(2.2) From the three data types, two analysis-and-filtering combination models are formed and applied to data set $D$ (every "/" below means "or", not division). Specifically:
(2.2.1) A is time-series data and B is of discrete value or numeric type. Based on the unit of the time bins $t'$ obtained in (2.1), a suitable recent period of A is chosen as the first filter condition $t_{recent}$ (for example, the last three years, the last six months, or the last seven days; if the data are insufficient, this filter is not generated). The data set after filtering on column A is denoted $D^*$. Filtering column B then yields either a discrete column $B^*$ with values $x_1^*, x_2^*, \ldots, x_k^*$ or a numeric column $B^*$ that is binned again into $(n_1^*)', (n_2^*)', \ldots, (n_k^*)'$, where $k$ is the number of bins. The three largest counts, $CNT(x^*)_{top3}$ / $CNT((n^*)')_{top3}$, are selected, and the corresponding three discrete values $x_{max}^*$ or the numeric ranges of the bins $(n_{max}^*)'$ form the second filter condition. The intersection of the two filter conditions, $t_{recent} \cap x_{max}^* / (n_{max}^*)'$, is the analysis filter condition of the combination model and is used to filter and analyze data set $D$.
(2.2.2) A is of discrete value or numeric type and B is time-series data. The counts $CNT(x)$ / $CNT(n')$ of each discrete value or bin of A are calculated, and the five values $x_{top5}$ or bins $(n_{top5})'$ with the largest counts are chosen (if there are too few discrete values or bins, this filter is not generated); their numeric ranges form the first filter condition. The data set after filtering on column A is denoted $D^*$. The time range $t_{max}$ of column $B^*$ that corresponds to the value $x_{max}$ or bin $(n_{max})'$ with the largest count in A is chosen as the second filter condition. The intersection of the two filter conditions, $x_{top5} / (n_{top5})' \cap t_{max}$, is the analysis filter condition of the combination model and is used to filter and analyze data set $D$.
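The two combination models can be sketched for a discrete value column as below; a numeric column would first be binned and then treated the same way. The row format (a list of dictionaries), the function names, the `recent_days` parameter, and the choice to measure "recent" from the latest date in the data are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def filter_time_then_value(rows, time_col, value_col, recent_days, top_n=3):
    """Case (2.2.1): keep a recent period of A, then the top_n most frequent values of B."""
    latest = max(r[time_col] for r in rows)
    cutoff = latest - timedelta(days=recent_days)
    recent = [r for r in rows if r[time_col] > cutoff]          # D*
    counts = Counter(r[value_col] for r in recent)              # CNT(x*)
    top = {v for v, _ in counts.most_common(top_n)}
    return [r for r in recent if r[value_col] in top]           # intersection


def filter_value_then_time(rows, value_col, time_col, top_n=5):
    """Case (2.2.2): keep the top_n values of A, then the time range of the top value."""
    counts = Counter(r[value_col] for r in rows)                # CNT(x)
    if len(counts) < top_n:
        return None  # too few distinct values: this filter is not generated
    ranked = [v for v, _ in counts.most_common(top_n)]
    top = set(ranked)
    filtered = [r for r in rows if r[value_col] in top]         # D*
    times = [r[time_col] for r in filtered if r[value_col] == ranked[0]]
    lo, hi = min(times), max(times)                             # t_max
    return [r for r in filtered if lo <= r[time_col] <= hi]     # intersection


# Hypothetical rows with a date column and a category column.
ROWS = [
    {"date": date(2019, 5, 14), "cat": "a"},
    {"date": date(2019, 5, 13), "cat": "a"},
    {"date": date(2019, 5, 12), "cat": "a"},
    {"date": date(2019, 5, 11), "cat": "b"},
    {"date": date(2019, 5, 10), "cat": "b"},
    {"date": date(2019, 5, 9), "cat": "c"},
    {"date": date(2019, 4, 1), "cat": "d"},
]

print(len(filter_time_then_value(ROWS, "date", "cat", recent_days=7, top_n=2)))
print(len(filter_value_then_time(ROWS, "cat", "date", top_n=2)))
```

Returning `None` when there are too few distinct values mirrors the rule that the filter is not generated in that case; the sketch does not implement the corresponding check for an insufficient recent period.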
(3) To present the analyzed and filtered data to the user, the result data set obtained from the analysis and filtering of steps (1) and (2) is visualized automatically. The detailed process is as follows:
(3.1) From the result data set, the following are obtained for each column X: its cardinality $d(X)$; its maximum $max(X)$ and minimum $min(X)$; its number of records $|X|$; its data type $type(X)$; the count $CNT(x')$ of each bin $x'$ of X (each discrete value of a discrete value column X can be regarded as a bin); and the correlation coefficient $correlation(x', CNT(x'))$ between the bins $x'$ and their counts $CNT(x')$.
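A minimal sketch of the per-column statistics in (3.1) follows. It treats every distinct value as its own bin, which the text allows for discrete columns; a numeric column would normally be binned first. The type-inference rule and the dictionary keys are assumptions.

```python
from collections import Counter
from datetime import date


def column_type(values):
    """Infer type(X) as 'time', 'numeric', or 'discrete' (an assumed rule)."""
    if all(isinstance(v, date) for v in values):
        return "time"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "numeric"
    return "discrete"


def profile(values):
    """Collect d(X), max(X), min(X), |X|, type(X), and CNT(x') for one column."""
    counts = Counter(values)
    return {
        "d": len(counts),        # cardinality: number of distinct values
        "max": max(values),
        "min": min(values),
        "n": len(values),        # number of records |X|
        "type": column_type(values),
        "cnt": counts,           # CNT(x') with each distinct value as a bin
    }


print(profile([3, 1, 3, 2, 3]))
```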
(3.2) A set of pruning rules is defined according to the column type $type(X)$ obtained in (3.1): when the data type of column X is time-series, the candidate charts are the histogram and the line chart; when it is discrete value or numeric, the candidate charts are the histogram, the pie chart, and the scatter plot.
(3.3) The invention proposes a data analysis method, relative information entropy, to decide how the result data set obtained from steps (1) and (2) is visualized automatically. The core idea is to compute, for each data column X and each chart type, the ratio of the information entropy of X visualized as that chart to the information entropy of the standardized chart, denoted $C(X)_1, C(X)_2, \ldots, C(X)_k$. The relative information entropies are compared, and the chart type corresponding to the maximum $C(X)_{max}$ is the visualization type of data column X. The details are as follows:
(3.3.1) The histogram (column chart) is one of the charts analysts use most often; the height differences between the columns make differences in the data easier to recognize. The histogram suits many scenarios and shows the details of the data well when there are many elements $x'$ (that is, many bins). The relative information entropy of the histogram is computed from the cardinality $d(X)$ of column X, where $|d(X)|$ denotes the value of $d(X)$.
(3.3.2) The pie chart can show several groups of data and the share of each group in the total. A pie chart needs counts $CNT(x')$ that are sufficiently differentiated to make the share of each slice stand out, so the Shannon entropy is introduced as part of the criterion:
$$H = -\sum_{y} P(y) \log P(y)$$
where $y$ ranges over the values of $CNT(x')$ and $P(y)$ is the share of $y$, that is, the probability of $y$ occurring in $CNT(x')$.
(3.3.3) The strength of the line chart is that it reflects how the same thing develops and changes over time. When $CNT(x')$ and $x'$ follow some distribution (for example a linear, exponential, logarithmic, or low-order power distribution), the expression of the distribution is denoted $distribution(x', CNT(x'))$ and the information entropy $C(X)$ is 1; otherwise $C(X)$ is 0:
$C(X) = distribution(x', CNT(x'))$
(3.3.4) The scatter plot uses coordinate axes to show the relationship between two variables. Its score is computed with the correlation coefficient $correlation(x', CNT(x'))$:
$C(X) = correlation(x', CNT(x'))$
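Two of the quantities used in (3.3) can be computed with the standard library alone, as the sketch below shows. It assumes that $P(y)$ is the share of each count in the total, uses a base-2 logarithm, and takes the correlation coefficient to be Pearson's; none of these choices is stated in the text, and the function names are hypothetical.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy of the slice shares, with P(y) = y / sum(y) (assumed reading)."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def pearson(xs, ys):
    """Pearson correlation coefficient between bin positions xs and counts ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant series has no defined correlation; report none
    return cov / (dev_x * dev_y)


print(shannon_entropy([1, 1, 1, 1]))   # four equal slices
print(pearson([1, 2, 3], [2, 4, 6]))   # counts rising linearly with the bins
```

Four equal slices give the maximum entropy for four bins, so a pie chart of such data would show little contrast between slices; a single dominant slice drives the entropy toward zero.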
(3.4) The relative information entropies of column X under the candidate charts are compared to obtain the maximum $C(X)_{max}$. The result data set obtained from the analysis and filtering of steps (1) and (2) is displayed using the chart type corresponding to $C(X)_{max}$.
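The selection in (3.2) and (3.4) amounts to restricting the candidate charts by column type and taking the chart with the largest score. The sketch below assumes the scores $C(X)$ have already been computed and are passed in as a dictionary; the chart names and the function name are hypothetical.

```python
# Candidate charts per column type, following the pruning rule in (3.2).
CANDIDATES = {
    "time": ["histogram", "line"],
    "discrete": ["histogram", "pie", "scatter"],
    "numeric": ["histogram", "pie", "scatter"],
}


def choose_chart(col_type, scores):
    """Return the allowed chart with the largest score C(X) (step 3.4)."""
    allowed = [chart for chart in CANDIDATES[col_type] if chart in scores]
    if not allowed:
        raise ValueError("no candidate chart has a score for this column type")
    return max(allowed, key=scores.get)


# The pie score is ignored for a time-series column because the rule excludes it.
print(choose_chart("time", {"histogram": 0.4, "line": 1.0, "pie": 2.0}))
print(choose_chart("discrete", {"histogram": 0.4, "pie": 0.9, "scatter": 0.2}))
```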
By setting appropriate rules, the invention addresses how to build an analysis-and-filtering model from data filtering rules in data analysis, and uses the model to analyze, filter, and intuitively display data. The invention helps users screen data quickly, find data subsets of interest, and analyze and mine the relationships between data items.
Brief description of the drawings
Fig. 1 is an example of data column analysis.
Fig. 2 is the flow of data analysis and filtering.
Fig. 3 is an example of data analysis and filtering, in which (a) is a sales-date filtering example and (b) is a price filtering example.
Fig. 4 compares visualizations of the result data set, in which (a) shows the result data set as a histogram and (b) shows it as a line chart.
Fig. 5 is a flow diagram of the method of the invention.
Detailed description of the embodiments
This section introduces the invention through a specific data analysis system.
The data set used contains 33 columns and 344,355 records. Following the process described above, the data columns and data ranges are analyzed, and the resulting data are then visualized and returned to the user. As shown in Fig. 1, the data column analysis method takes the profit column as the key column and analyzes all remaining columns; the result is that the sales date and price columns have the highest importance.
Based on the scheme in step (2), a data filtering rule model is established and the filter conditions on the target columns, sales date and price, are combined. Using this model, the data analysis system obtains the sequence of analysis operations shown in Fig. 2: the sales date is restricted to the most recent month, and the price to its most populated bin, the range 0-57. The resulting filter output is shown in the system example in Fig. 3.
The invention visualizes the result automatically: it analyzes the result data set on its own and displays it with an appropriate chart. As shown in Fig. 4, displaying the data as the histogram on the left is less suitable, whereas the line chart on the right shows the trend better. The invention therefore uses the line chart on the right to display the price column.

Claims (1)

1. A data filtering rule modeling method in data analysis, with the following specific steps:
(1) given a data set $D$ consisting of a large amount of data, using random forest feature selection to calculate the importance of the data columns, depending on whether the user has designated key data; the detailed process is as follows:
(1.1) the importance score is denoted VIM and the Gini index is denoted GI; assuming there are $m$ data columns $X_1, X_2, X_3, \ldots, X_m$, the Gini index score $VIM_j^{(Gini)}$ of each column $X_j$ is calculated, that is, the average change in node-split impurity attributable to the $j$-th column over all decision trees of the random forest (RF); the Gini index is:
$$GI_m = \sum_{k=1}^{K} p_{mk}\, p_{mk'} = 1 - \sum_{k=1}^{K} p_{mk}^2$$
where $K$ is the number of classes in node $m$ across the decision trees of the RF, $p_{mk}$ is the proportion of class $k$ in node $m$, and $p_{mk'} = 1 - p_{mk}$ is its complement;
(1.2) the importance of data column $X_j$ at node $m$, that is, the change in the Gini index before and after node $m$ is split, is:
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r$$
where $GI_l$ and $GI_r$ are the Gini indices of the two new nodes created by the split;
(1.3) if the nodes at which data column $X_j$ appears in decision tree $i$ form the set $M$, the importance of $X_j$ in the $i$-th tree is:
$$VIM_{ij}^{(Gini)} = \sum_{m \in M} VIM_{jm}^{(Gini)}$$
(1.4) if the random forest contains $n$ trees, the importance of data column $X_j$ is:
$$VIM_{j}^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)}$$
(1.5) the columns are sorted by the calculated importance, and the two most important data columns are returned to the user as the analysis and filtering result; they are denoted A and B, where A ranks higher than B;
(2) data range analysis and filtering; the detailed process is as follows:
(2.1) the data types of columns A and B are first divided into three classes: numeric type N, discrete value type X, and time-series type T; a numeric column N is first discretized: the data are binned, each bin is denoted $n'$, and the count of each bin is denoted $CNT(n')$; for a discrete value column X, the count of each discrete value is denoted $CNT(x)$;
a time-series column T is divided into time bins according to its data range, and each time bin obtained by binning T is denoted $t'$;
(2.2) from the three data types, two analysis-and-filtering combination models are formed and applied to data set $D$; specifically:
(2.2.1) A is time-series data and B is of discrete value or numeric type; based on the unit of the time bins $t'$ obtained in (2.1), a suitable recent period of A is chosen as the first filter condition $t_{recent}$; the data set after filtering on column A is denoted $D^*$; filtering column B yields either a discrete column $B^*$ with values $x_1^*, x_2^*, \ldots, x_k^*$ or a numeric column $B^*$ that is binned again into $(n_1^*)', (n_2^*)', \ldots, (n_k^*)'$, where $k$ is the number of bins; the three largest counts, $CNT(x^*)_{top3}$ / $CNT((n^*)')_{top3}$, are selected, and the corresponding three discrete values $x_{max}^*$ or the numeric ranges of the bins $(n_{max}^*)'$ form the second filter condition; the intersection of the two filter conditions, $t_{recent} \cap x_{max}^* / (n_{max}^*)'$, is the analysis filter condition of the combination model and is used to filter and analyze data set $D$;
(2.2.2) A is of discrete value or numeric type and B is time-series data; the counts $CNT(x)$ / $CNT(n')$ of each discrete value or bin of A are calculated, and the numeric ranges of the five values $x_{top5}$ or bins $(n_{top5})'$ with the largest counts form the first filter condition; the data set after filtering on column A is denoted $D^*$; the time range $t_{max}$ of column $B^*$ that corresponds to the value $x_{max}$ or bin $(n_{max})'$ with the largest count in A is chosen as the second filter condition; the intersection of the two filter conditions, $x_{top5} / (n_{top5})' \cap t_{max}$, is the analysis filter condition of the combination model and is used to filter and analyze data set $D$;
(3) to present the analyzed and filtered data to the user, the result data set obtained from the analysis and filtering of steps (1) and (2) is visualized automatically; the detailed process is as follows:
(3.1) from the result data set, the following are obtained for each column X: its cardinality $d(X)$; its maximum $max(X)$ and minimum $min(X)$; its number of records $|X|$; its data type $type(X)$; the count $CNT(x')$ of each bin $x'$ of X; and the correlation coefficient $correlation(x', CNT(x'))$ between the bins $x'$ and their counts $CNT(x')$;
(3.2) a set of pruning rules is defined according to the column type $type(X)$ obtained in (3.1): when the data type of column X is time-series, the candidate charts are the histogram and the line chart; when it is discrete value or numeric, the candidate charts are the histogram, the pie chart, and the scatter plot;
(3.3) a data analysis method, relative information entropy, is used to decide how the result data set obtained from steps (1) and (2) is visualized automatically; the core idea is to compute, for each data column X and each chart type, the ratio of the information entropy of X visualized as that chart to the information entropy of the standardized chart, denoted $C(X)_1, C(X)_2, \ldots, C(X)_k$; the relative information entropies are compared, and the chart type corresponding to the maximum $C(X)_{max}$ is the visualization type of data column X; specifically:
(3.3.1) in the histogram, the height differences between the columns make differences in the data easier to recognize; the relative information entropy of the histogram is computed from the cardinality $d(X)$ of column X, where $|d(X)|$ denotes the value of $d(X)$;
(3.3.2) the pie chart can show several groups of data and the share of each group in the total; a pie chart needs counts $CNT(x')$ that are sufficiently differentiated to make the share of each slice stand out, so the Shannon entropy is introduced as part of the criterion:
$$H = -\sum_{y} P(y) \log P(y)$$
where $y$ ranges over the values of $CNT(x')$ and $P(y)$ is the share of $y$, that is, the probability of $y$ occurring in $CNT(x')$;
(3.3.3) the line chart reflects how the same thing develops and changes over time; when $CNT(x')$ and $x'$ follow some distribution (a linear, exponential, logarithmic, or low-order power distribution), the expression of the distribution is denoted $distribution(x', CNT(x'))$ and the information entropy $C(X)$ is 1; otherwise $C(X)$ is 0:
$C(X) = distribution(x', CNT(x'))$
(3.3.4) the scatter plot uses coordinate axes to show the relationship between two variables; its score is computed with the correlation coefficient $correlation(x', CNT(x'))$:
$C(X) = correlation(x', CNT(x'))$
(3.4) the relative information entropies of column X under the candidate charts are compared to obtain the maximum $C(X)_{max}$; the result data set obtained from the analysis and filtering of steps (1) and (2) is displayed using the chart type corresponding to $C(X)_{max}$.
CN201910401717.XA 2019-05-14 2019-05-14 Modeling method for data filtering rule in data analysis Active CN110175191B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910401717.XA CN110175191B (en) 2019-05-14 2019-05-14 Modeling method for data filtering rule in data analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910401717.XA CN110175191B (en) 2019-05-14 2019-05-14 Modeling method for data filtering rule in data analysis

Publications (2)

Publication Number Publication Date
CN110175191A (en) 2019-08-27
CN110175191B (en) 2023-06-27

Family

ID=67691033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910401717.XA Active CN110175191B (en) 2019-05-14 2019-05-14 Modeling method for data filtering rule in data analysis

Country Status (1)

Country Link
CN (1) CN110175191B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110766167A (en) * 2019-10-29 2020-02-07 深圳前海微众银行股份有限公司 Interactive feature selection method, device and readable storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550374A (en) * 2016-01-29 2016-05-04 湖南大学 Random forest parallelization machine studying method for big data in Spark cloud service environment
CN106295983A (en) * 2016-08-08 2017-01-04 烟台海颐软件股份有限公司 Power marketing data visualization statistical analysis technique and system
CN106599325A (en) * 2017-01-18 2017-04-26 河海大学 Method for constructing data mining visualization platform based on R and HighCharts
CN107103050A (en) * 2017-03-31 2017-08-29 海通安恒(大连)大数据科技有限公司 A kind of big data Modeling Platform and method
CN107193967A (en) * 2017-05-25 2017-09-22 南开大学 A kind of multi-source heterogeneous industry field big data handles full link solution
CN108171617A (en) * 2017-12-08 2018-06-15 全球能源互联网研究院有限公司 A kind of power grid big data analysis method and device
CN109409647A (en) * 2018-09-10 2019-03-01 昆明理工大学 A kind of analysis method of the salary level influence factor based on random forests algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wei Zhengtao (魏正韬): "Research on random forest algorithms based on imbalanced data" (基于非平衡数据的随机森林算法研究), 信息科技, no. 2018 *

Also Published As

Publication number Publication date
CN110175191B (en) 2023-06-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant