CN106991101A - A kind of method and apparatus of spreadsheet analysis processing - Google Patents
- Publication number
- CN106991101A (application number CN201610042109.0A)
- Authority
- CN
- China
- Prior art keywords
- data table
- cost
- conventional data
- parameter
- scanning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/278—Data partitioning, e.g. horizontal or vertical partitioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
Abstract
Embodiments of the present application provide a method and an apparatus for data table analysis and processing. The data tables include conventional data tables of a data common layer and external data tables outside the data common layer. The method includes: calculating processing cost data for the conventional data tables of the data common layer; determining the conventional data tables on which the external data tables outside the data common layer depend; and calculating use cost data of the external data tables according to the processing cost data of the conventional data tables. As a result, when the cost of each conventional data table in the data common layer is evaluated, the storage and computation consumption of the current data table is no longer considered in isolation; its upstream data tables and sibling data tables are also taken into account. The processing cost of a conventional data table can therefore be assessed reasonably and accurately, which reflects the quality of the data model built for the data common layer and provides decision support for optimizing and operating that model.
Description
Technical field
The present application relates to the technical field of big data processing, and in particular to a method for data table analysis and processing and an apparatus for data table analysis and processing.
Background technology
The arrival of the big data era has highlighted the demand for storing, computing, and processing massive data, and the association and servicing of data have become particularly important. Such massive data is generally stored in structured or semi-structured form in cloud computing clusters such as Hadoop and ODPS. The relationships among the data are organized and embodied by the individual data tables stored in the cloud computing cluster, and data is mutually accessed, circulated, and exchanged between different companies and between different business departments of the same company, so that it truly delivers the value expected of data in the big data era.
Among the thousands of data tables in a cloud computing environment, some commonly used, general-purpose data can be processed and consolidated in a unified way to form data tables that are highly general, highly reusable, and highly unified and standardized; these tables make up the data common layer. In general, the data tables of the data common layer hold the data that every business department needs to use frequently.
It is well known that storing, computing, managing, and maintaining data in the big data era consumes considerable hardware, software, and labor costs. How to measure the cost consumed by data processing, and how to assess the cost consumed when data is used, are therefore important and central problems in the mutual access, circulation, and exchange of data.
In the prior art, the processing cost of a data table is measured only by the computing hardware resources (such as CPU consumption and memory consumption) and the storage resources (consumption of storage media) consumed during data processing; that is, only the storage consumption and computation consumption that the current data table generates during processing are analyzed, in isolation. The use cost of a data table is likewise simply the processing cost of the table divided equally among its users, which is clearly neither fair nor reasonable. Consequently, in the prior art neither the measurement of data processing cost nor the measurement of data use cost is sufficiently accurate, which seriously impairs judgments of data effectiveness in a cloud computing environment and leads to excessive data cost and unnecessary resource consumption.
Summary of the invention
In view of the above problems, the embodiments of the present application are proposed to provide a method for data table analysis and processing, and a corresponding apparatus for data table analysis and processing, that overcome the above problems or at least partially solve them.
To solve the above problems, the present application discloses a method for data table analysis and processing. The data tables include conventional data tables of a data common layer and external data tables outside the data common layer, and the method includes:
calculating processing cost data for the conventional data tables of the data common layer;
determining the conventional data tables on which the external data tables outside the data common layer depend; and
calculating use cost data of the external data tables according to the processing cost data of the conventional data tables.
Optionally, the step of calculating processing cost data for the conventional data tables of the data common layer includes:
extracting processing cost characteristic parameters of the conventional data table of the data common layer; and
calculating the processing cost data of the conventional data table using the processing cost characteristic parameters.
Optionally, the processing cost characteristic parameters include a first scan cost parameter, and the sub-step of extracting the processing cost characteristic parameters of the conventional data table of the data common layer further includes:
counting the number of parent tables on which the conventional data table depends;
obtaining the scan amount of the conventional data table on the parent table; and
counting the number of all child tables under the parent table.
The sub-step of calculating the processing cost data of the conventional data table using the processing cost characteristic parameters further includes:
calculating the first scan cost parameter using the number of parent tables on which the conventional data table depends, the scan amount of the conventional data table on the parent table, and the number of all child tables under the parent table.
Optionally, the processing cost characteristic parameters further include a first computation cost parameter and a first storage cost parameter, and the sub-step of extracting the processing cost characteristic parameters of the conventional data table of the data common layer further includes:
extracting the complexity CU of the conventional data table as the first computation cost parameter; and
extracting the storage amount of the conventional data table as the first storage cost parameter.
Optionally, the first scan cost parameter is calculated by the following formula from the number of parent tables on which the conventional data table depends, the scan amount of the conventional data table on the parent table, and the number of all child tables under the parent table:
where Cost(j) is the processing cost data of data table j;
data table j ranges over the m parent tables on which data table i depends, numbered 1…m;
ScanSize(i, j) is the scan amount of conventional data table i on parent table j; and
data table m ranges over all child tables of parent table j, numbered 1…n.
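The formula itself is not reproduced in this text. One reading that is consistent with the symbol definitions is a proportional apportionment: each parent table's processing cost is shared among its child tables according to how much of the parent each child scans. The sketch below illustrates that reading only; the function name, the dictionary layout, and the sample values are hypothetical.

```python
def scan_cost(parent_costs, scan_sizes, table):
    """Scan cost of `table` under the assumed proportional-share reading.

    parent_costs: {parent j: Cost(j)} for the parents that `table` depends on
    scan_sizes:   {parent j: {child table: ScanSize(child, j)}} for all children of j
    """
    total = 0.0
    for parent, cost_j in parent_costs.items():
        by_child = scan_sizes[parent]
        all_scans = sum(by_child.values())  # scans of parent j by every child table
        if all_scans > 0:
            total += cost_j * by_child.get(table, 0.0) / all_scans
    return total


# Hypothetical data: table "t" depends on parents "p1" and "p2".
parent_costs = {"p1": 100.0, "p2": 40.0}
scan_sizes = {
    "p1": {"t": 30.0, "u": 70.0},  # "t" accounts for 30% of the scans of p1
    "p2": {"t": 10.0, "v": 30.0},  # "t" accounts for 25% of the scans of p2
}
result = scan_cost(parent_costs, scan_sizes, "t")
```

Under this reading a table that scans a larger share of a parent table bears a larger share of that parent's cost, which matches the stated goal of not dividing cost equally among users.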
Optionally, the processing cost data of the conventional data table is calculated from the processing cost characteristic parameters by the following formula:
where ComputeCost(i) is the first computation cost parameter of conventional data table i;
StorageCost(i) is the first storage cost parameter of conventional data table i; and
ScanCost(i, j) is the first scan cost parameter of conventional data table i on parent table j.
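Because the scan cost of a table refers to the processing cost of its parent tables, the processing cost can be evaluated recursively over the dependency graph. The sketch below assumes, since the formula is not reproduced here, that the three parameters are simply summed and that each parent's cost is shared in proportion to scan amount. The table names, the values, and the requirement that dependencies be acyclic are all assumptions made for illustration.

```python
from functools import lru_cache

# Hypothetical inputs: "src" feeds the conventional data tables "a" and "b".
COMPUTE = {"src": 10.0, "a": 4.0, "b": 6.0}   # ComputeCost(i)
STORAGE = {"src": 10.0, "a": 1.0, "b": 2.0}   # StorageCost(i)
# SCAN[child][parent] = ScanSize(child, parent)
SCAN = {"src": {}, "a": {"src": 25.0}, "b": {"src": 75.0}}


def total_scans_of(parent):
    """Sum of the scan amounts of all child tables on `parent`."""
    return sum(parents.get(parent, 0.0) for parents in SCAN.values())


@lru_cache(maxsize=None)
def cost(table):
    """Assumed Cost(i) = ComputeCost(i) + StorageCost(i) + sum of ScanCost(i, j)."""
    total = COMPUTE[table] + STORAGE[table]
    for parent, size in SCAN[table].items():
        # Assumed ScanCost(i, j): the parent's cost shared by scan proportion.
        total += cost(parent) * size / total_scans_of(parent)
    return total


cost_a = cost("a")
cost_b = cost("b")
```

Memoizing `cost` keeps each table's cost from being recomputed when several child tables share the same parent.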
Optionally, the step of calculating the use cost data of the external data table according to the processing cost data of the conventional data table is:
calculating the use cost data of the external data table according to the processing cost characteristic parameters of the conventional data table.
Optionally, the step of calculating the use cost data of the external data table according to the processing cost characteristic parameters of the conventional data table includes:
extracting the processing cost characteristic parameters of the conventional data table on which the external data table outside the data common layer depends;
calculating use cost characteristic parameters of the external data table using the processing cost characteristic parameters; and
calculating the use cost data of the external data table using the use cost characteristic parameters.
Optionally, the use cost characteristic parameters include a second computation cost parameter.
The sub-step of extracting the processing cost characteristic parameters of the conventional data table on which the external data table outside the data common layer depends is:
extracting the first computation cost parameter of the conventional data table on which the external data table depends.
The step of calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters includes:
obtaining a computation cost calculation factor between the external data table and the conventional data table on which it depends; and
correcting the first computation cost parameter using the computation cost calculation factor to obtain the second computation cost parameter.
Optionally, the use cost characteristic parameters include a second storage cost parameter.
The sub-step of extracting the processing cost characteristic parameters of the conventional data table on which the external data table outside the data common layer depends is:
extracting the first storage cost parameter of the conventional data table on which the external data table depends.
The step of calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters further includes:
obtaining a storage cost calculation factor between the external data table and the conventional data table on which it depends; and
correcting the first storage cost parameter using the storage cost calculation factor to obtain the second storage cost parameter.
Optionally, the use cost characteristic parameters include a second scan cost parameter.
The sub-step of extracting the processing cost characteristic parameters of the conventional data table on which the external data table outside the data common layer depends is:
extracting the first scan cost parameter of the conventional data table on which the external data table depends.
The step of calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters further includes:
obtaining a scan cost calculation factor between the external data table and the conventional data table on which it depends; and
correcting the first scan cost parameter using the scan cost calculation factor to obtain the second scan cost parameter.
Optionally, the sub-step of obtaining the computation cost calculation factor between the external data table and the conventional data table on which it depends further includes:
obtaining, for each day of the most recent m days, the number of data tables that scanned the conventional data table, and the average number of child tables of the conventional data table over the most recent m days; and
calculating the computation cost calculation factor by the following formula from the number of data tables that scanned the conventional data table on each day of the most recent m days and the average number of child tables of the conventional data table over the most recent m days:
where m indexes each day of the most recent m days;
Scan_m(j) is the number of data tables that scanned conventional data table j on day m; and
the denominator is, by way of example, the average number of child tables of conventional data table j over the most recent 90 days.
Optionally, the sub-step of obtaining the storage cost calculation factor between the external data table and the conventional data table on which it depends further includes:
obtaining the scan amount of the external data table on the conventional data table on which it depends, and the k tables that have a dependency relationship with the conventional data table; and
calculating the storage cost calculation factor by the following formula from the scan amount of the external data table on the conventional data table on which it depends and the k tables that have a dependency relationship with the conventional data table:
where scansize(i, j) is the scan amount of external data table i on conventional data table j; and
m ranges over the k tables that have a dependency relationship with conventional data table j, numbered 1…k.
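The formula is not reproduced in this text. A reading consistent with the definitions of scansize(i, j) and the k dependent tables is that the storage cost calculation factor is the external table's share of all scanning performed on conventional data table j. The sketch below shows that assumed share; the function name and the sample tables are hypothetical.

```python
def storage_factor(scansize_on_j, external_table):
    """Assumed storfac(i, j): share of table i in all scans of conventional table j.

    scansize_on_j: {dependent table m: scansize(m, j)} for the k dependent tables
    """
    total = sum(scansize_on_j.values())
    if total == 0:
        return 0.0  # no table scanned j, so nothing is apportioned
    return scansize_on_j[external_table] / total


# Hypothetical scan amounts of three external tables on one conventional table.
scans = {"ext_a": 20.0, "ext_b": 60.0, "ext_c": 20.0}
factor = storage_factor(scans, "ext_b")
```

With this reading the factors of all dependent tables sum to 1, so the storage cost of the conventional data table is fully distributed among the tables that use it.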
Optionally, the sub-step of obtaining the scan cost calculation factor between the external data table and the conventional data table on which it depends further includes:
obtaining the proportion of hot fields in the conventional data table and the dependency level of the conventional data table in the current data common layer, where a hot field is a field whose number of uses within a certain time period is greater than the number of direct downstream data tables of the conventional data table; and
calculating the scan cost calculation factor by the following formula from the proportion of hot fields in the conventional data table and the level of the conventional data table in the current data common layer:
where hot_ratio(j) is the ratio of the number of hot fields of conventional data table j to the total number of fields in the table; and
level(j) is the dependency level of conventional data table j in the data common layer.
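The hot-field definition above is concrete enough to sketch directly: a field is hot when its use count in the chosen period exceeds the number of direct downstream tables, and hot_ratio(j) is the hot-field count divided by the total field count. The sketch covers only hot_ratio; how it combines with level(j) is not reproduced in this text and is not guessed here. The field names and counts are hypothetical.

```python
def hot_ratio(field_use_counts, direct_downstream_tables):
    """Ratio of hot fields to all fields of one conventional data table.

    field_use_counts: {field name: number of uses within the chosen period}
    direct_downstream_tables: number of tables that directly depend on the table
    """
    if not field_use_counts:
        return 0.0
    hot = [name for name, uses in field_use_counts.items()
           if uses > direct_downstream_tables]
    return len(hot) / len(field_use_counts)


# Hypothetical table with four fields and three direct downstream tables.
ratio = hot_ratio({"user_id": 12, "amount": 7, "memo": 1, "region": 3}, 3)
```

Note that the comparison is strict, so a field used exactly as many times as there are downstream tables ("region" here) is not counted as hot.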
Optionally, the use cost data of the external data table is calculated from the use cost characteristic parameters by the following formula:
Cost(i, j) = compcost(j) * compfac(i, j) + storcost(j) * storfac(i, j) + scancost(j) * scanfac(i, j)
where i is an external data table, j is a conventional data table, and a dependency relationship exists between data table i and data table j;
Cost(i, j) is the use cost data of external data table i using conventional data table j;
compcost(j) is the first computation cost parameter in the processing cost data of conventional data table j;
compfac(i, j) is the computation cost calculation factor between external data table i and conventional data table j;
storcost(j) is the first storage cost parameter in the processing cost data of conventional data table j;
storfac(i, j) is the storage cost calculation factor between external data table i and conventional data table j;
scancost(j) is the first scan cost parameter in the processing cost data of conventional data table j; and
scanfac(i, j) is the scan cost calculation factor between external data table i and conventional data table j.
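The use cost formula is a weighted sum of the three first cost parameters, each corrected by its own factor. A minimal sketch follows; the class names and the numeric values are hypothetical, and the three factors are taken as given inputs rather than derived.

```python
from dataclasses import dataclass


@dataclass
class ProcessingCost:
    """First cost parameters of conventional data table j."""
    compcost: float
    storcost: float
    scancost: float


@dataclass
class Factors:
    """Calculation factors between external table i and conventional table j."""
    compfac: float
    storfac: float
    scanfac: float


def use_cost(p: ProcessingCost, f: Factors) -> float:
    """Cost(i, j): each first parameter corrected by its factor, then summed."""
    return (p.compcost * f.compfac
            + p.storcost * f.storfac
            + p.scancost * f.scanfac)


# Hypothetical values for one (external table, conventional table) pair.
cost_ij = use_cost(ProcessingCost(100.0, 50.0, 20.0), Factors(0.5, 0.2, 0.25))
```

Each product in the sum corresponds to one of the second cost parameters (computation, storage, scan), so the same intermediate values can be reused for the per-parameter comparisons described later.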
Optionally, the method further includes:
extracting the corresponding conventional data table when the processing cost data meets a first preset condition.
Optionally, the step of extracting the corresponding conventional data table when the processing cost data meets the first preset condition includes:
if the ratio of the first storage cost parameter to the first computation cost parameter of a given conventional data table is higher than a first preset threshold, extracting the conventional data table;
and/or,
if the first computation cost parameter of a given conventional data table is higher than a second preset threshold, extracting the conventional data table;
and/or,
if the ratio of the first scan cost parameter to the first computation cost parameter of a given conventional data table is higher than a third preset threshold, extracting the conventional data table;
and/or,
computing the sum of the second computation cost parameters of the external data tables that directly depend on a given conventional data table; and
if the first computation cost parameter of the conventional data table is greater than the sum of the second computation cost parameters, extracting the conventional data table;
and/or,
computing the sum of the second storage cost parameters of the external data tables that directly depend on a given conventional data table; and
if the first storage cost parameter of the conventional data table is greater than the sum of the second storage cost parameters, extracting the conventional data table;
and/or,
computing the sum of the second scan cost parameters of the external data tables that directly depend on a given conventional data table; and
if the first scan cost parameter of the conventional data table is greater than the sum of the second scan cost parameters, extracting the conventional data table.
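The conditions above are independent checks joined by and/or, so an implementation may apply any subset of them. The sketch below applies all six and records which ones fire; the dictionary keys, threshold values, and reason strings are hypothetical choices for illustration.

```python
def flag_conventional_table(first, second_sums, thresholds):
    """Return the reasons (possibly none) for extracting a conventional data table.

    first:       first computation/storage/scan cost parameters of the table
    second_sums: sums of the second parameters over directly dependent external tables
    thresholds:  hypothetical first, second, and third preset thresholds
    """
    reasons = []
    comp, stor, scan = first["comp"], first["stor"], first["scan"]
    if comp > 0 and stor / comp > thresholds["stor_to_comp"]:
        reasons.append("storage/computation ratio above first threshold")
    if comp > thresholds["comp"]:
        reasons.append("computation cost above second threshold")
    if comp > 0 and scan / comp > thresholds["scan_to_comp"]:
        reasons.append("scan/computation ratio above third threshold")
    for key in ("comp", "stor", "scan"):
        # The table costs more to produce than its users are charged for it.
        if first[key] > second_sums[key]:
            reasons.append(f"first {key} cost exceeds summed second {key} cost")
    return reasons


reasons = flag_conventional_table(
    first={"comp": 10.0, "stor": 50.0, "scan": 5.0},
    second_sums={"comp": 12.0, "stor": 20.0, "scan": 5.0},
    thresholds={"stor_to_comp": 3.0, "comp": 100.0, "scan_to_comp": 1.0},
)
```

Returning the list of triggered conditions, rather than a single boolean, keeps the result usable as input to a review of the data model.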
Optionally, the method further includes:
extracting the corresponding external data table when the use cost data meets a second preset condition.
Optionally, the step of extracting the corresponding external data table when the use cost data meets the second preset condition includes:
if the ratio of the second storage cost parameter to the second computation cost parameter of a given external data table is higher than a fourth preset threshold, extracting the external data table;
and/or,
if a given external data table can obtain from another conventional data table the same data as from the current conventional data table, and the second scan cost parameter when obtaining the data from the other conventional data table is smaller than the second scan cost parameter when obtaining the data from the current conventional data table, extracting the external data table.
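The two conditions for external data tables can be sketched the same way. The function below is an illustration only: how "the same data" is determined is outside the text, so the candidate alternative tables and their second scan costs are passed in as a hypothetical precomputed mapping.

```python
def flag_external_table(second_stor, second_comp, fourth_threshold,
                        current_scan_cost, alternative_scan_costs):
    """Return the reasons (possibly none) for extracting an external data table.

    alternative_scan_costs: {other conventional table: second scan cost parameter}
    for tables assumed to offer the same data as the current conventional table.
    """
    reasons = []
    if second_comp > 0 and second_stor / second_comp > fourth_threshold:
        reasons.append("storage/computation ratio above fourth threshold")
    cheaper = [name for name, scan_cost in alternative_scan_costs.items()
               if scan_cost < current_scan_cost]
    if cheaper:
        reasons.append("cheaper source available: " + ", ".join(sorted(cheaper)))
    return reasons


# Hypothetical external table with one cheaper and one more expensive alternative.
reasons = flag_external_table(
    second_stor=2.0, second_comp=4.0, fourth_threshold=1.0,
    current_scan_cost=8.0,
    alternative_scan_costs={"dim_user_v2": 5.0, "dim_user_full": 9.0},
)
```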
To solve the above problems, the present application further discloses an apparatus for data table analysis and processing, wherein the data tables include conventional data tables of a data common layer and external data tables outside the data common layer, and the apparatus includes:
a processing cost calculation module, configured to calculate processing cost data for the conventional data tables of the data common layer;
a determining module, configured to determine the conventional data tables on which the external data tables outside the data common layer depend; and
a use cost calculation module, configured to calculate use cost data of the external data tables according to the processing cost data of the conventional data tables.
Optionally, the processing cost calculation module includes:
a processing cost characteristic parameter extraction submodule, configured to extract processing cost characteristic parameters of the conventional data table of the data common layer; and
a processing cost calculation submodule, configured to calculate the processing cost data of the conventional data table using the processing cost characteristic parameters.
Optionally, the processing cost characteristic parameters include a first scan cost parameter, and the processing cost characteristic parameter extraction submodule further includes:
a parent table counting unit, configured to count the number of parent tables on which the conventional data table depends;
a scan amount obtaining unit, configured to obtain the scan amount of the conventional data table on the parent table; and
a child table counting unit, configured to count the number of all child tables under the parent table.
The processing cost calculation submodule further includes:
a first scan cost calculation unit, configured to calculate the first scan cost parameter using the number of parent tables on which the conventional data table depends, the scan amount of the conventional data table on the parent table, and the number of all child tables under the parent table.
Optionally, the processing cost characteristic parameters further include a first computation cost parameter and a first storage cost parameter, and the processing cost characteristic parameter extraction submodule further includes:
a first computation cost parameter extraction unit, configured to extract the complexity CU of the conventional data table as the first computation cost parameter; and
a first storage cost parameter extraction unit, configured to extract the storage amount of the conventional data table as the first storage cost parameter.
Optionally, the first scan cost parameter is calculated by the following formula from the number of parent tables on which the conventional data table depends, the scan amount of the conventional data table on the parent table, and the number of all child tables under the parent table:
where Cost(j) is the processing cost data of data table j;
data table j ranges over the m parent tables on which data table i depends, numbered 1…m;
ScanSize(i, j) is the scan amount of conventional data table i on parent table j; and
data table m ranges over all child tables of parent table j, numbered 1…n.
Optionally, the processing cost data of the conventional data table is calculated from the processing cost characteristic parameters by the following formula:
where ComputeCost(i) is the first computation cost parameter of conventional data table i;
StorageCost(i) is the first storage cost parameter of conventional data table i; and
ScanCost(i, j) is the first scan cost parameter of conventional data table i on parent table j.
Optionally, the use cost calculation module includes:
a use cost calculation submodule, configured to calculate the use cost data of the external data table according to the processing cost characteristic parameters of the conventional data table.
Optionally, the use cost calculation submodule includes:
a processing cost characteristic parameter extraction unit, configured to extract the processing cost characteristic parameters of the conventional data table on which the external data table outside the data common layer depends;
a use cost characteristic parameter calculation unit, configured to calculate use cost characteristic parameters of the external data table using the processing cost characteristic parameters; and
a use cost data calculation unit, configured to calculate the use cost data of the external data table using the use cost characteristic parameters.
Optionally, the use cost characteristic parameters include a second computation cost parameter.
The processing cost characteristic parameter extraction unit includes:
a first computation cost parameter extraction subunit, configured to extract the first computation cost parameter of the conventional data table on which the external data table depends.
The use cost characteristic parameter calculation unit includes:
a computation cost calculation factor obtaining subunit, configured to obtain a computation cost calculation factor between the external data table and the conventional data table on which it depends; and
a second computation cost parameter calculation subunit, configured to correct the first computation cost parameter using the computation cost calculation factor to obtain the second computation cost parameter.
Optionally, the use cost characteristic parameters include a second storage cost parameter.
The processing cost characteristic parameter extraction unit includes:
a first storage cost parameter extraction subunit, configured to extract the first storage cost parameter of the conventional data table on which the external data table depends.
The use cost characteristic parameter calculation unit further includes:
a storage cost calculation factor obtaining subunit, configured to obtain a storage cost calculation factor between the external data table and the conventional data table on which it depends; and
a second storage cost parameter calculation subunit, configured to correct the first storage cost parameter using the storage cost calculation factor to obtain the second storage cost parameter.
Optionally, the use cost characteristic parameters include a second scan cost parameter.
The processing cost characteristic parameter extraction unit includes:
a first scan cost parameter extraction subunit, configured to extract the first scan cost parameter of the conventional data table on which the external data table depends.
The use cost characteristic parameter calculation unit further includes:
a scan cost calculation factor obtaining subunit, configured to obtain a scan cost calculation factor between the external data table and the conventional data table on which it depends; and
a second scan cost parameter calculation subunit, configured to correct the first scan cost parameter using the scan cost calculation factor to obtain the second scan cost parameter.
Optionally, the computation cost calculation factor obtaining subunit is further configured to:
obtain, for each day of the most recent m days, the number of data tables that scanned the conventional data table, and the average number of child tables of the conventional data table over the most recent m days; and
calculate the computation cost calculation factor by the following formula from the number of data tables that scanned the conventional data table on each day of the most recent m days and the average number of child tables of the conventional data table over the most recent m days:
where m indexes each day of the most recent m days;
Scan_m(j) is the number of data tables that scanned conventional data table j on day m; and
the denominator is, by way of example, the average number of child tables of conventional data table j over the most recent 90 days.
Optionally, the storage cost calculation factor obtaining subunit is further configured to:
obtain the scan amount of the external data table on the conventional data table on which it depends, and the k tables that have a dependency relationship with the conventional data table; and
calculate the storage cost calculation factor by the following formula from the scan amount of the external data table on the conventional data table on which it depends and the k tables that have a dependency relationship with the conventional data table:
where scansize(i, j) is the scan amount of external data table i on conventional data table j; and
m ranges over the k tables that have a dependency relationship with conventional data table j, numbered 1…k.
Optionally, the scan cost calculation factor obtaining subunit is further configured to:
obtain the proportion of hot fields in the conventional data table and the dependency level of the conventional data table in the current data common layer, where a hot field is a field whose number of uses within a certain time period is greater than the number of direct downstream data tables of the conventional data table; and
calculate the scan cost calculation factor by the following formula from the proportion of hot fields in the conventional data table and the level of the conventional data table in the current data common layer:
where hot_ratio(j) is the ratio of the number of hot fields of conventional data table j to the total number of fields in the table; and
level(j) is the dependency level of conventional data table j in the data common layer.
Optionally, the use cost data of the external data table is calculated from the use cost characteristic parameters by the following formula:
cost(i, j) = compcost(j) * compfac(i, j) + storcost(j) * storfac(i, j) + scancost(j) * scanfac(i, j)
where i is an external data table, j is a conventional data table, and a dependence relationship exists between data table i and data table j;
cost(i, j) is the use cost data of external data table i using conventional data table j;
compcost(j) is the first calculation cost parameter in the processing cost data of conventional data table j;
compfac(i, j) is the calculation cost calculation factor between external data table i and conventional data table j;
storcost(j) is the first carrying cost parameter in the processing cost data of conventional data table j;
storfac(i, j) is the carrying cost calculation factor between external data table i and conventional data table j;
scancost(j) is the first scanning cost parameter in the processing cost data of conventional data table j;
scanfac(i, j) is the scanning cost calculation factor between external data table i and conventional data table j.
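A minimal Python sketch of this weighted sum may help. The function name, argument names, and the numeric values in the usage lines are illustrative assumptions and are not taken from the application.

```python
def use_cost(compcost, compfac, storcost, storfac, scancost, scanfac):
    """Use cost of external table i on conventional table j.

    compcost, storcost, scancost: first cost parameters of table j (in CT).
    compfac, storfac, scanfac: calculation factors between tables i and j.
    """
    return compcost * compfac + storcost * storfac + scancost * scanfac


# Illustrative values only (assumed, not from the application).
example = use_cost(compcost=1.0, compfac=0.5,
                   storcost=2.0, storfac=0.25,
                   scancost=3.0, scanfac=0.1)
```

Keeping the three factors as separate arguments mirrors the statement that each of the three cost parts may be apportioned by a different algorithm.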
Optionally, the apparatus further includes:
a first extraction module, configured to extract the corresponding conventional data table when the processing cost data meets a first preset condition.
Optionally, the first extraction module includes:
a first extraction submodule, configured to extract a conventional data table when the ratio of its first carrying cost parameter to its first calculation cost parameter is higher than a first preset threshold;
and/or,
a second extraction submodule, configured to extract a conventional data table when its first calculation cost parameter is higher than a second preset threshold;
and/or,
a third extraction submodule, configured to extract a conventional data table when the ratio of its first scanning cost parameter to its first calculation cost parameter is higher than a third preset threshold;
and/or,
a fourth statistics submodule, configured to sum the second calculation cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
a fourth extraction submodule, configured to extract the conventional data table when its first calculation cost parameter is greater than that sum of second calculation cost parameters;
and/or,
a fifth statistics submodule, configured to sum the second carrying cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
a fifth extraction submodule, configured to extract the conventional data table when its first carrying cost parameter is greater than that sum of second carrying cost parameters;
and/or,
a sixth statistics submodule, configured to sum the second scanning cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
a sixth extraction submodule, configured to extract the conventional data table when its first scanning cost parameter is greater than that sum of second scanning cost parameters.
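The six extraction conditions listed above could be sketched in Python roughly as follows. The class name, function name, reason labels, and the way thresholds are passed in are all assumptions made for illustration; the application does not prescribe threshold values or a data layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cost:
    compute: float   # calculation cost parameter, in CT
    storage: float   # carrying (storage) cost parameter, in CT
    scan: float      # scanning cost parameter, in CT


def flag_conventional_table(first: Cost, downstream: List[Cost],
                            t1: float, t2: float, t3: float) -> List[str]:
    """Return labels of the extraction conditions that the table meets.

    first: first cost parameters of the conventional data table.
    downstream: second cost parameters of directly dependent external tables.
    t1, t2, t3: first, second and third preset thresholds (assumed inputs).
    """
    reasons = []
    if first.compute > 0 and first.storage / first.compute > t1:
        reasons.append("storage_to_compute_ratio")
    if first.compute > t2:
        reasons.append("compute_above_threshold")
    if first.compute > 0 and first.scan / first.compute > t3:
        reasons.append("scan_to_compute_ratio")
    if first.compute > sum(d.compute for d in downstream):
        reasons.append("compute_exceeds_downstream")
    if first.storage > sum(d.storage for d in downstream):
        reasons.append("storage_exceeds_downstream")
    if first.scan > sum(d.scan for d in downstream):
        reasons.append("scan_exceeds_downstream")
    return reasons
```

Returning every matching label, rather than stopping at the first, reflects the "and/or" wording: any combination of the conditions may apply to one table.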
Optionally, the apparatus further includes:
a second extraction module, configured to extract the corresponding external data table when the use cost data meets a second preset condition.
Optionally, the second extraction module includes:
a seventh extraction submodule, configured to extract an external data table when the ratio of its second carrying cost parameter to its second calculation cost parameter is higher than a fourth preset threshold;
and/or,
an eighth extraction submodule, configured to extract an external data table when it can obtain the same data as that of the current conventional data table from another conventional data table, and the second scanning cost parameter for obtaining the data from the other conventional data table is less than the second scanning cost parameter for obtaining the data from the current conventional data table.
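As a rough Python sketch of the two conditions for external data tables: the function name, argument names, and reason labels below are hypothetical, and how "the same data" is detected in another conventional data table is left outside the sketch.

```python
from typing import List


def flag_external_table(storage2: float, compute2: float,
                        current_scan2: float,
                        alternative_scan2: List[float],
                        t4: float) -> List[str]:
    """Return labels of the extraction conditions that the external table meets.

    storage2, compute2: second carrying and calculation cost parameters.
    current_scan2: second scanning cost when reading the current table.
    alternative_scan2: second scanning costs when reading the same data
        from other conventional data tables (assumed to be known).
    t4: fourth preset threshold (assumed input).
    """
    reasons = []
    if compute2 > 0 and storage2 / compute2 > t4:
        reasons.append("storage_to_compute_ratio")
    if any(s < current_scan2 for s in alternative_scan2):
        reasons.append("cheaper_source_available")
    return reasons
```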
Compared with the background art, the embodiments of the present application have the following advantages:
First, by taking the dependence relationships between data tables into account and introducing a scanning cost parameter, the embodiments of the present application improve the way data table costs are assessed. When the cost of each conventional data table of the data common layer is estimated, the storage and calculation consumption of the current data table is no longer considered in isolation; its upstream data tables and sibling data tables are considered as well. The processing cost of a conventional data table can therefore be assessed reasonably and accurately, which reflects the quality of the data model of the data common layer and provides decision support for optimizing and operating that model.
Second, in the embodiments of the present application, measuring the use cost of external data tables makes clear the storage, calculation, and scanning consumption incurred when external data tables access conventional data tables of the data common layer. This makes it easier to assess whether such access is reasonable and necessary, helps business departments optimize how they build their data tables, avoids the resource waste caused by duplicated data construction, improves data resource utilization, and reduces data cost, thereby saving cost overall.
Third, in the embodiments of the present application, the introduction of calculation factors allows the cost consumption of an upstream data table to be inherited by downstream data tables in a reasonable proportion. At the same time, factors such as the storage amount, scanning amount, and degree of reuse of a data table, its processing level, and its proportion of temperature fields are considered together, so that the use cost of an external data table is calculated more reasonably and accurately.
Fourth, the embodiments of the present application analyze the processing cost data of conventional data tables and the use cost data of external data tables and compare them with preset thresholds, so that data tables with excessive cost consumption can be identified specifically and then optimized, further saving cost.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of method embodiment one of spreadsheet analysis processing of the present application;
Fig. 2 is a schematic diagram of a conventional data table model of a data common layer of the present application;
Fig. 3 is a schematic diagram of the relationship between conventional data tables and an external data table of the present application;
Fig. 4 is a flow chart of the steps of method embodiment two of spreadsheet analysis processing of the present application;
Fig. 5 is a schematic diagram of another relationship between conventional data tables and external data tables of the present application;
Fig. 6 is a structural block diagram of an apparatus embodiment of spreadsheet analysis processing of the present application.
Detailed description of the embodiments
To make the above objects, features, and advantages of the present application clearer and easier to understand, the application is described in further detail below with reference to the accompanying drawings and specific embodiments.
In the prior art, the processing cost of a data table is measured only by the computational hardware resources (such as CPU and memory consumption) and the storage resources (consumption of storage media) consumed during data processing. However, the data in a data table may come from N upstream data tables; that is, generating one data table may depend on N upstream data tables. Existing cost measurement models analyze in isolation the storage consumption and calculation consumption produced while the current data table is processed. They do not take the dependence relationships between data tables into account and therefore ignore the scanning consumption between data tables.
For the use cost of a data table, the prior art simply divides the data processing cost of the table equally among all of its users, instead of allocating it differently according to how each user actually accesses the table. Different users use the same data table differently: some access a large volume of data and perform complex calculations, while others read only a small amount of data and perform very simple calculations. Under equal sharing, two such users bear the same scanning cost, which is clearly unfair and unreasonable.
In view of the above problems, the present application proposes two metering models for spreadsheet analysis processing: a metering model for the data processing cost of the data common layer, and a metering model for the data use cost incurred when an external data object BU accesses data of the data common layer.
To help those skilled in the art better understand the present application, the core ideas of the two metering models involved in the embodiments are briefly described below:
First, the metering model for the data processing cost of the data common layer: it includes three parts, namely calculation cost assessment, carrying cost assessment, and scanning cost assessment. The calculation cost and carrying cost assessments take the perspective of the conventional data table itself and reflect the actual software and hardware consumption of the table during data processing. The scanning cost accounts for the dependence relationships between data tables during data processing: the parent table's cost is apportioned according to the ratio of the child table's scanning amount on the parent table to the parent table's total scanned amount, and the apportioned share is the child table's scanning cost on that parent table.
Second, the metering model for the data use cost incurred when an external data object BU accesses data of the data common layer: following the metering method for data processing cost consumption, the cost of the data table being used is obtained in three parts, namely calculation cost, carrying cost, and scanning cost. The use cost of the data table can then be calculated by apportioning each of the three parts in its corresponding proportion and summing the weighted results. The apportionment algorithm may differ for each of the three parts.
Applying the two metering models above in actual data analysis processing can at least address the following technical problems:
1) obtaining the proportions of the carrying cost, calculation cost, and scanning cost of a data table of the data common layer;
2) when the carrying cost is higher than a certain threshold, the storage amount can be reduced;
3) when the calculation cost is higher than a certain threshold, the calculation logic of the data table can be optimized to reduce the amount of calculation;
4) when the scanning cost is higher than a certain threshold, the processing links of the data table can be optimized to reduce useless scanning of parent-table data;
5) data users can be guided to read only the necessary data volume from the common layer, reducing scanning of useless data;
6) data users can be guided to use deeper-level tables where possible (deeper-level tables have all undergone deep processing in the common layer and are high-quality tables).
In general, a data table of the data common layer meets the requirements of the data common layer, and has value in being kept there, only if its data processing cost is less than the sum of the data use costs of its direct downstream tables.
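This retention criterion reduces to a single comparison. A minimal Python sketch follows; the function name and the example figures are assumptions for illustration only.

```python
def belongs_in_common_layer(processing_cost, downstream_use_costs):
    """True if the table's processing cost (in CT) is less than the sum of
    the use costs (in CT) of its direct downstream tables."""
    return processing_cost < sum(downstream_use_costs)


# Illustrative values only (assumed, not from the application).
keep = belongs_in_common_layer(0.8, [0.5, 0.6])   # 0.8 < 1.1
drop = belongs_in_common_layer(1.5, [0.5, 0.6])   # 1.5 >= 1.1
```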
Referring to Fig. 1, a flow chart of the steps of method embodiment one of spreadsheet analysis processing of the present application is shown. The data tables may include conventional data tables of the data common layer and external data tables of the non-data common layer. The method may specifically include the following steps:
Step 101: calculate processing cost data for the conventional data tables of the data common layer;
In the embodiments of the present application, the processing cost data of a conventional data table may include not only the computational hardware resources (such as CPU and memory consumption) and storage resources (consumption of storage media) consumed while the data table is processed, but also the dependence relationships between data tables, that is, the scanning consumption between data tables.
The data in a data table may come from N upstream data tables. The scanning consumption between data tables therefore reflects the scanning amount on the tables that a data table depends on, which may be incurred while the data table is processed. Referring to Fig. 2, a schematic diagram of a conventional data table model of a data common layer is shown. The circles A, B, C, D, E, and F represent six conventional data tables of the data common layer, and an arrow between two circles represents a data access relationship, that is, a scanning relationship, between the two conventional data tables. For example, the arrow between conventional data table B and conventional data table A indicates that conventional data table B needs to scan conventional data table A. The number on an arrow is the scanning amount in TB, so in Fig. 2 conventional data table B needs to scan 2 TB of data from conventional data table A.
In a preferred embodiment of the present application, the step of calculating processing cost data for the conventional data tables of the data common layer may specifically include the following sub-steps:
Sub-step 1011: extract the processing cost characteristic parameters of the conventional data table of the data common layer;
Sub-step 1012: calculate the processing cost data of the conventional data table using the processing cost characteristic parameters.
In one embodiment of the present application, the processing cost characteristic parameters may include a first calculation cost parameter and a first carrying cost parameter, and the sub-step of extracting the processing cost characteristic parameters of the conventional data table of the data common layer may further include:
extracting the complexity (CU) of the conventional data table as the first calculation cost parameter;
extracting the storage amount of the conventional data table as the first carrying cost parameter.
In the embodiments of the present application, the first calculation cost parameter may be the CPU resources that the conventional data table consumes during data processing, and can be measured in complexity CU, where 1 CU represents the cost of running one CPU (core) for one day. The complexity CU can be obtained from the metadata of an Open Data Processing Service (ODPS) cluster. ODPS is a large-scale distributed data processing service that supports the processing of massive data.
The first carrying cost parameter may be the hard-disk storage resources consumed in storing the conventional data table, and can be measured in storage amount TU, where 1 TU represents the cost of storing 1 TB of data for one day. The storage amount TU can also be obtained from the ODPS cluster metadata.
In the embodiments of the present application, in order to measure the complexity in units of CU and the storage amount in units of TU in a unified, comprehensive way, a new resource consumption unit, the resource unit, denoted CT, can be introduced. The conversion between the resource unit and the complexity CU is 1 CT = 4 CU; the conversion between the resource unit and the storage amount TU is 1 CT = 9 TU. For example, if processing a conventional data table consumes a complexity of 1 CU and a storage amount of 2 TU, the resources consumed by the conventional data table during processing are 1/4 + 2/9 ≈ 0.47 CT.
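The conversion into resource units can be written as a short Python sketch. The two conversion rates (1 CT = 4 CU, 1 CT = 9 TU) come from the text; the constant and function names are assumptions.

```python
CU_PER_CT = 4.0  # 1 CT = 4 CU
TU_PER_CT = 9.0  # 1 CT = 9 TU


def to_resource_units(cu, tu):
    """Convert complexity (CU) and storage amount (TU) into one CT figure."""
    return cu / CU_PER_CT + tu / TU_PER_CT


# The example from the text: 1 CU and 2 TU.
example_ct = to_resource_units(cu=1, tu=2)
```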
In another embodiment of the present application, the processing cost characteristic parameters may also include a first scanning cost parameter, and the sub-step of extracting the processing cost characteristic parameters of the conventional data table of the data common layer may further include:
counting the number of parent tables on which the conventional data table depends;
obtaining the scanning amount of the conventional data table on each parent table;
counting the number of all child tables under each parent table.
The sub-step of calculating the processing cost data of the conventional data table using the processing cost characteristic parameters may further include:
calculating the first scanning cost parameter from the number of parent tables on which the conventional data table depends, the scanning amount of the conventional data table on each parent table, and the number of all child tables under each parent table.
For example, referring to Fig. 2, the arrow between conventional data table C and conventional data table A indicates that conventional data table C needs to scan conventional data table A; that is, conventional data table A is a parent table of conventional data table C. The number on the arrow indicates that the scanning amount of child table C on parent table A is 1 TB. Parent table A has three child tables in total: conventional data table B, conventional data table C, and conventional data table D. The first scanning cost parameter can be calculated from these data.
In a specific implementation, the first scanning cost parameter can be calculated by the following formula:
where Cost(j) is the processing cost data of data table j;
data table j is one of the m parent tables on which data table i depends, numbered 1 ... m;
ScanSize(i, j) is the scanning amount of conventional data table i on parent table j;
data table m ranges over all child tables of parent table j, numbered 1 ... n.
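The formula itself is not reproduced in this text. One reconstruction that is consistent with the definitions above, and with the apportionment rule described earlier (a child table bears its parent's cost in proportion to its share of the parent's total scanned amount), is sketched below. The summation index k over the n child tables of parent table j is introduced here for illustration and is an assumption.

```latex
\mathrm{ScanCost}(i, j) = \mathrm{Cost}(j) \times
  \frac{\mathrm{ScanSize}(i, j)}{\sum_{k=1}^{n} \mathrm{ScanSize}(k, j)}
```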
In a preferred embodiment of the present application, the processing cost data of the conventional data table can be calculated from the first calculation cost parameter, the first carrying cost parameter, and the first scanning cost parameter.
In a specific implementation, the processing cost data of the conventional data table can be calculated by the following formula:
where ComputeCost(i) is the first calculation cost parameter of conventional data table i;
StorageCost(i) is the first carrying cost parameter of conventional data table i;
ScanCost(i, j) is the first scanning cost parameter of conventional data table i on parent table j.
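The formula is not reproduced in this text. A reconstruction consistent with the three terms defined above and with the worked figures for Fig. 2 would be the following, where m is assumed to be the number of parent tables of conventional data table i.

```latex
\mathrm{Cost}(i) = \mathrm{ComputeCost}(i) + \mathrm{StorageCost}(i)
  + \sum_{j=1}^{m} \mathrm{ScanCost}(i, j)
```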
Therefore, the processing cost data of each conventional data table in Fig. 2 can be calculated as follows:
Conventional data table A: 2/9 + 1/4 + 0 = 0.472 CT
Conventional data table B: 1/9 + 2/4 + 0.472 * (2 / (2+1+1)) = 0.845 CT
Conventional data table C: 2/9 + 2/4 + 0.472 * (1 / (2+1+1)) = 0.840 CT
Conventional data table D: 1/9 + 1/4 + 0.472 * (1 / (2+1+1)) = 0.479 CT
Conventional data table E: 0.5/9 + 3/4 + 0.854 * 2/2 + 0.840 * (1 / (1+5)) = 1.800 CT
Conventional data table F: 1/9 + 3/4 + 0.840 * (5 / (1+5)) = 1.561 CT
The above example is only intended to aid understanding of the embodiments of the present application and should not be construed as limiting the application. Those skilled in the art can obtain the corresponding processing cost data from the actual dependence relationships between the conventional data tables in a data common layer, using the method and formulas described in the embodiments of the present application.
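The worked figures for Fig. 2 can be recomputed with a short Python sketch. The per-table CU and TU values and the scan amounts below are read off the arithmetic expressions in the example rather than from the figure itself, and the recursive apportionment rule is inferred from those expressions; both are assumptions, as are all names in the code.

```python
from functools import lru_cache

# table -> (complexity in CU, storage amount in TU), inferred from the example
TABLES = {"A": (1, 2), "B": (2, 1), "C": (2, 2),
          "D": (1, 1), "E": (3, 0.5), "F": (3, 1)}

# (child, parent) -> scanning amount in TB, inferred from the example
SCANS = {("B", "A"): 2, ("C", "A"): 1, ("D", "A"): 1,
         ("E", "B"): 2, ("E", "C"): 1, ("F", "C"): 5}


def own_cost(table):
    """Calculation plus carrying cost in CT (1 CT = 4 CU = 9 TU)."""
    cu, tu = TABLES[table]
    return cu / 4 + tu / 9


@lru_cache(maxsize=None)
def cost(table):
    """Processing cost in CT: own cost plus a share of each parent's cost."""
    total = own_cost(table)
    for (child, parent), size in SCANS.items():
        if child != table:
            continue
        # Total amount scanned on this parent by all of its child tables.
        parent_total = sum(s for (_, p), s in SCANS.items() if p == parent)
        total += cost(parent) * size / parent_total
    return total


costs = {name: cost(name) for name in TABLES}
```

Caching with `lru_cache` means each table's cost is computed once even when several child tables share the same parent; the sketch assumes the dependence graph has no cycles.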
Step 102: determine the conventional data tables on which the external data table of the non-data common layer depends;
In the embodiments of the present application, for an external data table of the non-data common layer, the conventional data tables on which it depends can be determined first. Referring to Fig. 3, a schematic diagram of the relationship between conventional data tables and an external data table is shown. In Fig. 3, table A, table B, and table C are conventional data tables of the data common layer, and table D is an external data table of the non-data common layer. External data table D accesses conventional data table B and conventional data table C. The four numbers in the circle of each conventional data table are, respectively, its first calculation cost parameter, first carrying cost parameter, first scanning cost parameter, and total data storage amount.
For example, referring to Fig. 3, the first calculation cost parameter of conventional data table A is 1 CT, its first carrying cost parameter is 2 CT, its first scanning cost parameter is 2 CT, and its data storage amount is 10 TB. The number on the arrow between external data table D and conventional data table B indicates that external data table D scans 2 TB of data from conventional data table B.
The above is only one example of the relationship between conventional data tables and an external data table and should not be construed as limiting the application. Those skilled in the art can determine the actual dependence relationships and data scanning between external data tables and conventional data tables according to the actual situation, using the method described in the embodiments of the present application.
Step 103: calculate the use cost data of the external data table according to the processing cost data of the conventional data table;
In the embodiments of the present application, because a dependence relationship exists between the external data table and the conventional data table, the use cost data of the external data table can be calculated according to the processing cost data of the conventional data table. Specifically, the use cost data of the external data table can be calculated according to the processing cost characteristic parameters of the conventional data table.
In a preferred embodiment of the present application, the step of calculating the use cost data of the external data table according to the processing cost characteristic parameters of the conventional data table may specifically include:
extracting the processing cost characteristic parameters of the conventional data table on which the external data table of the non-data common layer depends;
calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters;
calculating the use cost data of the external data table using the use cost characteristic parameters.
In a specific implementation, once the conventional data table on which the external data table of the non-data common layer depends has been determined, the processing cost characteristic parameters of that conventional data table can be extracted. The use cost characteristic parameters of the external data table are then calculated according to the dependence relationship between the external data table and the conventional data table, which in turn yields the use cost data of the external data table.
Further, the use cost characteristic parameters may include a second calculation cost parameter, a second carrying cost parameter, and a second scanning cost parameter.
The second calculation cost parameter may be the CPU resources consumed when the external data table uses a conventional data table of the data common layer, and can likewise be measured in complexity CU; the second carrying cost parameter may be the hard-disk storage resources consumed in storing the conventional data table, and can be measured in storage amount TU; the second scanning cost parameter reflects the scanning relationship between the external data table and the conventional data tables of the data common layer.
In a preferred embodiment of the present application, the method may further include step 104 and step 105.
Step 104: when the processing cost data meets a first preset condition, extract the corresponding conventional data table;
Step 105: when the use cost data meets a second preset condition, extract the corresponding external data table.
In a specific implementation, after the processing cost data of the conventional data table and the use cost data of the external data table have been obtained, the processing cost data and the use cost data can be compared with the first preset condition and the second preset condition respectively, to determine whether the corresponding preset condition is met. If so, the corresponding conventional data table or external data table can be extracted.
For example, for a conventional data table of the data common layer, after the first calculation cost parameter, the first carrying cost parameter, and the first scanning cost parameter have been obtained, each can be checked against its preset condition. If the first carrying cost parameter is too high, reducing the storage amount of the conventional data table can be considered; if the first calculation cost parameter is high, the calculation logic of the conventional data table can be optimized to reduce computational complexity; if the first scanning cost parameter is high, the processing links of the conventional data table can be optimized to reduce useless scanning of parent-table data.
For an external data table of the non-data common layer, the obtained use cost data can be used to urge data users to read only the necessary data volume from the data common layer, reducing scanning of useless data, and to use deeper-level conventional data tables where possible, because deeper-level conventional data tables have all undergone deep processing in the data common layer and are high-quality tables.
In the embodiments of the present application, by taking the dependence relationships between data tables into account and introducing a scanning cost parameter, the way data table costs are assessed is improved. When the cost of each conventional data table of the data common layer is estimated, the storage and calculation consumption of the current data table is no longer considered in isolation; its upstream data tables and sibling data tables are considered as well. The processing cost of a conventional data table can therefore be assessed reasonably and accurately, which reflects the quality of the data model of the data common layer and provides decision support for optimizing and operating that model.
Secondly, in the embodiments of the present application, measuring the use cost of external data tables makes clear the storage, calculation, and scanning consumption incurred when external data tables access conventional data tables of the data common layer. This makes it easier to assess whether such access is reasonable and necessary, helps business departments optimize how they build their data tables, avoids the resource waste caused by duplicated data construction, improves data resource utilization, and reduces data cost, thereby saving cost overall.
Referring to Fig. 4, a flow chart of the steps of method embodiment two of spreadsheet analysis processing of the present application is shown. The data tables may include conventional data tables of the data common layer and external data tables of the non-data common layer. The method may specifically include the following steps:
Step 201: extract the processing cost characteristic parameters of the conventional data table of the data common layer;
In the embodiments of the present application, the processing cost characteristic parameters of the conventional data table may include a first calculation cost parameter, a first carrying cost parameter, and a first scanning cost parameter.
The first calculation cost parameter may be the CPU resources that the conventional data table consumes during data processing, measured in complexity CU; the first carrying cost parameter may be the hard-disk storage resources consumed in storing the conventional data table, measured in storage amount TU; the first scanning cost parameter reflects the scanning amount of the conventional data table on the conventional data tables associated with it, and can be calculated from the number of parent tables on which the conventional data table depends, the scanning amount of the conventional data table on each parent table, and the number of all child tables under each parent table.
In the embodiments of the present application, in order to measure the complexity in units of CU and the storage amount in units of TU in a unified, comprehensive way, a new resource consumption unit, the resource unit, denoted CT, can be introduced. The conversions between the resource unit and the complexity CU and storage amount TU may be: 1 CT = 4 CU, 1 CT = 9 TU.
Step 202: calculate the processing cost data of the conventional data table using the processing cost characteristic parameters;
In a specific implementation, the processing cost data of the conventional data table can be calculated by the following formula:
where ComputeCost(i) is the first calculation cost parameter of conventional data table i;
StorageCost(i) is the first carrying cost parameter of conventional data table i;
ScanCost(i, j) is the first scanning cost parameter of conventional data table i on parent table j.
Step 203: determine the conventional data tables on which the external data table of the non-data common layer depends;
For example, referring to Fig. 3, the conventional data tables on which external data table D of the non-data common layer depends include conventional data table B and conventional data table C.
Step 204: extract the processing cost characteristic parameters of the conventional data table on which the external data table of the non-data common layer depends;
In one embodiment of the present application, the use cost characteristic parameters may include a second calculation cost parameter; accordingly, the sub-step of extracting the processing cost characteristic parameters of the conventional data table on which the external data table of the non-data common layer depends may be: extracting the first calculation cost parameter of the conventional data table on which the external data table depends.
In another embodiment of the present application, the use cost characteristic parameters may also include a second carrying cost parameter; accordingly, the sub-step may also be: extracting the first carrying cost parameter of the conventional data table on which the external data table depends.
In another embodiment of the present application, the use cost characteristic parameters may also include a second scanning cost parameter; accordingly, the sub-step may be: extracting the first scanning cost parameter of the conventional data table on which the external data table depends.
For example, referring to Fig. 3, the conventional data tables on which the external data table depends are conventional data table B and conventional data table C. For the second calculation cost parameter, the first calculation cost parameters of conventional data table B and conventional data table C can be extracted respectively; the first calculation cost parameter of each of conventional data table B and conventional data table C is 1CT. For the second carrying cost parameter, the second carrying cost parameters of conventional data table B and conventional data table C can be extracted respectively; the second carrying cost parameter of conventional data table B is 1CT, and that of conventional data table C is 4CT. For the second scanning cost parameter, the second scanning cost parameters of conventional data table B and conventional data table C can be extracted respectively; the second scanning cost parameter of conventional data table B is 3CT, and that of conventional data table C is 2CT.
The above example is only intended to aid understanding of the embodiments of the present application and should not be regarded as limiting the application; those skilled in the art can obtain the corresponding results according to actual conditions by following the method described in the embodiments of the present application.
Step 205, calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters;
In one embodiment of the present application, the step of calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters may include:
Obtaining the calculation cost calculation factor between the external data table and the conventional data table on which it depends;
Correcting the first calculation cost parameter using the calculation cost calculation factor to obtain the second calculation cost parameter.
The same conventional data table may be used by multiple different external data tables, and different users use the same conventional data table differently: some users access a large volume of data with complex calculation, while others read only a small amount of data with very simple calculation. If the cost were shared equally, these two users would bear the same cost, which is clearly unfair and unreasonable. Therefore, in the embodiments of the present application, a calculation cost calculation factor is introduced, and the first calculation cost parameter is corrected with this factor to obtain the second calculation cost parameter. The calculation factor reflects, when the external data table uses the conventional data table, the proportion that the sublist's usage of the parent table makes up of the parent table's overall usage.
Specifically, the sub-step of obtaining the calculation cost calculation factor between the external data table and the conventional data table on which it depends may further include:
Obtaining, for each day in the most recent m days, the number of data tables that scanned the conventional data table, and the average number of sublists of the conventional data table over the most recent m days;
For example, the following formula can be used to calculate the calculation cost calculation factor, and thereby obtain the second calculation cost parameter:
Wherein, m indexes each day in the most recent m days;
Scanm(j) is the number of data tables that scanned conventional data table j on day m;
The denominator is, by way of example, the average number of sublists of conventional data table j over the most recent 90 days.
In another embodiment of the present application, the step of calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters may also include:
Obtaining the carrying cost calculation factor between the external data table and the conventional data table on which it depends;
Correcting the first carrying cost parameter using the carrying cost calculation factor to obtain the second carrying cost parameter.
Similar to the calculation of the second calculation cost parameter, the second carrying cost parameter can also be obtained by correcting the first carrying cost parameter with a carrying cost calculation factor.
Specifically, the sub-step of obtaining the carrying cost calculation factor between the external data table and the conventional data table on which it depends may further include:
Obtaining the scanning amount of the external data table with respect to the conventional data table on which it depends, and the k tables that have a dependence relationship with the conventional data table;
The following formula can be used to calculate the carrying cost calculation factor, and thereby obtain the second carrying cost parameter:
Wherein, scansize(i, j) is the scanning amount of external data table i with respect to conventional data table j;
m indexes the k tables that have a dependence relationship with conventional data table j, numbered 1 ... k.
In another embodiment of the present application, the step of calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters may also include:
Obtaining the scanning cost calculation factor between the external data table and the conventional data table on which it depends;
Correcting the first scanning cost parameter using the scanning cost calculation factor to obtain the second scanning cost parameter.
Similarly, the second scanning cost parameter can be obtained by obtaining a scanning cost calculation factor, which determines the proportion that the sublist's scanning amount of the parent table makes up of the parent table's total scanned amount, and using this proportion to adjust the first scanning cost parameter.
Specifically, the sub-step of obtaining the scanning cost calculation factor between the external data table and the conventional data table on which it depends may further include:
Obtaining the proportion of temperature fields in the conventional data table, and the dependence level of the conventional data table in the current data common layer;
For any conventional data table and any field a in that table, if the number of times field a is used by downstream data tables within a certain time period is greater than the number of direct downstream tables of the conventional data table, then field a is a temperature field (that is, a frequently used field) of the conventional data table. Accordingly, for any conventional data table, the ratio of the number of temperature fields to the total number of fields in the table is the proportion of temperature fields. The statistics period for temperature fields can generally be the past day.
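The temperature field definition above can be illustrated with a minimal Python sketch. The function name hot_ratio mirrors the symbol used in this description, but the input layout (a mapping from field name to its use count in the statistics period) and the sample field names and counts are assumptions made only for illustration.

```python
def hot_ratio(field_use_counts, direct_downstream_tables):
    """Share of fields used more often than the table has direct downstream tables.

    field_use_counts: hypothetical mapping of field name -> number of times the
    field was used by downstream data tables in the statistics period.
    direct_downstream_tables: number of direct downstream tables of the table.
    """
    if not field_use_counts:
        return 0.0
    hot_fields = [
        name
        for name, uses in field_use_counts.items()
        if uses > direct_downstream_tables  # strictly greater, as in the definition
    ]
    return len(hot_fields) / len(field_use_counts)


# Illustrative data only: a table with two direct downstream tables.
usage = {"user_id": 5, "order_id": 3, "remark": 1, "ext": 2}
ratio = hot_ratio(usage, direct_downstream_tables=2)
```

Here user_id and order_id exceed the count of direct downstream tables, so two of the four fields are temperature fields.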
The dependence level of a conventional data table reflects the dependence relationship between that conventional data table and the other conventional data tables in the current data common layer. Referring to Fig. 3, the data common layer includes three conventional data tables in total, namely conventional data table A, conventional data table B and conventional data table C. If the dependence level of conventional data table A is 1, the dependence level of conventional data table B and conventional data table C is 2.
In a specific implementation, the following formula can be used to calculate the scanning cost calculation factor, and thereby obtain the second scanning cost parameter:
Wherein, hot_ratio(j) is the ratio of the number of temperature fields of conventional data table j to the total number of fields in the table;
level(j) is the dependence level of conventional data table j in the data common layer.
Step 206, calculating the use cost data of the external data table using the use cost characteristic parameters;
In the embodiments of the present application, after the second calculation cost parameter, the second carrying cost parameter and the second scanning cost parameter of the external data table have been obtained respectively, the second calculation cost parameter, the second carrying cost parameter and the second scanning cost parameter can be summed to obtain the use cost data of the external data table.
In a specific implementation, the use cost data of the external data table can be calculated by the following formula:
Cost(i, j) = compcost(j) * compfac(i, j) + storcost(j) * storfac(i, j) + scancost(j) * scanfac(i, j)
Wherein, i is an external data table, j is a conventional data table, and a dependence relationship exists between data table i and data table j;
Cost(i, j) is the use cost data of external data table i using conventional data table j;
compcost(j) is the first calculation cost parameter in the processing cost data of conventional data table j;
compfac(i, j) is the calculation cost calculation factor between external data table i and conventional data table j;
storcost(j) is the first carrying cost parameter in the processing cost data of conventional data table j;
storfac(i, j) is the carrying cost calculation factor between external data table i and conventional data table j;
scancost(j) is the first scanning cost parameter in the processing cost data of conventional data table j;
scanfac(i, j) is the scanning cost calculation factor between external data table i and conventional data table j.
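As a minimal sketch of the Cost(i, j) formula above, the following Python fragment multiplies each first cost parameter by its calculation factor and sums the three resulting second cost parameters. The class names, field layout and numeric values are assumptions for illustration; only the symbol names compcost, storcost, scancost, compfac, storfac and scanfac come from the formula.

```python
from dataclasses import dataclass


@dataclass
class ProcessingCost:
    """First cost parameters of conventional data table j (hypothetical container)."""
    compcost: float
    storcost: float
    scancost: float


@dataclass
class Factors:
    """Calculation factors between external table i and conventional table j."""
    compfac: float
    storfac: float
    scanfac: float


def use_cost(p: ProcessingCost, f: Factors) -> float:
    """Cost(i, j): sum of the three corrected (second) cost parameters."""
    second_comp = p.compcost * f.compfac   # second calculation cost parameter
    second_stor = p.storcost * f.storfac   # second carrying cost parameter
    second_scan = p.scancost * f.scanfac   # second scanning cost parameter
    return second_comp + second_stor + second_scan


# Illustrative values only, not taken from the tables of this description.
cost_ij = use_cost(ProcessingCost(10.0, 4.0, 6.0), Factors(0.5, 0.25, 0.5))
```

With these illustrative inputs the three corrected parameters are 5.0, 1.0 and 3.0.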
Step 207, when the processing cost data meet a first preset condition, extracting the corresponding conventional data table;
Step 208, when the use cost data meet a second preset condition, extracting the corresponding external data table.
In a specific implementation, after the processing cost data of the conventional data table and the use cost data of the external data table have been obtained respectively, the conventional data table and the external data table can be analyzed according to the processing cost data and the use cost data, to determine whether the data tables need to be optimized.
In a preferred embodiment of the present application, the step of extracting the corresponding conventional data table when the processing cost data meet the first preset condition may include:
If the ratio of the first carrying cost parameter of a conventional data table to its first calculation cost parameter is higher than a first preset threshold, extracting the conventional data table;
And/or,
If the first calculation cost parameter of a conventional data table is higher than a second preset threshold, extracting the conventional data table;
And/or,
If the ratio of the first scanning cost parameter of a conventional data table to its first calculation cost parameter is higher than a third preset threshold, extracting the conventional data table;
And/or,
Computing the sum of the second calculation cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
If the first calculation cost parameter of the conventional data table is greater than the sum of the second calculation cost parameters, extracting the conventional data table;
And/or,
Computing the sum of the second carrying cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
If the first carrying cost parameter of the conventional data table is greater than the sum of the second carrying cost parameters, extracting the conventional data table;
And/or,
Computing the sum of the second scanning cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
If the first scanning cost parameter of the conventional data table is greater than the sum of the second scanning cost parameters, extracting the conventional data table.
For example, if the ratio of the first carrying cost parameter of a conventional data table to the first calculation cost parameter of that conventional data table is greater than 1/4, the carrying cost of the conventional data table can be considered high, and the conventional data table can be extracted so that reducing its storage amount can be considered.
If the first calculation cost parameter of the conventional data table is greater than 30CU, that is, CPU computation has exceeded 30min, optimizing the calculation logic of the conventional data table can be considered, so as to reduce the amount of calculation.
If the ratio of the first scanning cost parameter of the conventional data table to its first calculation cost parameter is greater than 10, the first scanning cost parameter can be considered high, and optimizing the processing steps of the conventional data table can be considered, so as to reduce the amount of useless data scanned from the parent table.
In addition, if the first calculation cost parameter of the conventional data table is greater than the sum of the calculation costs of all users of the conventional data table, or the first carrying cost parameter of the conventional data table is greater than the sum of the carrying costs of all users of the conventional data table, or the first scanning cost parameter of the conventional data table is greater than the sum of the scanning costs of all users of the conventional data table, the conventional data table can be identified and extracted for further processing.
The above example is only intended to aid understanding of the embodiments of the present application; those skilled in the art can determine the corresponding preset threshold values according to actual conditions, and the application does not limit this.
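The first preset condition can be sketched in Python as follows. The default thresholds 1/4, 30 and 10 are the example values given above; the function name, the dictionary keys and the reason labels are hypothetical, and the sample numbers are illustrative only.

```python
def flag_conventional_table(first, second_sums,
                            stor_ratio_max=0.25, comp_max=30.0, scan_ratio_max=10.0):
    """Return the reasons (possibly none) for extracting a conventional data table.

    first: hypothetical dict with the table's first cost parameters
           under the keys "comp", "stor" and "scan".
    second_sums: per-key sums of the second cost parameters of the external
           data tables that directly depend on this table.
    """
    reasons = []
    comp, stor, scan = first["comp"], first["stor"], first["scan"]
    if comp > 0 and stor / comp > stor_ratio_max:
        reasons.append("carrying_ratio")      # carrying cost high relative to calculation
    if comp > comp_max:
        reasons.append("calculation")         # calculation cost above the threshold
    if comp > 0 and scan / comp > scan_ratio_max:
        reasons.append("scanning_ratio")      # scanning cost high relative to calculation
    for key in ("comp", "stor", "scan"):
        if first[key] > second_sums[key]:
            reasons.append("unrecovered_" + key)  # cost exceeds what users account for
    return reasons


# Illustrative numbers only.
reasons = flag_conventional_table(
    {"comp": 8.0, "stor": 4.0, "scan": 16.0},
    {"comp": 8.0, "stor": 3.0, "scan": 20.0},
)
```

Because the conditions are joined by "and/or", the sketch reports every condition that holds rather than stopping at the first one.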
In another preferred embodiment of the present application, the step of extracting the corresponding external data table when the use cost data meet the second preset condition may include:
If the ratio of the second carrying cost parameter of an external data table to its second calculation cost parameter is higher than a fourth preset threshold, extracting the external data table;
And/or,
If an external data table can obtain, from other conventional data tables, the same data as from the current conventional data table, and the second scanning cost parameter when obtaining the data from the other conventional data tables is less than the second scanning cost parameter when obtaining the data from the current conventional data table, extracting the external data table.
For example, if the ratio of the second carrying cost parameter of the external data table to its second calculation cost parameter is greater than 1/4, the carrying cost of the external data table can be considered high, and the external data table can be extracted so that reducing its storage amount can be considered.
In addition, if the data on which the external data table depends can be obtained from another conventional data table, and the second scanning cost parameter when the external data table scans that other conventional data table is less than the second scanning cost parameter when the external data table scans the current conventional data table, optimizing the dependence relationship of the external data table can be considered, so as to reduce the scanning cost.
The above example is only intended to aid understanding of the embodiments of the present application; those skilled in the art can determine the corresponding preset threshold values according to actual conditions, and the application does not limit this.
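The second preset condition can likewise be sketched in Python. The default threshold 1/4 is the example value given above; the function name, the argument layout and the sample values are assumptions. In particular, scan_cost_by_source is assumed to list only conventional data tables from which the external data table can obtain the same data.

```python
def flag_external_table(second_comp, second_stor, scan_cost_by_source,
                        current_source, stor_ratio_max=0.25):
    """Return the reasons (possibly none) for extracting an external data table.

    scan_cost_by_source: hypothetical dict mapping each conventional data table
    that can supply the same data to the second scanning cost parameter the
    external table would incur by scanning it.
    current_source: the conventional data table currently depended on.
    """
    reasons = []
    if second_comp > 0 and second_stor / second_comp > stor_ratio_max:
        reasons.append("carrying_ratio")
    current_cost = scan_cost_by_source[current_source]
    cheaper = sorted(
        source
        for source, cost in scan_cost_by_source.items()
        if source != current_source and cost < current_cost
    )
    if cheaper:
        # A cheaper equivalent source exists, so the dependence could be changed.
        reasons.append("cheaper_source:" + ",".join(cheaper))
    return reasons


# Illustrative values only: the table currently scans B, but C would cost less.
ext_reasons = flag_external_table(4.0, 0.5, {"B": 3.0, "C": 2.0}, "B")
```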
In the embodiments of the present application, calculation factors are introduced so that the cost consumed by an upstream data table can be inherited by downstream data tables in a reasonable proportion; at the same time, by taking into account factors such as the storage amount, the scanning amount, the degree of reuse of a data table, the processing level of a data table and the proportion of temperature fields of a data table, the calculation of the use cost of an external data table becomes more reasonable and more accurate.
Secondly, the embodiments of the present application analyze the processing cost data of the conventional data tables and the use cost data of the external data tables and compare them with preset thresholds, so that data tables with excessive cost consumption can be specifically identified, which helps to further optimize those data tables and thereby save cost.
To make the above objects, features and advantages of the present application clearer and easier to understand, a preferred embodiment of the application is described in detail below using a complete example.
Suppose there are six data tables A, B, C, D, E and F, whose scanning relationships with one another are shown in Table one below:
Table one:
In Table one: the data common layer includes four conventional data tables, namely conventional data table A, conventional data table B, conventional data table C and conventional data table D; the non-data common layer has two external data tables in total, namely external data table E and external data table F.
Wherein, the first row of data in Table one can be understood as: the storage amount of conventional data table B is 10TB, the storage amount of conventional data table A is 20TB, and conventional data table B scans 1TB of data from conventional data table A. Conventional data table A has three sublists in total.
The second row of data in Table one can be understood as: the storage amount of conventional data table C is 6TB, the storage amount of conventional data table B is 10TB, and conventional data table C scans 2TB of data from conventional data table B. Conventional data table B has two sublists in total.
The fourth row of data in Table one can be understood as: the storage amount of external data table E is 12TB, the storage amount of conventional data table C is 6TB, and external data table E scans 2TB of data from conventional data table C. Conventional data table C has four sublists in total.
According to the above scanning relationships, another schematic diagram of the relationships between conventional data tables and external data tables of the present application can be constructed, as shown in Fig. 5.
According to the calculation formula for conventional data table processing cost data shown below
the conventional data table processing cost data shown in Table two below can be obtained:
Table two:
Meanwhile, according to the calculation formula for external data table use cost data shown below
Cost(i, j) = compcost(j) * compfac(i, j) + storcost(j) * storfac(i, j) + scancost(j) * scanfac(i, j)
the external data table use cost data shown in Table three below can be obtained:
Table three:
Then the processing cost data of the above conventional data tables and the use cost data of the external data tables are compared with the preset conditions, so as to extract the conventional data tables and external data tables shown in Table four below:
Table four:
The above example is only intended to aid understanding of the method described in the present application and should not be regarded as limiting the application; those skilled in the art can, according to the actual dependence relationships between data tables and following the method and formulas described in the present application, determine the processing cost data of the conventional data tables and the use cost data of the external data tables, and thereby identify, according to the processing cost data and the use cost data, whether the data tables need to be optimized.
It should be noted that, for brevity of description, the method embodiments are expressed as a series of combined actions, but those skilled in the art should know that the embodiments of the present application are not limited by the described order of actions, because according to the embodiments of the present application some steps can be performed in other orders or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present application.
Referring to Fig. 6, a structural block diagram of a device embodiment for data table analysis processing of the present application is shown, wherein the data tables may include conventional data tables of a data common layer and external data tables of a non-data common layer, and the device may specifically include the following modules:
Processing cost computing module 301, configured to calculate processing cost data for the conventional data tables of the data common layer;
Determining module 302, configured to determine the conventional data tables on which the external data table of the non-data common layer depends;
Use cost computing module 303, configured to calculate the use cost data of the external data table according to the processing cost data of the conventional data table.
In the embodiments of the present application, the processing cost computing module 301 may specifically include the following submodules:
Processing cost characteristic parameter extraction submodule 3011, configured to extract the processing cost characteristic parameters of the conventional data tables of the data common layer;
Processing cost calculating submodule 3012, configured to calculate the processing cost data of the conventional data table using the processing cost characteristic parameters.
In one embodiment of the present application, the processing cost characteristic parameter may include a first scanning cost parameter, and the processing cost characteristic parameter extraction submodule 3011 may further include the following units:
Parent table quantity statistics unit 111A, configured to count the number of parent tables on which the conventional data table depends;
Scanning amount acquiring unit 111B, configured to obtain the scanning amount of the conventional data table with respect to the parent table;
Sublist quantity statistics unit 111C, configured to count the number of all sublists under the parent table;
The processing cost calculating submodule 3012 may further include the following unit:
First scanning cost computing unit 121A, configured to calculate the first scanning cost parameter using the number of parent tables on which the conventional data table depends, the scanning amount of the conventional data table with respect to the parent table, and the number of all sublists under the parent table.
In another embodiment of the present application, the processing cost characteristic parameter may also include a first calculation cost parameter and a first carrying cost parameter, and the processing cost characteristic parameter extraction submodule 3011 may further include the following units:
First calculation cost parameter extraction unit 112A, configured to extract the complexity CU of the conventional data table as the first calculation cost parameter;
First carrying cost parameter extraction unit 113A, configured to extract the storage amount of the conventional data table as the first carrying cost parameter.
In the embodiments of the present application, the first scanning cost parameter can be calculated by the following formula using the number of parent tables on which the conventional data table depends, the scanning amount of the conventional data table with respect to the parent table, and the number of all sublists under the parent table:
Wherein, Cost(j) is the processing cost data of data table j,
data table j ranges over the m parent tables on which data table i depends, numbered 1 ... m,
ScanSize(i, j) is the scanning amount of conventional data table i with respect to parent table j,
data table m ranges over all sublists of parent table j, numbered 1 ... n.
In the embodiments of the present application, the processing cost data of the conventional data table can be calculated from the processing cost characteristic parameters by the following formula:
Wherein, ComputeCost(i) is the first calculation cost parameter of conventional data table i;
StorageCost(i) is the first carrying cost parameter of conventional data table i;
ScanCost(i, j) is the first scanning cost parameter of conventional data table i with respect to parent table j.
In the embodiments of the present application, the use cost computing module 303 may specifically include the following submodule:
Use cost calculating submodule 3031, configured to calculate the use cost data of the external data table according to the processing cost characteristic parameters of the conventional data table.
In the embodiments of the present application, the use cost calculating submodule 3031 may specifically include the following units:
Processing cost characteristic parameter extraction unit 311, configured to extract the processing cost characteristic parameters of the conventional data tables on which the external data table of the non-data common layer depends;
Use cost characteristic parameter computing unit 312, configured to calculate the use cost characteristic parameters of the external data table using the processing cost characteristic parameters;
Use cost data computing unit 313, configured to calculate the use cost data of the external data table using the use cost characteristic parameters.
In the embodiments of the present application, the use cost characteristic parameter includes a second calculation cost parameter;
The processing cost characteristic parameter extraction unit 311 may specifically include the following subunit:
First calculation cost parameter extraction subunit 311A, configured to extract the first calculation cost parameter of the conventional data table on which the external data table depends;
The use cost characteristic parameter computing unit 312 may specifically include the following subunits:
Calculation cost calculation factor obtaining subunit 312A, configured to obtain the calculation cost calculation factor between the external data table and the conventional data table on which it depends;
Second calculation cost parameter computing subunit 312B, configured to correct the first calculation cost parameter using the calculation cost calculation factor to obtain the second calculation cost parameter.
In the embodiments of the present application, the use cost characteristic parameter may also include a second carrying cost parameter;
The processing cost characteristic parameter extraction unit 311 may specifically include the following subunit:
First carrying cost parameter extraction subunit 311B, configured to extract the first carrying cost parameter of the conventional data table on which the external data table depends;
The use cost characteristic parameter computing unit 312 may also include the following subunits:
Carrying cost calculation factor obtaining subunit 312C, configured to obtain the carrying cost calculation factor between the external data table and the conventional data table on which it depends;
Second carrying cost parameter computing subunit 312D, configured to correct the first carrying cost parameter using the carrying cost calculation factor to obtain the second carrying cost parameter.
In the embodiments of the present application, the use cost characteristic parameter may also include a second scanning cost parameter;
The processing cost characteristic parameter extraction unit 311 may also include the following subunit:
First scanning cost parameter extraction subunit 311C, configured to extract the first scanning cost parameter of the conventional data table on which the external data table depends;
The use cost characteristic parameter computing unit 312 may also include the following subunits:
Scanning cost calculation factor obtaining subunit 312E, configured to obtain the scanning cost calculation factor between the external data table and the conventional data table on which it depends;
Second scanning cost parameter computing subunit 312F, configured to correct the first scanning cost parameter using the scanning cost calculation factor to obtain the second scanning cost parameter.
In the embodiments of the present application, the calculation cost calculation factor obtaining subunit 312A may be further configured to:
Obtain, for each day in the most recent m days, the number of data tables that scanned the conventional data table, and the average number of sublists of the conventional data table over the most recent m days;
Calculate the calculation cost calculation factor by the following formula, according to the number of data tables that scanned the conventional data table on each day in the most recent m days and the average number of sublists of the conventional data table over the most recent m days:
Wherein, m indexes each day in the most recent m days;
Scanm(j) is the number of data tables that scanned conventional data table j on day m;
The denominator is, by way of example, the average number of sublists of conventional data table j over the most recent 90 days.
In the embodiments of the present application, the carrying cost calculation factor obtaining subunit 312C may be further configured to:
Obtain the scanning amount of the external data table with respect to the conventional data table on which it depends, and the k tables that have a dependence relationship with the conventional data table;
Calculate the carrying cost calculation factor by the following formula, according to the scanning amount of the external data table with respect to the conventional data table on which it depends and the k tables that have a dependence relationship with the conventional data table:
Wherein, scansize(i, j) is the scanning amount of external data table i with respect to conventional data table j;
m indexes the k tables that have a dependence relationship with conventional data table j, numbered 1 ... k.
In the embodiments of the present application, the scanning cost calculation factor obtaining subunit 312E may be further configured to:
Obtain the proportion of temperature fields in the conventional data table, and the dependence level of the conventional data table in the current data common layer;
Calculate the scanning cost calculation factor by the following formula, according to the proportion of temperature fields in the conventional data table and the dependence level of the conventional data table in the current data common layer:
Wherein, hot_ratio(j) is the ratio of the number of temperature fields of conventional data table j to the total number of fields in the table;
level(j) is the dependence level of conventional data table j in the data common layer.
In the embodiments of the present application, the use cost data of the external data table can be calculated from the use cost characteristic parameters by the following formula:
Cost(i, j) = compcost(j) * compfac(i, j) + storcost(j) * storfac(i, j) + scancost(j) * scanfac(i, j)
Wherein, i is an external data table, j is a conventional data table, and a dependence relationship exists between data table i and data table j;
Cost(i, j) is the use cost data of external data table i using conventional data table j;
compcost(j) is the first calculation cost parameter in the processing cost data of conventional data table j;
compfac(i, j) is the calculation cost calculation factor between external data table i and conventional data table j;
storcost(j) is the first carrying cost parameter in the processing cost data of conventional data table j;
storfac(i, j) is the carrying cost calculation factor between external data table i and conventional data table j;
scancost(j) is the first scanning cost parameter in the processing cost data of conventional data table j;
scanfac(i, j) is the scanning cost calculation factor between external data table i and conventional data table j.
In an embodiment of the present application, the device may further include the following module:
a first extraction module 304, configured to extract the corresponding conventional data table when the processing cost data meets a first preset condition.
In an embodiment of the present application, the first extraction module 304 may specifically include the following submodules:
a first extracting submodule 3041, configured to extract a conventional data table when the ratio of its first carrying cost parameter to its first calculation cost parameter is higher than a first predetermined threshold;
and/or,
a second extracting submodule 3042, configured to extract a conventional data table when its first calculation cost parameter is higher than a second predetermined threshold;
and/or,
a third extracting submodule 3043, configured to extract a conventional data table when the ratio of its first scanning cost parameter to its first calculation cost parameter is higher than a third predetermined threshold;
and/or,
a fourth statistic submodule 3044, configured to sum the second calculation cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
a fourth extracting submodule 3045, configured to extract the conventional data table when its first calculation cost parameter is greater than the sum of the second calculation cost parameters;
and/or,
a fifth statistic submodule 3046, configured to sum the second carrying cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
a fifth extracting submodule 3047, configured to extract the conventional data table when its first carrying cost parameter is greater than the sum of the second carrying cost parameters;
and/or,
a sixth statistic submodule 3048, configured to sum the second scanning cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
a sixth extracting submodule 3049, configured to extract the conventional data table when its first scanning cost parameter is greater than the sum of the second scanning cost parameters.
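The six checks listed for the first extraction module can be sketched as follows. This is a minimal illustration under stated assumptions: the names `CostParams` and `first_condition_hits` are hypothetical, the thresholds `t1`, `t2`, `t3` stand for the first to third predetermined thresholds, and skipping the ratio checks when the calculation cost is zero is an assumption of the sketch, since the text does not address that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CostParams:
    """Calculation, carrying (storage) and scanning cost parameters of one table."""
    comp: float
    stor: float
    scan: float


def first_condition_hits(first: CostParams, downstream: List[CostParams],
                         t1: float, t2: float, t3: float) -> List[int]:
    """Return the numbers (1 to 6) of the checks that the table satisfies.

    `first` holds the first cost parameters of a conventional data table;
    `downstream` holds the second cost parameters of the external data
    tables that depend on it directly.
    """
    hits = []
    # Checks 1 and 3 are ratios; they are skipped when comp == 0
    # (an assumption of this sketch, not stated in the text).
    if first.comp > 0 and first.stor / first.comp > t1:
        hits.append(1)
    if first.comp > t2:
        hits.append(2)
    if first.comp > 0 and first.scan / first.comp > t3:
        hits.append(3)
    # Checks 4 to 6 compare each first parameter with the downstream sum.
    if first.comp > sum(d.comp for d in downstream):
        hits.append(4)
    if first.stor > sum(d.stor for d in downstream):
        hits.append(5)
    if first.scan > sum(d.scan for d in downstream):
        hits.append(6)
    return hits


# Made-up values: storage/calculation ratio is 5, downstream sums are 7, 60, 2.
hits = first_condition_hits(
    CostParams(10.0, 50.0, 5.0),
    [CostParams(3.0, 20.0, 1.0), CostParams(4.0, 40.0, 1.0)],
    t1=4.0, t2=100.0, t3=1.0,
)
```

Because the submodules are joined by "and/or", a table with a non-empty result would be extracted; returning the check numbers rather than a single boolean keeps the reason for extraction visible.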
In an embodiment of the present application, the device may further include the following module:
a second extraction module 305, configured to extract the corresponding external data table when the use cost data meets a second preset condition.
In an embodiment of the present application, the second extraction module 305 may specifically include the following submodules:
a seventh extracting submodule 3051, configured to extract an external data table when the ratio of its second carrying cost parameter to its second calculation cost parameter is higher than a fourth predetermined threshold;
and/or,
an eighth extracting submodule 3052, configured to extract an external data table when it can obtain from another conventional data table the same data as from the current conventional data table, and the second scanning cost parameter when obtaining the data through the other conventional data table is less than the second scanning cost parameter when obtaining the data from the current conventional data table.
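A comparable sketch for the two checks of the second extraction module is given below. The function name `second_condition_hits`, its parameters, and the string labels are hypothetical; the sketch assumes that the second scanning cost parameters for every alternative conventional data table offering the same data have already been computed, and it skips the ratio check when the second calculation cost is zero (an assumption not stated in the text).

```python
from typing import List


def second_condition_hits(second_stor: float, second_comp: float,
                          current_scan: float, alternative_scans: List[float],
                          t4: float) -> List[str]:
    """Return labels of the checks that an external data table satisfies.

    second_stor, second_comp: second carrying and calculation cost parameters.
    current_scan: second scanning cost when reading the current conventional table.
    alternative_scans: second scanning costs when reading other conventional
        tables that provide the same data (empty if there are none).
    t4: the fourth predetermined threshold.
    """
    hits = []
    # Ratio check; skipped for a zero calculation cost (assumption of this sketch).
    if second_comp > 0 and second_stor / second_comp > t4:
        hits.append("storage_to_calculation_ratio")
    # A cheaper source exists if any alternative scan cost is lower.
    if any(alt < current_scan for alt in alternative_scans):
        hits.append("cheaper_alternative_source")
    return hits


# Made-up values: ratio 3 exceeds threshold 2, and one alternative (5) beats 8.
both = second_condition_hits(30.0, 10.0, 8.0, [9.0, 5.0], t4=2.0)
```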
The device embodiment is described relatively briefly because it is substantially similar to the method embodiment; for relevant details, refer to the corresponding description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a device, or a computer program product. Accordingly, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
In a typical configuration, the computer device includes one or more processors (CPUs), an input/output interface, a network interface, and memory. The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium. Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of the method, the terminal device (system), and the computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to work in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operation steps is performed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Finally, it should also be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes the element.
The method of data table analysis processing and the device of data table analysis processing provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. Meanwhile, those of ordinary skill in the art may, in accordance with the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.
Claims (38)
1. A method of data table analysis processing, characterized in that the data tables include a conventional data table of a data common layer and an external data table of a non-data common layer, the method comprising:
calculating processing cost data for the conventional data table of the data common layer;
determining the conventional data table on which the external data table of the non-data common layer depends;
calculating use cost data of the external data table according to the processing cost data of the conventional data table.
2. The method according to claim 1, characterized in that the step of calculating processing cost data for the conventional data table of the data common layer comprises:
extracting processing cost characteristic parameters of the conventional data table of the data common layer;
calculating the processing cost data of the conventional data table using the processing cost characteristic parameters.
3. The method according to claim 2, characterized in that the processing cost characteristic parameters include a first scanning cost parameter, and the sub-step of extracting the processing cost characteristic parameters of the conventional data table of the data common layer further comprises:
counting the number of parent tables on which the conventional data table depends;
obtaining the scanning amount of the conventional data table on the parent table;
counting the number of all sublists under the parent table;
the sub-step of calculating the processing cost data of the conventional data table using the processing cost characteristic parameters further comprises:
calculating the first scanning cost parameter using the number of parent tables on which the conventional data table depends, the scanning amount of the conventional data table on the parent table, and the number of all sublists under the parent table.
4. The method according to claim 3, characterized in that the processing cost characteristic parameters further include a first calculation cost parameter and a first carrying cost parameter, and the sub-step of extracting the processing cost characteristic parameters of the conventional data table of the data common layer further comprises:
extracting the complexity CU of the conventional data table as the first calculation cost parameter;
extracting the storage amount of the conventional data table as the first carrying cost parameter.
5. The method according to claim 3 or 4, characterized in that the first scanning cost parameter is calculated from the number of parent tables on which the conventional data table depends, the scanning amount of the conventional data table on the parent table, and the number of all sublists under the parent table, by the following formula:
wherein Cost(j) is the processing cost data of data table j;
data tables i are the m parent tables on which data table j depends, numbered 1 ... m;
ScanSize(i, j) is the scanning amount of conventional data table i on parent table j;
data tables m are all the sublists of parent table j, numbered 1 ... n.
6. The method according to claim 5, characterized in that the processing cost data of the conventional data table is calculated from the processing cost characteristic parameters by the following formula:
wherein ComputeCost(i) is the first calculation cost parameter of conventional data table i;
StorageCost(i) is the first carrying cost parameter of conventional data table i;
ScanCost(i, j) is the first scanning cost parameter of conventional data table i on parent table j.
7. The method according to claim 2 or 3 or 4, characterized in that the step of calculating the use cost data of the external data table according to the processing cost data of the conventional data table is:
calculating the use cost data of the external data table according to the processing cost characteristic parameters of the conventional data table.
8. The method according to claim 7, characterized in that the step of calculating the use cost data of the external data table according to the processing cost characteristic parameters of the conventional data table comprises:
extracting the processing cost characteristic parameters of the conventional data table on which the external data table of the non-data common layer depends;
calculating use cost characteristic parameters of the external data table using the processing cost characteristic parameters;
calculating the use cost data of the external data table using the use cost characteristic parameters.
9. The method according to claim 8, characterized in that the use cost characteristic parameters include a second calculation cost parameter;
the sub-step of extracting the processing cost characteristic parameters of the conventional data table on which the external data table of the non-data common layer depends is:
extracting the first calculation cost parameter of the conventional data table on which the external data table depends;
the step of calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters comprises:
obtaining the calculation cost calculation factor between the external data table and the conventional data table on which it depends;
correcting the first calculation cost parameter using the calculation cost calculation factor to obtain the second calculation cost parameter.
10. The method according to claim 9, characterized in that the use cost characteristic parameters include a second carrying cost parameter;
the sub-step of extracting the processing cost characteristic parameters of the conventional data table on which the external data table of the non-data common layer depends is:
extracting the first carrying cost parameter of the conventional data table on which the external data table depends;
the step of calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters further comprises:
obtaining the carrying cost calculation factor between the external data table and the conventional data table on which it depends;
correcting the first carrying cost parameter using the carrying cost calculation factor to obtain the second carrying cost parameter.
11. The method according to claim 10, characterized in that the use cost characteristic parameters include a second scanning cost parameter;
the sub-step of extracting the processing cost characteristic parameters of the conventional data table on which the external data table of the non-data common layer depends is:
extracting the first scanning cost parameter of the conventional data table on which the external data table depends;
the step of calculating the use cost characteristic parameters of the external data table using the processing cost characteristic parameters further comprises:
obtaining the scanning cost calculation factor between the external data table and the conventional data table on which it depends;
correcting the first scanning cost parameter using the scanning cost calculation factor to obtain the second scanning cost parameter.
12. The method according to claim 9, characterized in that the sub-step of obtaining the calculation cost calculation factor between the external data table and the conventional data table on which it depends further comprises:
obtaining the number of data tables that scanned the conventional data table on each day of the most recent m days, and the average number of sublists of the conventional data table over the most recent m days;
calculating the calculation cost calculation factor from the number of data tables that scanned the conventional data table on each day of the most recent m days and the average number of sublists of the conventional data table over the most recent m days, using the following formula:
wherein m indexes each day of the most recent m days;
Scan_m(j) is the number of data tables that scanned conventional data table j on the m-th day;
the denominator is the average number of sublists of conventional data table j, taking the most recent 90 days as an example.
13. The method according to claim 10, characterized in that the sub-step of obtaining the carrying cost calculation factor between the external data table and the conventional data table on which it depends further comprises:
obtaining the scanning amount of the external data table on the conventional data table on which it depends, and the k tables that have a dependence relationship with the conventional data table;
calculating the carrying cost calculation factor from the scanning amount of the external data table on the conventional data table on which it depends and the k tables that have a dependence relationship with the conventional data table, using the following formula:
wherein scansize(i, j) is the scanning amount of external data table i on conventional data table j;
m denotes the k tables that have a dependence relationship with conventional data table j, numbered 1 ... k.
14. The method according to claim 11, characterized in that the sub-step of obtaining the scanning cost calculation factor between the external data table and the conventional data table on which it depends further comprises:
obtaining the proportion of temperature fields in the conventional data table, and the dependence level of the conventional data table in the current data common layer, wherein a temperature field is a field whose number of uses within a certain time period is greater than the number of direct downstream data tables of the conventional data table;
calculating the scanning cost calculation factor from the proportion of temperature fields in the conventional data table and the level of the conventional data table in the current data common layer, using the following formula:
wherein hot_ratio(j) is the ratio of the number of temperature fields of conventional data table j to the total number of fields in the table;
level(j) is the dependence level of conventional data table j in the data common layer.
15. The method according to claim 12 or 13 or 14, characterized in that the use cost data of the external data table is calculated from the use cost characteristic parameters by the following formula:
Cost(i, j) = compcost(j) * compfac(i, j) + storcost(j) * storfac(i, j) + scancost(j) * scanfac(i, j)
wherein i is an external data table, j is a conventional data table, and a dependence exists between data table i and data table j;
Cost(i, j) is the use cost data of external data table i using conventional data table j;
compcost(j) is the first calculation cost parameter in the processing cost data of conventional data table j;
compfac(i, j) is the calculation cost calculation factor between external data table i and conventional data table j;
storcost(j) is the first carrying cost parameter in the processing cost data of conventional data table j;
storfac(i, j) is the carrying cost calculation factor between external data table i and conventional data table j;
scancost(j) is the first scanning cost parameter in the processing cost data of conventional data table j;
scanfac(i, j) is the scanning cost calculation factor between external data table i and conventional data table j.
16. The method according to claim 1 or 2 or 3 or 4 or 6 or 8 or 9 or 10 or 11 or 12 or 13 or 14, characterized by further comprising:
extracting the corresponding conventional data table when the processing cost data meets a first preset condition.
17. The method according to claim 16, characterized in that the step of extracting the corresponding conventional data table when the processing cost data meets the first preset condition comprises:
if the ratio of the first carrying cost parameter to the first calculation cost parameter of a conventional data table is higher than a first predetermined threshold, extracting the conventional data table;
and/or,
if the first calculation cost parameter of a conventional data table is higher than a second predetermined threshold, extracting the conventional data table;
and/or,
if the ratio of the first scanning cost parameter to the first calculation cost parameter of a conventional data table is higher than a third predetermined threshold, extracting the conventional data table;
and/or,
summing the second calculation cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
if the first calculation cost parameter of the conventional data table is greater than the sum of the second calculation cost parameters, extracting the conventional data table;
and/or,
summing the second carrying cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
if the first carrying cost parameter of the conventional data table is greater than the sum of the second carrying cost parameters, extracting the conventional data table;
and/or,
summing the second scanning cost parameters of the external data tables that have a direct dependence relationship with a conventional data table;
if the first scanning cost parameter of the conventional data table is greater than the sum of the second scanning cost parameters, extracting the conventional data table.
18. The method according to claim 1 or 2 or 3 or 4 or 6 or 8 or 9 or 10 or 11 or 12 or 13 or 14 or 17, characterized by further comprising:
extracting the corresponding external data table when the use cost data meets a second preset condition.
19. The method according to claim 18, characterized in that the step of extracting the corresponding external data table when the use cost data meets the second preset condition comprises:
if the ratio of the second carrying cost parameter to the second calculation cost parameter of an external data table is higher than a fourth predetermined threshold, extracting the external data table;
and/or,
if an external data table can obtain from another conventional data table the same data as from the current conventional data table, and the second scanning cost parameter when obtaining the data through the other conventional data table is less than the second scanning cost parameter when obtaining the data from the current conventional data table, extracting the external data table.
20. A device of data table analysis processing, characterized in that the data tables include a conventional data table of a data common layer and an external data table of a non-data common layer, the device comprising:
a processing cost computing module, configured to calculate processing cost data for the conventional data table of the data common layer;
a determining module, configured to determine the conventional data table on which the external data table of the non-data common layer depends;
a use cost computing module, configured to calculate use cost data of the external data table according to the processing cost data of the conventional data table.
21. The device according to claim 20, characterized in that the processing cost computing module comprises:
a processing cost characteristic parameter extraction submodule, configured to extract processing cost characteristic parameters of the conventional data table of the data common layer;
a processing cost calculating submodule, configured to calculate the processing cost data of the conventional data table using the processing cost characteristic parameters.
22. The device according to claim 21, characterized in that the processing cost characteristic parameters include a first scanning cost parameter, and the processing cost characteristic parameter extraction submodule further comprises:
a parent table quantity statistics unit, configured to count the number of parent tables on which the conventional data table depends;
a scanning amount acquiring unit, configured to obtain the scanning amount of the conventional data table on the parent table;
a sublist quantity statistics unit, configured to count the number of all sublists under the parent table;
the processing cost calculating submodule further comprises:
a first scanning cost computing unit, configured to calculate the first scanning cost parameter using the number of parent tables on which the conventional data table depends, the scanning amount of the conventional data table on the parent table, and the number of all sublists under the parent table.
23. The device according to claim 22, characterized in that the processing cost characteristic parameters further include a first calculation cost parameter and a first carrying cost parameter, and the processing cost characteristic parameter extraction submodule further comprises:
a first calculation cost parameter extraction unit, configured to extract the complexity CU of the conventional data table as the first calculation cost parameter;
a first carrying cost parameter extraction unit, configured to extract the storage amount of the conventional data table as the first carrying cost parameter.
24. The device according to claim 22 or 23, characterized in that the first scanning cost parameter is calculated from the number of parent tables on which the conventional data table depends, the scanning amount of the conventional data table on the parent table, and the number of all sublists under the parent table, by the following formula:
wherein Cost(j) is the processing cost data of data table j;
data tables i are the m parent tables on which data table j depends, numbered 1 ... m;
ScanSize(i, j) is the scanning amount of conventional data table i on parent table j;
data tables m are all the sublists of parent table j, numbered 1 ... n.
25. The device according to claim 24, characterized in that the processing cost data of the conventional data table is calculated from the processing cost characteristic parameters by the following formula:
wherein ComputeCost(i) is the first calculation cost parameter of conventional data table i;
StorageCost(i) is the first carrying cost parameter of conventional data table i;
ScanCost(i, j) is the first scanning cost parameter of conventional data table i on parent table j.
26. The device according to claim 21 or 22 or 23, characterized in that the use cost computing module comprises:
a use cost calculating submodule, configured to calculate the use cost data of the external data table according to the processing cost characteristic parameters of the conventional data table.
27. The device according to claim 26, characterized in that the use cost calculating submodule comprises:
a processing cost characteristic parameter extraction unit, configured to extract the processing cost characteristic parameters of the conventional data table on which the external data table of the non-data common layer depends;
a use cost characteristic parameter computing unit, configured to calculate use cost characteristic parameters of the external data table using the processing cost characteristic parameters;
a use cost data computing unit, configured to calculate the use cost data of the external data table using the use cost characteristic parameters.
28. The device according to claim 27, characterized in that the use cost characteristic parameters include a second calculation cost parameter;
the processing cost characteristic parameter extraction unit comprises:
a first calculation cost parameter extraction subunit, configured to extract the first calculation cost parameter of the conventional data table on which the external data table depends;
the use cost characteristic parameter computing unit comprises:
a calculation cost calculation factor obtaining subunit, configured to obtain the calculation cost calculation factor between the external data table and the conventional data table on which it depends;
a second calculation cost parameter computing subunit, configured to correct the first calculation cost parameter using the calculation cost calculation factor to obtain the second calculation cost parameter.
29. The device according to claim 28, characterized in that the use cost characteristic parameters include a second carrying cost parameter;
the processing cost characteristic parameter extraction unit comprises:
a first carrying cost parameter extraction subunit, configured to extract the first carrying cost parameter of the conventional data table on which the external data table depends;
the use cost characteristic parameter computing unit further comprises:
a carrying cost calculation factor obtaining subunit, configured to obtain the carrying cost calculation factor between the external data table and the conventional data table on which it depends;
a second carrying cost parameter computing subunit, configured to correct the first carrying cost parameter using the carrying cost calculation factor to obtain the second carrying cost parameter.
30. The device according to claim 29, characterized in that the use cost characteristic parameters include a second scanning cost parameter;
the processing cost characteristic parameter extraction unit comprises:
a first scanning cost parameter extraction subunit, configured to extract the first scanning cost parameter of the conventional data table on which the external data table depends;
the use cost characteristic parameter computing unit further comprises:
a scanning cost calculation factor obtaining subunit, configured to obtain the scanning cost calculation factor between the external data table and the conventional data table on which it depends;
a second scanning cost parameter computing subunit, configured to correct the first scanning cost parameter using the scanning cost calculation factor to obtain the second scanning cost parameter.
31. The device according to claim 28, characterized in that the calculating cost calculation factor obtaining subunit is further configured to:
obtain the number of data tables that scanned the conventional data table on each day of the most recent m days, and the average number of sub-tables of the conventional data table over the most recent m days;
calculate the calculating cost calculation factor using the following formula, according to the number of data tables that scanned the conventional data table on each day of the most recent m days and the average number of sub-tables of the conventional data table over the most recent m days:
Wherein, m is each day in the most recent m days;
scan_m(j) is the number of data tables that scanned conventional data table j on day m;
the denominator is the average number of sub-table instances of conventional data table j over the most recent 90 days.
32. The device according to claim 29, characterized in that the carrying cost calculation factor obtaining subunit is further configured to:
obtain the scanning amount of the external data table on the conventional data table on which it depends, and the k tables that have a dependency relationship with the conventional data table;
calculate the carrying cost calculation factor using the following formula, according to the scanning amount of the external data table on the conventional data table on which it depends and the k tables that have a dependency relationship with the conventional data table:
Wherein, scansize(i, j) is the scanning amount of external data table i on conventional data table j;
m indexes the k tables that have a dependency relationship with conventional data table j, numbered 1 ... k.
33. The device according to claim 30, characterized in that the scanning cost calculation factor obtaining subunit is further configured to:
obtain the proportion of temperature fields in the conventional data table, and the dependence level of the conventional data table in the current data common layer, where a temperature field is a field whose number of uses within a certain time period is greater than the number of direct downstream data tables of the conventional data table;
calculate the scanning cost calculation factor using the following formula, according to the proportion of temperature fields in the conventional data table and the dependence level of the conventional data table in the current data common layer:
Wherein, hot_ratio(j) is the ratio of the number of temperature fields of conventional data table j to the total number of fields in the table;
level(j) is the dependence level of conventional data table j in the data common layer.
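The definition of hot_ratio(j) in claim 33 can be illustrated with a minimal Python sketch. It covers only the ratio itself, not the scanning cost calculation factor, whose formula is not reproduced in this text. The function name, the argument names, and the sample field names are hypothetical and chosen for illustration only.

```python
def hot_ratio(field_use_counts, direct_downstream_count):
    """Share of fields in a conventional data table that are temperature fields.

    field_use_counts: mapping of field name -> number of uses in the period
        (hypothetical input shape, not taken from the patent).
    direct_downstream_count: number of direct downstream data tables.
    """
    if not field_use_counts:
        return 0.0
    # A temperature field is used more often than the table has
    # direct downstream data tables.
    hot_fields = [
        name
        for name, uses in field_use_counts.items()
        if uses > direct_downstream_count
    ]
    return len(hot_fields) / len(field_use_counts)


# Made-up usage counts for a table with 4 direct downstream tables.
usage = {"user_id": 12, "order_id": 9, "remark": 1, "ext_info": 0}
ratio = hot_ratio(usage, direct_downstream_count=4)
```

In this made-up example two of the four fields exceed the downstream count, so half of the fields would count as temperature fields.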
34. The device according to claim 31, 32 or 33, characterized in that the use cost data of the external data table is calculated from the use cost characteristic parameters by the following formula:
cost(i, j) = compcost(j) * compfac(i, j) + storcost(j) * storfac(i, j) + scancost(j) * scanfac(i, j)
Wherein, i is an external data table, j is a conventional data table, and a dependency relationship exists between data table i and data table j;
cost(i, j) is the use cost data of external data table i using conventional data table j;
compcost(j) is the first calculating cost parameter in the processing cost data of conventional data table j;
compfac(i, j) is the calculating cost calculation factor between external data table i and conventional data table j;
storcost(j) is the first carrying cost parameter in the processing cost data of conventional data table j;
storfac(i, j) is the carrying cost calculation factor between external data table i and conventional data table j;
scancost(j) is the first scanning cost parameter in the processing cost data of conventional data table j;
scanfac(i, j) is the scanning cost calculation factor between external data table i and conventional data table j.
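The formula of claim 34 can be illustrated with a short Python sketch. It assumes that "correcting" a first parameter with its factor means multiplying the two, which is what the formula implies. The class names, function names, and numeric values are hypothetical, and the three factors are taken as given inputs rather than derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingCost:
    """First cost parameters of conventional data table j."""
    compcost: float
    storcost: float
    scancost: float


@dataclass(frozen=True)
class Factors:
    """Calculation factors between external table i and conventional table j."""
    compfac: float
    storfac: float
    scanfac: float


def second_parameters(p, f):
    """Second cost parameters: each first parameter scaled by its factor."""
    return (p.compcost * f.compfac,
            p.storcost * f.storfac,
            p.scancost * f.scanfac)


def use_cost(p, f):
    """cost(i, j) as the sum of the three second cost parameters."""
    return sum(second_parameters(p, f))


# Made-up numbers: table i is attributed half of j's calculating cost, etc.
cost_ij = use_cost(ProcessingCost(100.0, 40.0, 20.0),
                   Factors(0.5, 0.25, 0.125))
```

With these made-up inputs the three terms are 50.0, 10.0 and 2.5, so cost(i, j) comes to 62.5.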
35. The device according to claim 20, 21, 22, 23, 25, 27, 28, 29, 30, 31, 32 or 33, characterized by further comprising:
a first extraction module, configured to extract the corresponding conventional data table when the processing cost data meets a first preset condition.
36. The device according to claim 35, characterized in that the first extraction module includes:
a first extraction submodule, configured to extract a conventional data table when the ratio of its first carrying cost parameter to its first calculating cost parameter is higher than a first preset threshold;
And/or,
a second extraction submodule, configured to extract a conventional data table when its first calculating cost parameter is higher than a second preset threshold;
And/or,
a third extraction submodule, configured to extract a conventional data table when the ratio of its first scanning cost parameter to its first calculating cost parameter is higher than a third preset threshold;
And/or,
a fourth statistics submodule, configured to compute the sum of the second calculating cost parameters of the external data tables that have a direct dependency relationship with a conventional data table;
a fourth extraction submodule, configured to extract the conventional data table when its first calculating cost parameter is greater than the sum of the second calculating cost parameters;
And/or,
a fifth statistics submodule, configured to compute the sum of the second carrying cost parameters of the external data tables that have a direct dependency relationship with a conventional data table;
a fifth extraction submodule, configured to extract the conventional data table when its first carrying cost parameter is greater than the sum of the second carrying cost parameters;
And/or,
a sixth statistics submodule, configured to compute the sum of the second scanning cost parameters of the external data tables that have a direct dependency relationship with a conventional data table;
a sixth extraction submodule, configured to extract the conventional data table when its first scanning cost parameter is greater than the sum of the second scanning cost parameters.
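The six extraction conditions of claim 36 can be sketched in Python as one function that reports which conditions a conventional data table meets. This is an illustrative sketch only: the dictionary keys, reason codes, threshold names, and sample numbers are hypothetical, and skipping the ratio checks when the calculating cost is zero is an added assumption.

```python
def flag_conventional_table(first, dependents, thresholds):
    """Return reason codes for extracting a conventional data table.

    first: first cost parameters, keys compcost / storcost / scancost.
    dependents: second cost parameters (same keys) of each external data
        table with a direct dependency on this conventional data table.
    thresholds: keys stor_to_comp, comp, scan_to_comp (first, second and
        third preset thresholds).
    """
    reasons = []
    comp = first["compcost"]
    # Conditions 1 to 3: ratio and absolute threshold checks.
    if comp > 0 and first["storcost"] / comp > thresholds["stor_to_comp"]:
        reasons.append("stor_to_comp")
    if comp > thresholds["comp"]:
        reasons.append("comp")
    if comp > 0 and first["scancost"] / comp > thresholds["scan_to_comp"]:
        reasons.append("scan_to_comp")
    # Conditions 4 to 6: a first parameter exceeds the summed second
    # parameters of the direct dependents.
    for key in ("compcost", "storcost", "scancost"):
        if first[key] > sum(d[key] for d in dependents):
            reasons.append(key + "_gt_downstream")
    return reasons


reasons = flag_conventional_table(
    first={"compcost": 100.0, "storcost": 300.0, "scancost": 50.0},
    dependents=[
        {"compcost": 30.0, "storcost": 200.0, "scancost": 10.0},
        {"compcost": 40.0, "storcost": 150.0, "scancost": 20.0},
    ],
    thresholds={"stor_to_comp": 2.0, "comp": 500.0, "scan_to_comp": 1.0},
)
```

Because the claim joins the submodules with "and/or", the sketch treats any non-empty result as grounds for extraction and leaves it to the caller to decide which conditions to enable.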
37. The device according to claim 20, 21, 22, 23, 25, 27, 28, 29, 30, 31, 32, 33 or 36, characterized by further comprising:
a second extraction module, configured to extract the corresponding external data table when the use cost data meets a second preset condition.
38. The device according to claim 37, characterized in that the second extraction module includes:
a seventh extraction submodule, configured to extract an external data table when the ratio of its second carrying cost parameter to its second calculating cost parameter is higher than a fourth preset threshold;
And/or,
an eighth extraction submodule, configured to extract an external data table when the external data table can obtain, from another conventional data table, the same data as from the current conventional data table, and the second scanning cost parameter when obtaining the data from the other conventional data table is lower than the second scanning cost parameter when obtaining the data from the current conventional data table.
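The two conditions of claim 38 can be sketched in Python as follows. The function name, argument names, reason codes, and table names are hypothetical. The sketch assumes the caller already knows which other conventional data tables hold the same data and what the second scanning cost parameter would be for each.

```python
def flag_external_table(stor2, comp2, fourth_threshold,
                        scan2_current, scan2_alternatives):
    """Return reason codes for extracting an external data table.

    stor2, comp2: second carrying and calculating cost parameters.
    scan2_current: second scanning cost parameter when reading from the
        current conventional data table.
    scan2_alternatives: mapping of other conventional data table name ->
        second scanning cost parameter when reading the same data there.
    """
    reasons = []
    # Seventh submodule: carrying cost is high relative to calculating cost.
    if comp2 > 0 and stor2 / comp2 > fourth_threshold:
        reasons.append("stor_to_comp")
    # Eighth submodule: the same data is cheaper to scan from another table.
    cheaper = {name: cost for name, cost in scan2_alternatives.items()
               if cost < scan2_current}
    if cheaper:
        best = min(cheaper, key=cheaper.get)
        reasons.append("cheaper_source:" + best)
    return reasons


external_reasons = flag_external_table(
    stor2=10.0, comp2=20.0, fourth_threshold=1.0,
    scan2_current=8.0,
    scan2_alternatives={"dim_user_v2": 5.0, "dim_user_full": 9.0},
)
```

Reporting the cheapest alternative alongside the flag is a design choice of this sketch; the claim itself only requires that a cheaper source exists.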
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610042109.0A CN106991101B (en) | 2016-01-21 | 2016-01-21 | Data table analysis processing method and device |
PCT/CN2017/070977 WO2017124959A1 (en) | 2016-01-21 | 2017-01-12 | Method and device for use in analyzing data table |
EP17740990.1A EP3407212A4 (en) | 2016-01-21 | 2017-01-12 | Method and device for use in analyzing data table |
TW106101915A TW201732641A (en) | 2016-01-21 | 2017-01-19 | Method and device for use in analyzing data table |
US16/041,336 US10909481B2 (en) | 2016-01-21 | 2018-07-20 | Method and apparatus for analyzing data table |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610042109.0A CN106991101B (en) | 2016-01-21 | 2016-01-21 | Data table analysis processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106991101A (en) | 2017-07-28 |
CN106991101B (en) | 2021-02-02 |
Family
ID=59361344
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610042109.0A Active CN106991101B (en) | 2016-01-21 | 2016-01-21 | Data table analysis processing method and device |
Country Status (5)
Country | Link |
---|---|
US (1) | US10909481B2 (en) |
EP (1) | EP3407212A4 (en) |
CN (1) | CN106991101B (en) |
TW (1) | TW201732641A (en) |
WO (1) | WO2017124959A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110457329B (en) * | 2019-08-16 | 2022-05-06 | 第四范式(北京)技术有限公司 | Method and device for realizing personalized recommendation |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5995958A (en) * | 1997-03-04 | 1999-11-30 | Xu; Kevin Houzhi | System and method for storing and managing functions |
US7260563B1 (en) * | 2003-10-08 | 2007-08-21 | Ncr Corp. | Efficient costing for inclusion merge join |
US8280876B2 (en) * | 2007-05-11 | 2012-10-02 | Nec Corporation | System, method, and program product for database restructuring support |
CN100483395C (en) * | 2007-05-25 | 2009-04-29 | 金蝶软件(中国)有限公司 | Electronic data table calculation chain generation method and device |
US9020910B2 (en) * | 2010-01-13 | 2015-04-28 | International Business Machines Corporation | Storing tables in a database system |
CN102436494B (en) * | 2011-11-11 | 2013-05-01 | 中国工商银行股份有限公司 | Device and method for optimizing execution plan and based on practice testing |
US9171158B2 (en) * | 2011-12-12 | 2015-10-27 | International Business Machines Corporation | Dynamic anomaly, association and clustering detection |
US10019478B2 (en) * | 2013-09-05 | 2018-07-10 | Futurewei Technologies, Inc. | Mechanism for optimizing parallel execution of queries on symmetric resources |
- 2016-01-21: CN application CN201610042109.0A (CN106991101B), status: Active
- 2017-01-12: EP application EP17740990.1A (EP3407212A4), status: Withdrawn
- 2017-01-12: PCT application PCT/CN2017/070977 (WO2017124959A1), status: Application Filing
- 2017-01-19: TW application TW106101915A (TW201732641A), status: unknown
- 2018-07-20: US application US16/041,336 (US10909481B2), status: Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060253473A1 (en) * | 2005-05-06 | 2006-11-09 | Microsoft Corporation | Integrating vertical partitioning into physical database design |
US20130031064A1 (en) * | 2009-12-22 | 2013-01-31 | At&T Intellectual Property I, L.P. | Compressing Massive Relational Data |
CN104899209A (en) * | 2014-03-05 | 2015-09-09 | 阿里巴巴集团控股有限公司 | Optimization method and device for open type data processing service |
US20150347473A1 (en) * | 2014-05-29 | 2015-12-03 | International Business Machines Corporation | Database partition |
CN105224536A (en) * | 2014-05-29 | 2016-01-06 | 国际商业机器公司 | The method and apparatus of partition database |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110517009A (en) * | 2019-07-29 | 2019-11-29 | 阿里巴巴集团控股有限公司 | Real-time common layer building method, device and server |
CN110517009B (en) * | 2019-07-29 | 2023-01-24 | 创新先进技术有限公司 | Real-time public layer construction method and device and server |
WO2021174945A1 (en) * | 2020-10-21 | 2021-09-10 | 平安科技(深圳)有限公司 | Data cost calculation method, system, computer device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2017124959A1 (en) | 2017-07-27 |
TW201732641A (en) | 2017-09-16 |
US10909481B2 (en) | 2021-02-02 |
EP3407212A4 (en) | 2019-06-19 |
US20180349811A1 (en) | 2018-12-06 |
CN106991101B (en) | 2021-02-02 |
EP3407212A1 (en) | 2018-11-28 |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20211110. Address after: Room 507, floor 5, building 3, No. 969, Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province. Patentee after: Zhejiang tmall Technology Co., Ltd. Address before: P.O. Box 847, 4th floor, Grand Cayman capital building, British Cayman Islands. Patentee before: Alibaba Group Holdings Limited |