CN110569313A

CN110569313A - Method and device for judging grade of model table of data warehouse

Info

Publication number: CN110569313A
Application number: CN201810475388.9A
Authority: CN
Inventors: 李建星
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2018-05-17
Filing date: 2018-05-17
Publication date: 2019-12-13
Anticipated expiration: 2038-05-17
Also published as: CN110569313B

Abstract

The embodiment of the invention provides a method and a device for judging the level of a model table of a data warehouse, electronic equipment and a storage medium, and relates to the technical field of databases. The method comprises the following steps: acquiring index data of a plurality of preset indexes of each model table in the data warehouse; dividing a plurality of model table levels, establishing model table sets corresponding to the model table levels and distributing a model table to each model table set to serve as an initial element; performing clustering operation on each model table according to each initial element, and distributing each model table to the corresponding model table set according to the result of the clustering operation; and distributing corresponding level identifications for each model table according to the distribution result of the model tables. The technical scheme of the embodiment of the invention can realize the automatic grading of the model table.

Description

Method and device for judging grade of model table of data warehouse

Technical Field

The present invention relates to the field of database technologies, and in particular, to a method and an apparatus for determining a model table level of a data warehouse, an electronic device, and a computer-readable storage medium.

Background

At present, most enterprises build a data warehouse to serve the various data needs of the business, such as daily data analysis, reporting, and data mining. The core of building a data warehouse is to construct a set of data models based on the company's business; using established modeling methods and theory, these data models turn the data of different business links into final model tables, which then support services such as data query, analysis, retrieval, and data mining.

The models of current data warehouses are generally divided into a plurality of layers according to processing sequence and granularity, but the importance of the model tables is not graded, even though the model table level is a very valuable attribute in actual work. For example, in the system maintenance of a data warehouse, the model tables need to be backed up, and measures such as priority guarantee, focused monitoring, and full backup should be taken for model tables with a higher level. For another example, when allocating resources in data warehouse operation, more resources, such as compute node resources, storage resources, and network access concurrency resources, should be allocated to model tables with a higher level. In general, by reasonably grading the model tables, more effective management methods can be applied to each level, thereby comprehensively improving the stability and use value of the system.

Therefore, how to determine the level of a model table in a data warehouse has become a technical problem in urgent need of a solution.

It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the invention and therefore may include information that does not constitute prior art that is already known to a person of ordinary skill in the art.

Disclosure of Invention

An object of embodiments of the present invention is to provide a method for determining a model table level of a data warehouse, an apparatus for determining a model table level of a data warehouse, an electronic device, and a computer-readable storage medium, which overcome one or more problems due to limitations and disadvantages of the related art, at least to some extent.

According to one aspect of the present disclosure, there is provided a method for determining a model table level of a data warehouse, including:

Acquiring index data of a plurality of preset indexes of each model table in the data warehouse;

Dividing a plurality of model table levels, establishing model table sets corresponding to the model table levels and distributing a model table to each model table set to serve as an initial element;

Performing clustering operation on each model table according to each initial element, and distributing each model table to the corresponding model table set according to the result of the clustering operation;

And distributing corresponding level identifications for each model table according to the distribution result of the model tables.

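In an exemplary embodiment of the present disclosure, the preset indexes include several of the following: the number of fields, the number of records, the number of tasks, the number of source tables, the number of queries, the number of downloads, the script running time, the number of updates, and the number of comments.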

In an exemplary embodiment of the present disclosure, assigning a model table as an initial element to each of the model table sets includes:

For each model table, calculating the sum of all the index data of the model table as first judgment data of the model table;

Sorting each model table according to each first judgment data;

And selecting the model tables distributed to each model table set in a preset order as initial elements according to the sequencing result of each model.

In an exemplary embodiment of the present disclosure, clustering each of the model tables according to each of the initial elements includes:

For each model table, calculating the distance between the model table and the vector centroid of each model table set;

Assigning the model table to a model table set if the distance between the model table and the vector centroid of that model table set is minimal;

And calculating the vector centroid of the model table set according to the index data of all the model tables in the model table set.

According to an aspect of the present disclosure, there is provided a model table level determination apparatus of a data warehouse, including:

The data acquisition module is used for acquiring index data of a plurality of preset indexes of each model table in the data warehouse;

The model table set initialization module is used for establishing model table sets corresponding to the model table levels and distributing a model table to each model table set to serve as an initial element;

The clustering operation module is used for carrying out clustering operation on each model table according to each initial element and distributing each model table to the corresponding model table set according to the result of the clustering operation;

And the grade output module is used for distributing corresponding grade identifications to the model tables according to the distribution results of the model tables.

According to an aspect of the present disclosure, there is provided an electronic device including:

A processor; and

a memory having computer readable instructions stored thereon which, when executed by the processor, implement a method of model table level determination for a data warehouse as in any above.

According to a fourth aspect of embodiments of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the model table level determination method for a data warehouse as described in the first aspect above.

In the technical solutions provided by some embodiments of the present invention, a clustering operation is performed on the model tables based on collected index data of a plurality of indexes of each model table in the data warehouse, and the level of each model table is determined automatically according to the result of the clustering operation. Compared with the prior art, the level is judged from the indexes of the model tables themselves, so the result is more objective and accurate, and by selecting different indexes the method can cover all model tables of a data warehouse, which gives it good universality. More specifically, first, the technical solution avoids the arbitrariness and low recognition rate of judgment by manual experience, gives the model table level criterion good business interpretability, and reduces the disputes that arise from manual judgment. Second, the technical solution is broadly applicable: it can be used to judge the model table levels of all data warehouses, and it reduces the system management blind spots and operation and maintenance risks caused by model tables that cannot be identified manually. Third, the technical solution grades model tables automatically, so levels are available for daily management rather than being assessed only after a problem has occurred, as happens with manual judgment; this helps avoid risks in the daily management and operation and maintenance of the data warehouse and improves the stability and application value of the data warehouse system.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:

FIG. 1 illustrates a model hierarchy architecture diagram of a data warehouse in accordance with one aspect;

FIG. 2 illustrates a flow diagram of a method for model table level determination of a data warehouse, in accordance with some embodiments of the invention;

FIG. 3 illustrates a flow diagram of a method for model table level determination of a data warehouse, in accordance with some embodiments of the invention;

FIG. 4 shows a schematic block diagram of a model table level decision apparatus of a data warehouse in accordance with an exemplary embodiment of the present invention;

FIG. 5 illustrates a schematic structural diagram of a computer system suitable for use with the electronic device to implement an embodiment of the invention.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, and thus, a repetitive description thereof will be omitted.

Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations or operations have not been shown or described in detail to avoid obscuring aspects of the invention.

The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.

The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.

In prior-art data warehouses, the grading of model tables is generally not considered, and the relative importance of a model table is mostly evaluated by manual experience. For example, the order summary wide table and the order detail wide table are business model tables of an e-commerce system centered on orders, so they are manually judged to be important model tables with a high level, and they in turn receive more attention in routine system maintenance and operation and maintenance management. For other model tables, however, there is no uniform rule or method for determining the model table level. For example:

FIG. 1 is a schematic diagram of a model hierarchy architecture of a data warehouse. As can be seen from FIG. 1, the data warehouse only layers the data flow of model table processing: from left to right are layer 1, layer 2, layer 3, and layer 4, and data basically flows from layer 1 to layer 4 (the specific number of layers can be set differently according to enterprise requirements). However, no model table levels are distinguished for the model tables within each layer; that is, the importance of a given model table in the data warehouse generally cannot be judged, and no method for evaluating model table levels is provided.

In existing technical schemes, the level of a model table is judged by manual experience, a level label is set for the model table accordingly, and the daily management and operation and maintenance of the data warehouse are then carried out according to the level labels. In actual work, judgment by manual experience is sometimes based on whether the model table serves a core business, and sometimes on whether the number of tasks using the model table satisfies some rule; in short, there is no scientific and reasonable judgment method that fits actual operation. Therefore, the existing approach of determining the model table level by manual experience has the following disadvantages:

1) Because the manual method for judging the model table level follows no reasonable rule, the identification rate is low: only a small portion of the model tables can be judged, and the results are highly disputed because of human factors.

2) Some special model tables cannot be identified, for example, model tables that are not core business model tables but are heavily used; such important model tables go unrecognized, which increases system management blind spots and operation and maintenance risks.

3) Manual judgment is mostly made after the fact; that is, the level of a model table is judged and analyzed only after a problem has occurred in its use and has had some impact, so the approach is passive and of limited practical effect.

Therefore, the prior art offers no effective method for determining the model table level, which creates potential risks for database management and operation and maintenance.

Based on the above, in the exemplary embodiment of the present invention, a method for determining a model table level of a data warehouse is first proposed. The method may be executed by a server or other electronic devices, which is not particularly limited in this exemplary embodiment. Referring to fig. 2, the method for determining the model table level of the data warehouse may include the following steps:

Step S210, acquiring index data of a plurality of preset indexes of each model table in the data warehouse;

Step S220, dividing a plurality of model table levels, establishing model table sets corresponding to the model table levels and distributing a model table to each model table set to serve as an initial element;

Step S230, performing clustering operation on each model table according to each initial element, and distributing each model table to the corresponding model table set according to the result of the clustering operation;

Step S240, distributing corresponding level identifications to each model table according to the distribution result of the model tables.

Compared with the prior art, the method for judging the model table level of the data warehouse in the example embodiment of FIG. 2 judges the level from the indexes of the model tables themselves, so the result is more objective and accurate, and by selecting different indexes the method can cover all model tables of a data warehouse, which gives it good universality. More specifically, first, the technical solution avoids the arbitrariness and low recognition rate of judgment by manual experience, gives the model table level criterion good business interpretability, and reduces the disputes that arise from manual judgment. Second, the technical solution is broadly applicable: it can be used to judge the model table levels of all data warehouses, and it reduces the system management blind spots and operation and maintenance risks caused by model tables that cannot be identified manually. Third, the technical solution grades model tables automatically, so levels are available for daily management rather than being assessed only after a problem has occurred, as happens with manual judgment; this helps avoid risks in the daily management and operation and maintenance of the data warehouse and improves the stability and application value of the data warehouse system.

The model table level decision method of the data warehouse in the exemplary embodiment of fig. 2 is described in detail below with reference to fig. 3.

In step S210, index data of a plurality of preset indexes of each model table in the data warehouse is collected.

In an example embodiment, the data warehouse includes all of the model tables, and the model tables may be hierarchically designed and managed. The data warehouse can design, generate, use, and manage the model tables by methods such as Extract-Transform-Load (ETL) and metadata management. Each model table may have multiple indexes describing it. For example, in this exemplary embodiment, the preset indexes may include indexes related to attributes of the model table, indexes related to associated tasks, indexes related to source-dependent data, indexes related to usage, and the like. Specifically, the preset indexes may include several of the following: the number of fields, the number of records, the number of tasks, the number of source tables, the number of queries in a statistical period, the number of downloads in a statistical period, the script running time, the number of updates in a statistical period, the number of users in a statistical period, the number of user comments, and the like, which is not limited in the exemplary embodiment. The statistical period may be a week, a month, a year, etc., and this is not particularly limited in the present exemplary embodiment.

After the index data of a plurality of preset indexes of each model table in the data warehouse are collected, the index data can be integrated into a model judgment condition table to facilitate storage and query operations. For example, the fields of the model judgment condition table may include: model table name, number of fields (M1), number of records (ten) (M2), number of tasks (M3), number of source tables (M4), monthly query count (M5), monthly download count (M6), script running time (minutes) (M7), monthly update count (M8), number of monthly users (M9), and number of user comments (M10); the structure of the model judgment condition table is as shown in Table 1:

Table 1

Model table name	M1	M2	M3	M4	M5	M6	M7	M8	M9	M10
table1	25	305	4	8	245	80	34	0	42	10
table2	71	205	3	3	525	46	14	1	33	6
table3	20	85	4	4	352	35	41	0	21	9
table4	16	33	5	5	109	11	32	0	45	11
table5	34	190	6	5	221	13	23	0	63	4
table6	31	234	7	3	176	14	11	0	51	6
table7	29	53	2	6	240	17	21	0	35	14
table8	19	43	4	4	460	37	51	0	25	34
table9	39	73	5	4	405	22	31	0	37	12
table10	22	66	5	6	266	44	22	0	53	21
……	……	……	……	……	……	……	……	……	……	……
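
As a minimal sketch of how the model judgment condition table might be held in memory, the following Python snippet stores the ten sample rows of Table 1 as a mapping from model table name to a 10-component index vector. The names `INDEX_NAMES`, `MODEL_TABLES`, and `index_value` are illustrative assumptions and do not appear in the original disclosure.

```python
# Hypothetical in-memory form of the model judgment condition table (Table 1).
# Each entry maps a model table name to its index data M1..M10, in order.
INDEX_NAMES = ["M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8", "M9", "M10"]

MODEL_TABLES = {
    "table1": [25, 305, 4, 8, 245, 80, 34, 0, 42, 10],
    "table2": [71, 205, 3, 3, 525, 46, 14, 1, 33, 6],
    "table3": [20, 85, 4, 4, 352, 35, 41, 0, 21, 9],
    "table4": [16, 33, 5, 5, 109, 11, 32, 0, 45, 11],
    "table5": [34, 190, 6, 5, 221, 13, 23, 0, 63, 4],
    "table6": [31, 234, 7, 3, 176, 14, 11, 0, 51, 6],
    "table7": [29, 53, 2, 6, 240, 17, 21, 0, 35, 14],
    "table8": [19, 43, 4, 4, 460, 37, 51, 0, 25, 34],
    "table9": [39, 73, 5, 4, 405, 22, 31, 0, 37, 12],
    "table10": [22, 66, 5, 6, 266, 44, 22, 0, 53, 21],
}


def index_value(table_name, index_name):
    """Look up one index value, e.g. the monthly query count M5 of table2."""
    return MODEL_TABLES[table_name][INDEX_NAMES.index(index_name)]
```

Keeping every row as a fixed-order vector means the same structure can later be used directly as a point in the 10-dimensional space in which the clustering operates.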

In step S220, a plurality of model table levels are divided, and a model table set corresponding to the model table levels is established and a model table is allocated to each model table set as an initial element.

In the present exemplary embodiment, a plurality of model table levels may be divided according to requirements; for example, 2, 3, 4, or 5 model table levels, and so on. Taking 4 model table levels as an example, all the model tables can be divided into four levels A, B, C, and D, which respectively correspond to model importance of "ultra high", "high", "medium", and "low". Correspondingly, different model table levels can correspond to different daily management measures. Specifically, as shown in Table 2:

Table 2

Model table level	Importance of model	Daily management measures
A	Ultra high	Resource priority, 7 × 24 monitoring, advanced disaster recovery, and a designated person responsible
B	High	Resource priority, 7 × 24 monitoring, normal disaster recovery, and a designated person responsible
C	Medium	7 × 24 monitoring, normal disaster recovery, and duty cycle
D	Low	7 × 24 monitoring and normal disaster recovery

After the model table levels are divided, a model table set corresponding to each model table level can be established. For example, the model table set corresponding to model table level A is set A; the model table set corresponding to model table level B is set B; the model table set corresponding to model table level C is set C; and the model table set corresponding to model table level D is set D.

Referring to FIG. 3, in this exemplary embodiment, allocating a model table as an initial element to each model table set may include the following steps:

Step S310, for each model table, calculating the sum of all the index data of the model table as the first judgment data of the model table. Taking the model judgment condition table shown in Table 1 as an example, the first judgment data $x_i$ of each model table may be calculated as follows:

$$x_i = \sum_{j=1}^{10} M_j^{(i)} = M_1^{(i)} + M_2^{(i)} + \cdots + M_{10}^{(i)}$$

wherein $M_j^{(i)}$ denotes the value of index $M_j$ for the $i$-th model table; $i = 1, 2, 3, \ldots, n$ indexes the model tables; and $n$ represents the total number of model tables. After the calculation is finished, the set $X = \{x_i\}$ is obtained.

Of course, in other exemplary embodiments of the present disclosure, the first determination data may be calculated in other manners; for example, weighted summation is performed on all the index data, or other processing such as multiplication and exponentiation is performed to obtain first judgment data; these too are within the scope of the present disclosure.
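
The plain sum and the weighted-sum variant mentioned above can be sketched as follows. This is only an illustration: the function name `first_judgment` and the example weights are assumptions, not values given in the disclosure.

```python
def first_judgment(indexes, weights=None):
    """Return the first judgment data of one model table.

    With no weights this is the plain sum of the index data; with weights
    it is the weighted sum mentioned as an alternative.
    """
    if weights is None:
        return sum(indexes)
    if len(weights) != len(indexes):
        raise ValueError("weights and indexes must have the same length")
    return sum(w * v for w, v in zip(weights, indexes))


table1 = [25, 305, 4, 8, 245, 80, 34, 0, 42, 10]  # row of Table 1

x1 = first_judgment(table1)  # plain sum of M1..M10
# Hypothetical weights: every index counts once, user comments (M10) twice.
x1_weighted = first_judgment(table1, [1] * 9 + [2])
```

For table1 the plain sum is 753, which agrees with the value listed for it in Table 3.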

Step S320, sorting each model table according to each first judgment data.

For example, the first judgment data corresponding to the model tables table1, table2, table3, table4, table5, table6, table7, table8, table9 and table10 in Table 1 above are 753, 907, 571, 267, 559, 533, 417, 677, 628 and 505, respectively; the corresponding ranks are 2, 1, 5, 10, 6, 7, 9, 3, 4, 8. Specifically, as shown in Table 3:

Table 3

Model table name	M1	M2	……	M10	X	Value of x_i	Rank
table1	25	305	……	10	x₁	753	2
table2	71	205	……	6	x₂	907	1
table3	20	85	……	9	x₃	571	5
table4	16	33	……	11	x₄	267	10
table5	34	190	……	4	x₅	559	6
table6	31	234	……	6	x₆	533	7
table7	29	53	……	14	x₇	417	9
table8	19	43	……	34	x₈	677	3
table9	39	73	……	12	x₉	628	4
table10	22	66	……	21	x₁₀	505	8
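
The ranking step can be sketched in Python as below, using the first judgment data listed in Table 3. The helper name `rank_tables` is an assumption; the sketch also assumes that rank 1 is the largest value and that ties, which do not occur in the sample data, keep their input order.

```python
# First judgment data x_i of the ten sample model tables (from Table 3).
X = {
    "table1": 753, "table2": 907, "table3": 571, "table4": 267, "table5": 559,
    "table6": 533, "table7": 417, "table8": 677, "table9": 628, "table10": 505,
}


def rank_tables(scores):
    """Return {name: rank}, where rank 1 has the largest first judgment data."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


ranks = rank_tables(X)
```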

Step S330, selecting, in a preset order according to the sorting result of the model tables, the model table assigned to each model table set as its initial element. For example:

For set A, corresponding to model table level A, the initial element in set A is denoted as $a_1$. In the present exemplary embodiment, $a_1 = \arg\max(x_i \in X)$ may be taken; that is, the model table corresponding to the $x_i$ with the maximum value in the set X is taken as the initial element.

For set B, corresponding to model table level B, the initial element in set B is denoted as $b_1$. In the present exemplary embodiment, $b_1 = \operatorname{argtop}_{30\%}(x_i \in X)$ may be taken; that is, the model table corresponding to the $x_i$ ranked at the 30% position in the set X is taken as the initial element.

For set C, corresponding to model table level C, the initial element in set C is denoted as $c_1$. In the present exemplary embodiment, $c_1 = \operatorname{argtop}_{70\%}(x_i \in X)$ may be taken; that is, the model table corresponding to the $x_i$ ranked at the 70% position in the set X is taken as the initial element.

For set D, corresponding to model table level D, the initial element in set D is denoted as $d_1$. In this example embodiment, $d_1 = \arg\min(x_i \in X)$ may be taken; that is, the model table corresponding to the $x_i$ with the minimum value in the set X is taken as the initial element.

Taking the data in Table 3 as an example, after the first judgment data $x_i$ of each model table is calculated and ranked, the following can be obtained:

For set A, the initial element is the model table at rank 1, i.e., the model table corresponding to the maximum value in set X, so $a_1 = \text{table2}$. Accordingly, the initial set is $A = \{a_1 = \text{table2}\}$.

For set B, the initial element is the model table whose $x_i$ is ranked at the 30% position in set X, i.e., rank 3, so $b_1 = \text{table8}$. Accordingly, the initial set is $B = \{b_1 = \text{table8}\}$.

For set C, the initial element is the model table whose $x_i$ is ranked at the 70% position in set X, i.e., rank 7, so $c_1 = \text{table6}$. Accordingly, the initial set is $C = \{c_1 = \text{table6}\}$.

For set D, the initial element is the model table at rank 10, i.e., the model table corresponding to the minimum value in set X, so $d_1 = \text{table4}$. Accordingly, the initial set is $D = \{d_1 = \text{table4}\}$.
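
The selection of initial elements for the four-level example can be sketched as follows. The function name `pick_seeds` is hypothetical, and the rule for turning a percentage into a rank (integer rounding of the percentage of n) is an assumption: the text only gives the n = 10 case, where 30% and 70% map to ranks 3 and 7.

```python
# First judgment data x_i of the ten sample model tables (from Table 3).
X = {
    "table1": 753, "table2": 907, "table3": 571, "table4": 267, "table5": 559,
    "table6": 533, "table7": 417, "table8": 677, "table9": 628, "table10": 505,
}


def pick_seeds(scores, percents=(30, 70)):
    """Pick one initial element per level, from highest level to lowest.

    The first seed is the top-ranked table, the last seed is the
    bottom-ranked table, and each percentage in `percents` selects the table
    at that relative rank position.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)  # rank 1 first
    n = len(ordered)
    seeds = [ordered[0]]
    for pct in percents:
        # Assumed rounding rule: nearest integer rank, at least rank 1.
        rank = max(1, (pct * n + 50) // 100)
        seeds.append(ordered[rank - 1])
    seeds.append(ordered[-1])
    return seeds


a1, b1, c1, d1 = pick_seeds(X)
```

On the sample data this reproduces the initial elements of the worked example: table2, table8, table6, and table4 for sets A, B, C, and D.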

In step S230, a clustering operation is performed on each model table according to each initial element, and each model table is allocated to the corresponding model table set according to a result of the clustering operation.

In this example embodiment, the clustering operation may be performed in conjunction with computing the vector centroids of the set of model tables. For example, referring to fig. 3, performing a clustering operation on each model table according to each initial element may include the following steps:

Step S340, for each model table, calculating the distance between the model table and the vector centroid of each model table set.

In this exemplary embodiment, for each model table set, the vector centroid thereof may be calculated according to the index data of all model tables in the model table set.

For example, assume that the number of model tables in set A corresponding to level A is o, the number of model tables in set B corresponding to level B is p, the number of model tables in set C corresponding to level C is k, and the number of model tables in set D corresponding to level D is m. In each model table set, each model table element is a 10-dimensional vector whose coordinate values are, in turn, the fields M1, M2, …, M10 of the model judgment condition table in Table 1 above. The general representations of set A, set B, set C, and set D are therefore as follows:

$$A = \{a_1, a_2, \ldots, a_o\}, \quad a_i \in \mathbb{R}^n \ (i = 1, 2, \ldots, o)$$

$$B = \{b_1, b_2, \ldots, b_p\}, \quad b_i \in \mathbb{R}^n \ (i = 1, 2, \ldots, p)$$

$$C = \{c_1, c_2, \ldots, c_k\}, \quad c_i \in \mathbb{R}^n \ (i = 1, 2, \ldots, k)$$

$$D = \{d_1, d_2, \ldots, d_m\}, \quad d_i \in \mathbb{R}^n \ (i = 1, 2, \ldots, m)$$

wherein $n = 10$ and $\mathbb{R}^n$ denotes the 10-dimensional vector space. Of course, if the model judgment condition table includes a different number of fields, the dimension of each model table element changes accordingly, which is not particularly limited in this exemplary embodiment.

After the general representations of set A, set B, set C, and set D are obtained, the vector centroids $\mu_a$, $\mu_b$, $\mu_c$, $\mu_d$ of set A, set B, set C, and set D can be calculated by the following formulas:

$$\mu_a = \frac{1}{o}\sum_{i=1}^{o} a_i, \quad \mu_b = \frac{1}{p}\sum_{i=1}^{p} b_i, \quad \mu_c = \frac{1}{k}\sum_{i=1}^{k} c_i, \quad \mu_d = \frac{1}{m}\sum_{i=1}^{m} d_i$$

That is, in the present exemplary embodiment, the vector centroid of a model table set is calculated as the average of the vectors corresponding to all the elements in the set, and the resulting $\mu_a$, $\mu_b$, $\mu_c$, $\mu_d$ are all 10-dimensional vectors. However, it is easily understood by those skilled in the art that in other exemplary embodiments of the present disclosure, the centroid of the model table set may be calculated in other manners, and the present exemplary embodiment is not limited thereto.
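
A minimal sketch of the centroid computation, taking the component-wise mean of the vectors in a set. The function name `centroid` is an assumption, and the membership of the example set (table8 together with table9) is hypothetical, chosen only to have two vectors to average.

```python
def centroid(vectors):
    """Return the component-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("a model table set must contain at least one element")
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


# Hypothetical set B holding two rows of Table 1.
set_b = [
    [19, 43, 4, 4, 460, 37, 51, 0, 25, 34],  # table8
    [39, 73, 5, 4, 405, 22, 31, 0, 37, 12],  # table9
]
mu_b = centroid(set_b)
```

The centroid of a set with a single element is that element itself, which is why the initial elements can serve directly as the starting centroids.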

After the vector centroids of the model table sets are calculated, for each model table, the vector $N$ of the model table may be taken, and the distances $Dis_a$, $Dis_b$, $Dis_c$, $Dis_d$ between the vector $N$ and the vector centroids $\mu_a$, $\mu_b$, $\mu_c$, $\mu_d$ of set A, set B, set C, and set D are calculated. For example:

$$Dis_a = \lVert N - \mu_a \rVert^2$$

$$Dis_b = \lVert N - \mu_b \rVert^2$$

$$Dis_c = \lVert N - \mu_c \rVert^2$$

$$Dis_d = \lVert N - \mu_d \rVert^2$$

wherein $\lVert X - Y \rVert$ is the square root of the sum of the squares of the components of the difference between the two vectors.

Note that, in the present exemplary embodiment, the euclidean distance is calculated, but in other exemplary embodiments of the present disclosure, a mahalanobis distance, a cosine distance, a manhattan distance, or the like may also be calculated; these too are within the scope of the present disclosure.
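
The distance measures named above can be sketched as plain Python functions. The function names are assumptions; the squared Euclidean form follows the formulas for Dis_a to Dis_d, and the Manhattan distance is shown as one of the alternatives mentioned. Squaring does not change which centroid is nearest, so either Euclidean form yields the same assignment.

```python
import math


def squared_euclidean(u, v):
    """Sum of squared component differences, as in Dis = ||N - mu||^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def euclidean(u, v):
    """Square root of the sum of squared component differences."""
    return math.sqrt(squared_euclidean(u, v))


def manhattan(u, v):
    """Sum of absolute component differences (one alternative distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


table1 = [25, 305, 4, 8, 245, 80, 34, 0, 42, 10]  # rows of Table 1
table2 = [71, 205, 3, 3, 525, 46, 14, 1, 33, 6]

d_sq = squared_euclidean(table1, table2)
d_l1 = manhattan(table1, table2)
```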

Step S350, if the distance between the model table and the vector centroid of a model table set is the minimum among all the model table sets, the model table is assigned to that model table set. The minimum distance may be determined, for example, by:

$$\min(Dis_a, Dis_b, Dis_c, Dis_d)$$

For example, if the model table table1 described above has the smallest distance to the vector centroid of model table set A, table1 is assigned to model table set A; if the model table table2 has the smallest distance to the vector centroid of model table set D, table2 is assigned to model table set D; and so on.
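
The nearest-centroid assignment can be sketched as below. The function name `nearest_set` is an assumption, and for illustration the centroids are simply the four initial elements of the worked example (table2, table8, table6, table4); ties, if any, go to the first level in dictionary order.

```python
def nearest_set(vector, centroids):
    """Return the level whose centroid has the smallest squared Euclidean distance."""
    def dist(level):
        return sum((a - b) ** 2 for a, b in zip(vector, centroids[level]))
    return min(centroids, key=dist)


# Initial centroids: the initial elements of sets A, B, C, D (rows of Table 1).
centroids = {
    "A": [71, 205, 3, 3, 525, 46, 14, 1, 33, 6],    # table2
    "B": [19, 43, 4, 4, 460, 37, 51, 0, 25, 34],    # table8
    "C": [31, 234, 7, 3, 176, 14, 11, 0, 51, 6],    # table6
    "D": [16, 33, 5, 5, 109, 11, 32, 0, 45, 11],    # table4
}

table9 = [39, 73, 5, 4, 405, 22, 31, 0, 37, 12]
level = nearest_set(table9, centroids)
```

With these starting centroids, table9 lies closest to the centroid of set B (squared distance 5579, against 33771 for set A), so it would be assigned to set B.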

and S360, calculating the vector centroid of the model table set according to the index data of all the model tables in the model table set.

That is, after adding a new element to a model table set, its vector centroid needs to be recalculated. In this exemplary embodiment, the vector centroid can be recalculated by the method in step S340, and can also be calculated by the following formula:

if a new model table is added in the set A, the vector centroid of the set A is updated as follows:

o＝o+1

If a new model table is added in the set B, the vector centroid of the set B is updated as follows:

p＝p+1

If a new model table is added in the set C, the vector centroid of the set C is updated as follows:

k＝k+1

If a new model table is added in the set D, the vector centroid of the set D is updated as follows:

m＝m+1
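One common way to update a centroid without revisiting every element is a running mean. The sketch below assumes that form: the text here preserves only the count increments (o, p, k, m), so the weighting formula is an assumption rather than the formula of the embodiment, and the function name `add_to_centroid` is hypothetical.

```python
def add_to_centroid(mu, count, n):
    """Fold a new vector n into a centroid mu that averages `count` vectors.

    Assumed running-mean form: mu_new = (count * mu + n) / (count + 1).
    Returns the updated centroid and the incremented count.
    """
    if len(mu) != len(n):
        raise ValueError("vectors must have the same dimension")
    new_count = count + 1
    new_mu = [(m * count + x) / new_count for m, x in zip(mu, n)]
    return new_mu, new_count


# Set A holds one element with vector [1.0, 1.0]; a table with vector
# [3.0, 5.0] joins it (values are assumptions).
mu_a, o = add_to_centroid([1.0, 1.0], 1, [3.0, 5.0])
```

The result equals the plain average of all member vectors, which is what recomputing the centroid from scratch by the method of step S340 would give.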

Steps S340 to S360 are then iterated until every model table in the model judgment condition table has been judged, finally yielding the four sets A, B, C, and D, that is, the grading result for all model tables in the data warehouse.
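The iteration over steps S340 to S360 can be sketched as a single pass over the model tables. This is an illustration under assumptions, not the implementation of the embodiment: the function name `grade_tables`, the two-component vectors, the choice of only two sets, and the running-mean centroid update are all hypothetical.

```python
def grade_tables(vectors, seeds):
    """Assign every model table to the set with the nearest centroid.

    vectors: maps a table name to its index vector.
    seeds:   maps a set name to the table used as its initial element.
    Returns a mapping from set name to the list of member tables.
    """
    centroids = {s: list(vectors[t]) for s, t in seeds.items()}
    counts = {s: 1 for s in seeds}
    members = {s: [t] for s, t in seeds.items()}
    seeded = set(seeds.values())

    for table, n in vectors.items():
        if table in seeded:
            continue  # initial elements are already placed
        # S340: squared Euclidean distance to each vector centroid.
        dist = {
            s: sum((x - m) ** 2 for x, m in zip(n, mu))
            for s, mu in centroids.items()
        }
        # S350: pick the set whose centroid is nearest.
        best = min(dist, key=dist.get)
        members[best].append(table)
        # S360: update that set's centroid (assumed running mean).
        c = counts[best]
        centroids[best] = [
            (m * c + x) / (c + 1) for m, x in zip(centroids[best], n)
        ]
        counts[best] = c + 1
    return members


# Hypothetical index vectors; table1 and table2 seed sets A and D.
vectors = {
    "table1": [1.0, 1.0],
    "table2": [0.0, 0.0],
    "table3": [0.9, 0.8],
    "table4": [0.1, 0.2],
}
result = grade_tables(vectors, {"A": "table1", "D": "table2"})
```

In this sketch each table is visited once, so the outcome can depend on the order in which tables are processed.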

In step S240, a corresponding level identifier is assigned to each model table according to the model table assignment result.

In this exemplary embodiment, the level identifier may be a level tag, a level score, or the like. Taking the level tag as an example, the output result may be as shown in Table 4 below:

Table 4

Model table	Model table level
table1	A
table2	D
table3	C
table4	B
table5	B
table6	C
……	……
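Producing the level tags of step S240 from the assignment result is a matter of inverting the set membership. A minimal sketch, in which the function name `level_identifiers` is hypothetical and the membership data mirrors the six example rows of the table:

```python
def level_identifiers(members):
    """Map each model table to the name of the set it was assigned to."""
    return {
        table: level
        for level, tables in members.items()
        for table in tables
    }


# Set membership corresponding to the example rows of the table.
members = {
    "A": ["table1"],
    "B": ["table4", "table5"],
    "C": ["table3", "table6"],
    "D": ["table2"],
}
labels = level_identifiers(members)

# Tab-separated rows sorted by table name, matching the table layout.
rows = [f"{table}\t{labels[table]}" for table in sorted(labels)]
```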

Based on the obtained level judgment result, targeted daily management measures can be taken for model tables of different levels, which reduces risks in the daily management, operation, and maintenance of the data warehouse and improves the stability and application value of the data warehouse system.

In addition, in the embodiment of the invention, a model table level judgment device of the data warehouse is also provided. Referring to fig. 4, the model table level determination apparatus 400 of the data warehouse may include: a data acquisition module 410, a set initialization module 420, a clustering operation module 430, and a level output module 440. Wherein:

The data acquisition module 410 may be configured to acquire index data of a plurality of preset indexes of each model table in the data warehouse.

The set initialization module 420 may be configured to divide a plurality of model table levels, establish a model table set corresponding to the model table levels, and allocate a model table to each model table set as an initial element.

The clustering operation module 430 may be configured to perform a clustering operation on each model table according to each initial element, and allocate each model table to the corresponding model table set according to a result of the clustering operation.

The level output module 440 may be configured to assign a corresponding level identifier to each model table according to the model table assignment result.
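The cooperation of the four modules can be sketched as a simple pipeline. The class name `ModelTableLevelApparatus`, its field names, and the stand-in lambdas are hypothetical; the sketch shows only the data flow between modules 410 to 440, not their internal logic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]


@dataclass
class ModelTableLevelApparatus:
    # Module 410: returns table name -> index vector.
    acquire: Callable[[], Dict[str, Vector]]
    # Module 420: returns set name -> initial element (table name).
    initialize: Callable[[Dict[str, Vector]], Dict[str, str]]
    # Module 430: returns set name -> member tables.
    cluster: Callable[[Dict[str, Vector], Dict[str, str]], Dict[str, List[str]]]
    # Module 440: returns table name -> level identifier.
    output: Callable[[Dict[str, List[str]]], Dict[str, str]]

    def run(self) -> Dict[str, str]:
        vectors = self.acquire()
        seeds = self.initialize(vectors)
        members = self.cluster(vectors, seeds)
        return self.output(members)


# Trivial stand-ins for each module, used only to show the wiring.
apparatus = ModelTableLevelApparatus(
    acquire=lambda: {"table1": [1.0], "table2": [0.0]},
    initialize=lambda vectors: {"A": "table1", "D": "table2"},
    cluster=lambda vectors, seeds: {s: [t] for s, t in seeds.items()},
    output=lambda members: {t: s for s, ts in members.items() for t in ts},
)
result = apparatus.run()
```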

In some embodiments of the invention, based on the foregoing scheme, the preset indexes comprise a plurality of the following: the number of fields, the number of records, the number of tasks, the number of source tables, the number of queries in a statistics period, the number of downloads in a statistics period, the script running time, the number of updates in a statistics period, the number of users in a statistics period, and the number of user comments.

In some embodiments of the present invention, based on the foregoing scheme, assigning a model table as an initial element to each of the model table sets may include:

sorting each model table according to each first judgment data;

In some embodiments of the invention, based on the foregoing scheme, performing the clustering operation on each model table according to each initial element comprises:

Since each functional module of the model table level determination apparatus 400 of the data warehouse according to the exemplary embodiment of the present invention corresponds to a step of the above-described exemplary embodiment of the model table level determination method of the data warehouse, the modules are not described again here.

In an exemplary embodiment of the present invention, there is also provided an electronic device capable of implementing the above method.

Referring now to fig. 5, a block diagram is shown of a computer system 500 suitable for implementing an electronic device of an embodiment of the present invention. The computer system 500 of the electronic device shown in fig. 5 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.

As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU) 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for system operation are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.

The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as necessary, so that a computer program read out therefrom is installed into the storage section 508 as necessary.

In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The above-described functions defined in the system of the present application are executed when the computer program is executed by the Central Processing Unit (CPU) 501.

It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present invention may be implemented by software or by hardware, and the described units may also be disposed in a processor. The names of these units do not, in some cases, constitute a limitation on the units themselves.

As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the model table level determination method for a data warehouse as described in the above embodiments.

For example, the electronic device may implement the following steps as shown in fig. 2: step S210, acquiring index data of a plurality of preset indexes of each model table in the data warehouse; step S220, dividing a plurality of model table levels, establishing model table sets corresponding to the model table levels, and distributing a model table to each model table set to serve as an initial element; step S230, performing a clustering operation on each model table according to each initial element, and distributing each model table to the corresponding model table set according to the result of the clustering operation; and step S240, distributing a corresponding level identifier to each model table according to the distribution result of the model tables.

It should be noted that although several modules or units of a device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.

Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiment of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which can be a personal computer, a server, a touch terminal, or a network device, etc.) to execute the method according to the embodiment of the present invention.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims

1. A method for judging the level of a model table of a data warehouse is characterized by comprising the following steps:

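2. The method for determining the model table level of the data warehouse according to claim 1, wherein the preset indexes include a plurality of the following: the number of fields, the number of records, the number of tasks, the number of source tables, the number of queries in a statistics period, the number of downloads in a statistics period, the script running time, the number of updates in a statistics period, the number of users in a statistics period, and the number of user comments.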

3. The method of claim 1, wherein assigning a model table as an initial element to each of the sets of model tables comprises:

sorting each model table according to each first judgment data;

4. The method for determining the model table level of a data warehouse according to any one of claims 1 to 3, wherein clustering each model table according to each initial element comprises:

5. An apparatus for determining a model table level of a data warehouse, comprising:

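6. The model table level determination device of the data warehouse according to claim 5, wherein the preset indexes include a plurality of the following: the number of fields, the number of records, the number of tasks, the number of source tables, the number of queries in a statistics period, the number of downloads in a statistics period, the script running time, the number of updates in a statistics period, the number of users in a statistics period, and the number of user comments.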

7. The apparatus for determining the model table level of a data warehouse of claim 5, wherein assigning a model table as an initial element to each of the model table sets comprises:

Sorting each model table according to each first judgment data;

8. The apparatus for determining the model table level of a data warehouse according to any one of claims 5 to 7, wherein clustering each of the model tables according to each of the initial elements includes:

9. An electronic device, comprising:

A processor; and

A memory having stored thereon computer readable instructions which, when executed by the processor, implement a method of model table level determination for a data warehouse as claimed in any one of claims 1 to 4.

10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method of model table level determination of a data warehouse according to any one of claims 1 to 4.