CN106909566A - A Data Modeling Method and Device - Google Patents
A Data Modeling Method and Device
- Publication number
- CN106909566A (CN201510980569.3A)
- Authority
- CN
- China
- Prior art keywords
- master table
- data modeling
- data
- metadata
- business
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
This application discloses a data modeling method. A master table for data modeling is determined according to the metadata of each source table, and the type of the target table to be generated by the data modeling is determined according to the business meaning of the master table. Slave tables for data modeling are then determined according to the metadata of the master table, and fields for data modeling are selected from the master table and the slave tables. Finally, data modeling is performed according to the master table, the slave tables, and the fields to generate the target table. Data modeling can thus be carried out accurately on the basis of table metadata, which ensures the accuracy and efficiency of the modeling result.
Description
Technical field
The application relates to the field of communications technology, and in particular to a data modeling method. The application also relates to a data modeling device.
Background
With the continuous development of network technology, databases are widely used in information technology. Nearly every sector of society maintains databases that store data closely related to people's daily lives. Data warehouses emerged in order to manage such data in a unified way and to provide better services. A data warehouse is a subject-oriented (Subject Oriented), integrated (Integrated), relatively stable (Non-Volatile) collection of data that reflects historical change (Time Variant). It is used to support management decisions (Decision Making Support) and is created for analytical reporting and decision support. For enterprises that need business intelligence, it provides guidance for improving business processes and for monitoring and controlling time, cost, and quality.
Data modeling is one of the key steps in building a data warehouse. It refers to abstracting and organizing the various kinds of real-world data and determining the scope and organization of the data that the warehouse will manage, until the model is turned into an actual data warehouse. Turning data warehouse model construction into a tool-supported process can resolve the industry's long-standing reliance on experience-based and manual modeling, and can also better serve the construction and optimization of a group's shared data layer.
During data modeling, a data modeler who does not understand the business typically first surveys how the data is used downstream and then models according to the findings. Because such a survey consumes a great deal of manpower, this approach is inefficient and the survey is often incomplete, so twice the effort yields half the result. A data modeler who does understand the business typically models from experience. That approach, however, is not guided by data, so the accuracy of the model cannot be guaranteed.
How to perform data modeling quickly while ensuring accuracy has therefore become a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
This application provides a data modeling method for improving the accuracy and efficiency of data modeling. The method includes:
Determining a master table for data modeling according to the metadata of each source table;
Determining, according to the business meaning of the master table, the type of the target table to be generated by the data modeling;
Determining slave tables for data modeling according to the metadata of the master table;
Selecting fields for data modeling from the master table and the slave tables;
Performing data modeling according to the master table, the slave tables, and the fields to generate the target table.
Preferably, determining the type of the target table generated by the data modeling according to the business meaning of the master table is specifically:
If the type of the target table is determined to be a fact table according to the business meaning of the master table, determining the specific type of the fact table according to the metadata of the master table, the specific type including: transaction fact table, periodic snapshot fact table, and accumulating snapshot fact table;
If the type of the target table is determined to be a dimension table according to the business meaning of the master table, determining according to the metadata of the master table whether the dimension table needs to be split and, if so, the split mode, the split mode including: horizontal split and vertical split.
Preferably, the metadata includes downstream usage information, and determining the slave tables for data modeling according to the metadata of the master table is specifically:
Obtaining, according to the downstream usage information, the data tables that are associated with the master table;
Obtaining the association information between the master table and each of these data tables, and taking as slave tables the data tables whose association information matches a preset selection strategy.
Preferably, selecting the fields for data modeling from the master table and the slave tables is specifically:
Obtaining the field usage information of the master table and of the slave tables according to the metadata;
Choosing the fields according to the field usage information;
Wherein the field usage information includes at least: field query count, filter-condition count, join count, aggregation count, null-value ratio, and enumerated-value ratio.
Preferably, before data modeling is performed according to the master table, the slave tables, and the fields, the method further includes:
When the target table is a transaction fact table, tagging the business processes of the master table according to the downstream usage information, and determining whether to generate a single-event fact table or a multi-event fact table;
When the target table is an accumulating snapshot fact table, tagging the business processes of the master table in the same way as for a transaction fact table, and tagging the business processes of the other fact tables currently used in the data modeling;
When the target table is a dimension table and the split mode is the horizontal split, splitting the master table horizontally into multiple dimension tables according to the field usage information of the master table;
When the target table is a dimension table and the split mode is the vertical split, according to the association information between the master table and each slave table, generating a core dimension table by data modeling from the master table and the slave tables whose business change is higher than a preset threshold, and generating a custom dimension table by data modeling from the slave tables whose business change is not higher than the preset threshold.
Correspondingly, the application also proposes a data modeling device, including:
A first determining module, which determines a master table for data modeling according to the metadata of each source table;
A second determining module, which determines, according to the business meaning of the master table, the type of the target table to be generated by the data modeling;
A third determining module, which determines slave tables for data modeling according to the metadata of the master table;
A selecting module, which selects fields for data modeling from the master table and the slave tables;
A modeling module, which performs data modeling on the master table, the slave tables, and the fields to generate the target table.
Preferably, the second determining module is specifically configured to:
If the type of the target table is determined to be a fact table according to the business meaning of the master table, determine the specific type of the fact table according to the metadata of the master table, the specific type including: transaction fact table, periodic snapshot fact table, and accumulating snapshot fact table;
If the type of the target table is determined to be a dimension table according to the business meaning of the master table, determine according to the metadata of the master table whether the dimension table needs to be split and, if so, the split mode, the split mode including: horizontal split and vertical split.
Preferably, the metadata includes downstream usage information, and the third determining module is specifically configured to:
Obtain, according to the downstream usage information, the data tables that are associated with the master table;
Obtain the association information between the master table and each of these data tables, and take as slave tables the data tables whose association information matches a preset selection strategy.
Preferably, the selecting module is specifically configured to:
Obtain the field usage information of the master table and of the slave tables according to the metadata;
Choose the fields according to the field usage information;
Wherein the field usage information includes at least: field query count, filter-condition count, join count, aggregation count, null-value ratio, and enumerated-value ratio.
Preferably, the device also includes a processing module, wherein:
When the target table is a transaction fact table, the processing module tags the business processes of the master table according to the downstream usage information and determines whether to generate a single-event fact table or a multi-event fact table;
When the target table is an accumulating snapshot fact table, the processing module tags the business processes of the master table in the same way as for a transaction fact table, and tags the business processes of the other fact tables currently used in the data modeling;
When the target table is a dimension table and the split mode is the horizontal split, the processing module splits the master table horizontally into multiple dimension tables according to the field usage information of the master table;
When the target table is a dimension table and the split mode is the vertical split, the processing module, according to the association information between the master table and each slave table, generates a core dimension table by data modeling from the master table and the slave tables whose business change is higher than a preset threshold, and generates a custom dimension table by data modeling from the slave tables whose business change is not higher than the preset threshold.
It can thus be seen that, by applying the technical solution of this application, a master table for data modeling is determined according to the metadata of each source table, and the type of the target table to be generated by the data modeling is determined according to the business meaning of the master table. Slave tables for data modeling are then determined according to the metadata of the master table, and fields for data modeling are selected from the master table and the slave tables. Finally, data modeling is performed according to the master table, the slave tables, and the fields to generate the target table. Data modeling can thus be carried out accurately on the basis of table metadata, which ensures the accuracy and efficiency of the modeling result.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the data modeling method proposed in this application;
Fig. 2 is a schematic diagram of the relationship between source tables and the target table in a specific embodiment of this application;
Fig. 3 is a schematic diagram of the main modules in a specific embodiment of this application;
Fig. 4 is a schematic structural diagram of the metadata processing module in a specific embodiment of this application;
Fig. 5 is a schematic flowchart of data modeling in a specific embodiment of this application;
Fig. 6 is a schematic structural diagram of the data modeling device proposed in this application.
Detailed description
In the existing field of data warehouse modeling, data warehouse model design mainly follows two schools: Inmon's "third normal form modeling" and Kimball's "dimensional modeling". For a particular data warehouse, the design output of both theories and methods can ultimately be expressed as an ER diagram or an ER-like diagram. In addition, standalone tool products such as DMDWDesigner exist for data warehouse modeling. These technologies can carry out the data modeling process once the master table and the slave tables have been determined. As noted in the background, however, these modeling approaches do not incorporate the theory and methods of data warehouse modeling, and the modeling process is not guided by data (it relies mainly on experience), which leads to inaccurate modeling results.
In view of the above technical problem, this application proposes a data modeling method. As shown in Fig. 1, the method comprises the following steps:
S101, determining a master table for data modeling according to the metadata of each source table.
A master-slave table structure is a data relationship model. The master table is a table in the database whose primary key (primary key) is used to associate it with other tables and serves as the unique identifier of each row in the master table. A slave table is a table that uses the primary key value of the master table as a foreign key (Foreign Key), so it can be queried together with the master table through that foreign key. The data in a slave table depends on the master table, and a query usually ends by joining the master table with its slave tables. The master table can store primary information, such as customer data (customer number, customer name, customer company, customer organization, etc.), while the slave tables store extended customer information (customer order information, customer address information, customer contact information, etc.).
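The master-slave relationship described above can be illustrated with a small sketch using Python's standard `sqlite3` module. The table and column names (`customer`, `customer_order`, `customer_id`, and so on) are hypothetical and chosen only to mirror the customer example in the text; they do not come from the application.

```python
import sqlite3

# Hypothetical schema: "customer" plays the master table,
# "customer_order" plays a slave table that references it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,   -- primary key of the master table
    customer_name TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- foreign key
    amount      REAL NOT NULL
);
INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO customer_order VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Join the slave table to the master table through the foreign key.
rows = conn.execute("""
    SELECT c.customer_name, COUNT(o.order_id), SUM(o.amount)
    FROM customer AS c
    JOIN customer_order AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
print(rows)
```

The query returns one row per customer with the order count and total amount, which is the kind of associated query between a master table and a slave table that the paragraph describes.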
In the technical solution of this application, a target table must be generated from multiple data tables by data modeling, so a master table must first be selected from the current data tables (also called source tables). For example, a target table may be derived from 1 to m tables, namely source table 1, source table 2, ..., source table m. Suppose source table 1 is the selected master table and the other source tables (source table 2, ..., source table m) are the selected slave tables. In general, only one master table is involved in a data modeling process. The relationship is shown in Fig. 2.
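As a rough sketch of how metadata could drive the choice of a master table, the following Python example keeps only ods-layer tables that have no intermediate-layer table yet and picks the one used most heavily downstream. The `TableUsage` structure, its field names, the `has_middle_layer` flag, and the ranking rule are all assumptions made for illustration; the application itself does not prescribe a specific ranking.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableUsage:
    """Hypothetical downstream-usage metadata for one source table."""
    name: str
    layer: str              # assumed label, e.g. "ods" for source-layer tables
    has_middle_layer: bool  # True if an intermediate-layer table already exists
    query_count: float
    join_count: int
    direct_downstream: int


def pick_master_table(tables: List[TableUsage]) -> Optional[TableUsage]:
    """Pick the ods table without an intermediate layer that is used most downstream."""
    candidates = [t for t in tables if t.layer == "ods" and not t.has_middle_layer]
    if not candidates:
        return None
    # Assumed ranking: direct downstream count first, then join count, then query count.
    return max(candidates,
               key=lambda t: (t.direct_downstream, t.join_count, t.query_count))


sources = [
    TableUsage("A1", "ods", False, 835.3, 176, 557),
    TableUsage("B1", "ods", True, 343.7, 127, 290),
    TableUsage("D1", "ods", False, 229.2, 114, 206),
]
master = pick_master_table(sources)
print(master.name)
```

In practice a modeler would probably review the ranked candidates rather than accept the top one automatically, since the business meaning of the table also matters.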
S102, determining, according to the business meaning of the master table, the type of the target table to be generated by the data modeling.
This step determines the type of the target table. Broadly speaking, target tables fall into two classes: fact tables and dimension tables. Their characteristics are as follows:
(1) Fact table
Every data warehouse contains one or more fact tables. A fact table may hold business sales data, such as the data produced by cash-register transactions, and usually contains a large number of rows. The main characteristic of a fact table is that it contains numeric data (facts) that can be aggregated to provide historical data about the relevant business units. Each fact table has an index made up of several parts, which contains the primary keys of the related dimension tables as foreign keys. A fact table contains no descriptive information and no data other than the numeric measure fields and the index fields that relate the facts to the corresponding items in the dimension tables. The measures in a fact table are of two kinds: additive measures and non-additive measures. The most useful measures are additive ones, whose sums are meaningful. Users can obtain summary information by adding up such measures, for example the sales of a particular product across a group of stores in a particular period. Non-additive measures can also appear in fact tables. For example, when temperature is measured at different locations in a building, summing the temperatures of all locations is meaningless, but averaging them is meaningful. A fact table is associated with one or more dimension tables, and a user who creates a cube from a fact table can use one or more dimension tables.
(2) Dimension table
A dimension table can be regarded as the window through which a user analyzes data. It contains the attributes of the fact records in the fact table. Some attributes provide descriptive information, and some specify how the fact table data is to be summarized so as to give analysts useful information. A dimension table contains a hierarchy of attributes that help summarize the data. For example, a dimension table of product information typically contains a hierarchy that divides products into categories such as food, beverages, and non-consumables, each of which is subdivided repeatedly until every product sits at the lowest level.
Each dimension table contains fact attributes that are independent of the other dimension tables. For example, a customer dimension table contains data about customers. The column fields of a dimension table can divide the information into different hierarchical levels.
Based on the above, if a finer division is required in this step, then when the target table is a fact table it can further be determined whether it is a transaction fact table, a periodic snapshot fact table, or an accumulating snapshot fact table. When the target table is a dimension table, it can further be determined whether to split it horizontally into several parallel dimension tables, to split it vertically into a core dimension table and custom dimension tables, or not to split it at all. In a preferred embodiment of this application, the processing is as follows:
(1) If the type of the target table is determined to be a fact table according to the business meaning of the master table, the specific type of the fact table is determined according to the metadata of the master table, the specific type including: transaction fact table, periodic snapshot fact table, and accumulating snapshot fact table;
(2) If the type of the target table is determined to be a dimension table according to the business meaning of the master table, it is determined according to the metadata of the master table whether the dimension table needs to be split and, if so, the split mode, the split mode including: horizontal split and vertical split.
S103, determining slave tables for data modeling according to the metadata of the master table.
Because the purpose of data modeling is to aggregate and join data tables that are related to one another, the preferred embodiment of this application selects slave tables mainly on the basis of the downstream usage information in the metadata. Specifically, the data tables associated with the master table are first obtained according to the downstream usage information. The association information between the master table and each of these data tables is then obtained, and the data tables whose association information matches a preset selection strategy are taken as the slave tables.
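A minimal sketch of one possible selection strategy is shown below: keep the tables whose join count with the master table exceeds a threshold. The `JoinRelation` structure, the function name, and the threshold value of 5 are assumptions for illustration. The sample rows reuse the join counts and join logic listed in Table 2 of the specific embodiment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JoinRelation:
    """Hypothetical join-relationship metadata between the master table and one table."""
    table: str
    join_count: int
    join_logic: str


def select_slave_tables(relations: List[JoinRelation], threshold: int) -> List[str]:
    """Return the tables whose join count with the master table exceeds the threshold."""
    return [r.table for r in relations if r.join_count > threshold]


relations = [
    JoinRelation("G1", 14, "xx.url_item = t1.item_id"),
    JoinRelation("H1", 6, "xx.url = t2.id"),
    JoinRelation("I1", 4, "xx.cookieuid = t3.user_id"),
    JoinRelation("J1", 4, "xx.visitor_id = t4.inf_user_id"),
]
slaves = select_slave_tables(relations, threshold=5)
print(slaves)
```

Other strategies, such as filtering by join type, would fit the same shape by changing the predicate inside `select_slave_tables`.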
S104, selecting fields for data modeling from the master table and the slave tables.
Based on the slave tables selected in S103, this step selects the fields needed for data modeling according to information about each field of the master table and the slave tables. In the preferred embodiment of this application, the field usage information of the master table and of the slave tables is first obtained according to the metadata, and the fields are then chosen according to that field usage information.
In view of the factors that may need to be considered in data modeling, the field usage information should include at least: field query count, filter-condition count, join count, aggregation count, null-value ratio, and enumerated-value ratio. Those skilled in the art may extend this list, and such extensions fall within the scope of protection of this application.
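The following sketch shows one way field usage information might guide field selection. The `FieldUsage` structure mirrors the six metrics named above, but the selection rule (a minimum total usage and a maximum null-value ratio) and the sample field names are assumptions made for illustration, not rules stated in the application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FieldUsage:
    """Hypothetical per-field usage metadata, following the metrics listed in the text."""
    table: str
    name: str
    query_count: int
    filter_count: int
    join_count: int
    aggregation_count: int
    null_ratio: float   # share of rows where the field is null
    enum_ratio: float   # carried for completeness; not used by this simple rule


def select_fields(fields: List[FieldUsage],
                  min_usage: int = 10,
                  max_null_ratio: float = 0.9) -> List[Tuple[str, str]]:
    """Keep fields that are used often enough downstream and are not mostly null."""
    selected = []
    for f in fields:
        usage = f.query_count + f.filter_count + f.join_count + f.aggregation_count
        if usage >= min_usage and f.null_ratio <= max_null_ratio:
            selected.append((f.table, f.name))
    return selected


fields = [
    FieldUsage("master", "item_id", 120, 30, 45, 0, 0.0, 0.0),
    FieldUsage("master", "remark", 2, 0, 0, 0, 0.2, 0.0),        # rarely used
    FieldUsage("slave_1", "legacy_flag", 50, 5, 0, 0, 0.99, 1.0),  # almost always null
    FieldUsage("slave_1", "price", 80, 10, 0, 60, 0.01, 0.0),
]
chosen = select_fields(fields)
print(chosen)
```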
S105, performing data modeling according to the master table, the slave tables, and the fields to generate the target table.
As described above, whether the target table is a fact table or a dimension table is generally determined by the business meaning of the master table. For example, if the master table describes an event (such as someone doing something at a certain place at a certain time), the target table is usually a fact table. If the master table describes an entity (such as a product or a buyer), the target table is usually a dimension table. Accordingly, once the target table is determined to be a fact table, it must be determined whether it is a transaction fact table, a periodic snapshot fact table, or an accumulating snapshot fact table. Once the target table is determined to be a dimension table, it must be determined whether to split the master table: horizontally into several parallel dimension tables, vertically into a core dimension table and custom dimension tables, or not at all.
Based on the above, before this step is finally carried out, the preferred embodiment of this application proposes corresponding processing for the different cases, as follows:
(1) When the target table is a transaction fact table, the business processes of the master table are tagged according to the downstream usage information, and it is determined whether to generate a single-event fact table or a multi-event fact table;
(2) When the target table is an accumulating snapshot fact table, the business processes of the master table are tagged in the same way as for a transaction fact table, and the business processes of the other fact tables currently used in the data modeling are also tagged.
In the specific embodiment of this application, a transaction fact table requires tagging the business processes of the master table in order to decide whether to generate a single-event fact table or a multi-event fact table. Business processes are generally the natural business activities of the source system. A trade, for example, typically passes through the business processes of placing the order, paying, and completing the trade. The metadata on which business process tagging is generally based is the downstream usage of fields, mainly the number of times a field appears in filter conditions. Business process fields are generally time fields, and the more often downstream consumers use such a field as a filter condition, the more that business process is a focus of this application.
For a periodic snapshot fact table, the business processes of the master table must also be tagged. Once the business processes covered by this round of modeling have been identified, processing moves to the next step.
For an accumulating snapshot fact table, the business processes of the master table are first tagged as for a transaction fact table. The other fact tables involved in this round of modeling are then introduced and likewise tagged according to the metadata. Once all the business processes involved have been tagged, modeling moves to the next step.
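Business process tagging from filter counts can be sketched as follows. The function name, the sample time fields (`create_time`, `pay_time`, `end_time`), the threshold, and the rule that one tagged process yields a single-event fact table while several yield a multi-event fact table are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def tag_business_processes(filter_counts: Dict[str, int],
                           min_filter_count: int) -> Tuple[List[str], Optional[str]]:
    """Tag the time fields that downstream jobs filter on often enough.

    filter_counts maps a time field of the master table to the number of times
    downstream queries use it as a filter condition (hypothetical metadata).
    """
    ranked = sorted(filter_counts.items(), key=lambda kv: kv[1], reverse=True)
    tagged = [name for name, count in ranked if count >= min_filter_count]
    if not tagged:
        return tagged, None
    # Assumed rule: one tagged process -> single-event, several -> multi-event.
    kind = "multi-event" if len(tagged) > 1 else "single-event"
    return tagged, kind


counts = {"create_time": 320, "pay_time": 410, "end_time": 12}
tagged, kind = tag_business_processes(counts, min_filter_count=100)
print(tagged, kind)
```

Here the rarely filtered `end_time` field is left untagged, so only the order-creation and payment processes would be carried into the fact table.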
(3) When the target table is a dimension table and the split mode is the horizontal split, the master table is split horizontally into multiple dimension tables according to the field usage information of the master table;
(4) When the target table is a dimension table and the split mode is the vertical split, according to the association information between the master table and each slave table, a core dimension table is generated by data modeling from the master table and the slave tables whose business change is higher than a preset threshold, and a custom dimension table is generated by data modeling from the slave tables whose business change is not higher than the preset threshold.
The metadata on which a horizontal split is generally based is the downstream usage of fields, mainly the number of times a field appears in filter conditions. For example, when several BUs share a product table, each BU filters on the BU field when using the table and only touches its own products. This specific embodiment therefore splits the table horizontally by BU, with each BU split into its own dimension table.
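The horizontal split by BU can be sketched in a few lines of Python. The helper names, the `bu` and `category` field names, the filter counts, and the threshold are hypothetical; the sketch only shows the idea of choosing the most frequently filtered field and partitioning the rows of the master table on it.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def pick_split_field(filter_counts: Dict[str, int],
                     min_filter_count: int) -> Optional[str]:
    """Return the most frequently filtered field, if it is filtered often enough."""
    if not filter_counts:
        return None
    name, count = max(filter_counts.items(), key=lambda kv: kv[1])
    return name if count >= min_filter_count else None


def horizontal_split(rows: List[dict], split_field: str) -> Dict[str, List[dict]]:
    """Partition rows into one dimension table per distinct value of split_field."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[split_field]].append(row)
    return dict(parts)


# Hypothetical shared product table and field filter counts.
products = [
    {"item_id": 1, "bu": "BU_A", "title": "tea"},
    {"item_id": 2, "bu": "BU_B", "title": "phone"},
    {"item_id": 3, "bu": "BU_A", "title": "coffee"},
]
split_field = pick_split_field({"bu": 950, "category": 40}, min_filter_count=500)
dims = horizontal_split(products, split_field)
print(sorted(dims))
```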
The metadata on which a vertical split is generally based is the join relationship between the master table and the slave tables and the output times of those tables. Changes in the business can also be taken into account: slave tables whose business changes often are modeled into custom dimension tables, which reduces frequent changes to the target core dimension table. For example, suppose the metadata shows that associated table 1, associated table 2, and associated table 3 are each joined with the master table more often than a certain threshold, so this specific embodiment places all three tables together with the master table in the target table. However, the master table and associated table 1 are produced at 1:00 AM, whereas associated table 2 and associated table 3 are produced at 3:00 AM. So that downstream consumers of the master table and associated table 1 can use the data as early as possible, this specific embodiment models the master table and associated table 1 into a core dimension table, and models the master table, associated table 2, and associated table 3 into a custom dimension table.
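The worked example above can be expressed as a small sketch. The function name, the representation of output times as whole hours, and the sample join counts are assumptions for illustration; only the 1:00 AM and 3:00 AM output times and the threshold idea come from the text.

```python
from typing import Dict, List, Tuple


def vertical_split(master_ready_hour: int,
                   slaves: Dict[str, Tuple[int, int]],
                   join_threshold: int) -> Tuple[List[str], List[str]]:
    """Assign slave tables to the core or the custom dimension table.

    slaves maps a table name to (join_count, ready_hour). Tables joined too rarely
    are dropped. Tables ready no later than the master table go to the core
    dimension table; later ones go to the custom dimension table.
    """
    core, custom = [], []
    for name, (join_count, ready_hour) in slaves.items():
        if join_count <= join_threshold:
            continue
        if ready_hour <= master_ready_hour:
            core.append(name)
        else:
            custom.append(name)
    return core, custom


slaves = {
    "associated_table_1": (20, 1),  # produced at 1:00 AM, same as the master table
    "associated_table_2": (15, 3),  # produced at 3:00 AM
    "associated_table_3": (12, 3),  # produced at 3:00 AM
    "associated_table_4": (2, 1),   # joined too rarely to be included
}
core, custom = vertical_split(master_ready_hour=1, slaves=slaves, join_threshold=5)
print(core, custom)
```

Both resulting dimension tables would still be built on the master table; the split only decides which slave tables join it in each one.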
After the preparatory processing before data modeling has been completed through the above steps, data modeling can be carried out with an existing modeling tool to generate the target table. Examples include foreign tools such as ERWin, ER/Studio, and PowerDesigner, which are ER diagram design tools usable for operational systems (OLTP) or analytical systems (OLAP; a data warehouse is an OLAP system), and the domestic DMDWDesigner data warehouse modeling tool. Provided that the target table can be generated, the choice of a specific data modeling tool does not affect the scope of protection of this application.
To further explain the technical idea of this application, the technical solution is now described with reference to the specific application scenario shown in Fig. 3 and Fig. 4. Fig. 3 shows the main modules in the specific embodiment, including a metadata processing module and a model construction module. Fig. 4 shows a further breakdown of the metadata processing module in Fig. 3.
The downstream usage metadata of a table mainly includes the query count, the scheduling-system query count, the join count, the scheduling-system join count, the aggregation count, the scheduling-system aggregation count, the number of direct downstream tables, the total number of downstream tables, and so on, as shown in Table 1 below:
No. | Project | Table | Query count | Scheduling-system query count | Join count | Direct downstream count | Total downstream count |
1 | A | A1 | 835.3 | 430.9 | 176 | 557 | 121496 |
2 | B | B1 | 343.7 | 160.4 | 127 | 290 | 70501 |
3 | C | C1 | 797.4 | 212.7 | 126 | 234 | 117312 |
4 | D | D1 | 229.2 | 155.2 | 114 | 206 | 160743 |
5 | E | E1 | 113.2 | 61.7 | 65 | 93 | 144155 |
Table 1
The join relationship metadata of a table mainly includes the join master table, the join slave table, the join type, the join count, the join logic, and so on, as shown in Table 2 below:
No. | Project | Associated table | Chinese name | Join count | Join logic |
1 | F | F1 | Table 1 | 0 | Current master table |
2 | G | G1 | Table 2 | 14 | Xx.url_item=t1.item_id |
3 | H | H1 | Table 3 | 6 | Xx.url=t2.id |
4 | I | I1 | Table 4 | 4 | Xx.cookieuid=t3.user_id |
5 | J | J1 | Table 5 | 4 | Xx.visitor_id=t4.inf_user_id |
Table 2
The field-level downstream usage metadata of a table mainly includes the number of times each field of the table is used downstream in where clauses, select lists, joins, and group by clauses, together with the corresponding counts in the scheduling system, as shown in Table 3 below:
Table 3
Based on above-mentioned metadata table, the idiographic flow schematic diagram of the specific embodiment is as shown in figure 5, main
Comprise the following steps:
Step a) Select the master table: with reference to the metadata, select an ODS-layer table that has no intermediate-layer table but is heavily used downstream.
Step b) Determine the target table: for a fact table, mark the business processes to determine whether a single-event fact table or a multi-event fact table is generated; for a dimension table, determine whether a horizontal split or a vertical split is performed.
Step c) Select the slave tables: the metadata shows the downstream usage of the master table, such as which tables the master table has been joined with, the join counts, the join types, etc. Here, selecting as slave tables those tables whose join count exceeds a certain threshold is taken as the example.
Step d) Select the fields of the master table and the slave tables: the metadata shows the field usage and the data profile of the master table and the slave tables, such as field query count, filter condition count, join count, aggregation count, null-value ratio, enumerated-value ratio, etc. This specific embodiment uses these data to guide the choice of fields.
Step e) Generate the target model: the target model mainly comprises two parts. The first part is the ER diagram of the target model, i.e. which tables are joined to obtain the target model and which fields of those tables are taken. The second part is another presentation of the model, namely the model mapping relationships, including the target table name and comment, the field names and types, the source tables, the source table fields and types, the conversion logic, etc.
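For illustration only, the threshold-based selection of slave tables in step c) might be sketched in Python as follows, using rows 2 to 5 of Table 2 as sample input. The `JoinRelation` record, the `select_slave_tables` function and the threshold value 5 are assumptions introduced for this sketch; the embodiment only states that the join count is compared with a certain threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinRelation:
    """One row of Join relationship metadata (hypothetical structure)."""
    associated_table: str
    join_count: int
    join_logic: str


# Rows 2 to 5 of Table 2; row 1 is the master table itself and is omitted.
TABLE_2 = [
    JoinRelation("G1", 14, "Xx.url_item=t1.item_id"),
    JoinRelation("H1", 6, "Xx.url=t2.id"),
    JoinRelation("I1", 4, "Xx.cookieuid=t3.user_id"),
    JoinRelation("J1", 4, "Xx.visitor_id=t4.inf_user_id"),
]


def select_slave_tables(relations, min_join_count):
    """Keep the tables whose join count is strictly greater than the threshold."""
    return [r.associated_table for r in relations if r.join_count > min_join_count]


# The threshold 5 is an assumed value, not one given by the embodiment.
print(select_slave_tables(TABLE_2, min_join_count=5))  # prints ['G1', 'H1']
```

With this assumed threshold, only the tables joined 14 and 6 times are kept; a different selection strategy (for example one that also considers the join type) would replace the single comparison in `select_slave_tables`.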
By applying the scheme of the above embodiment, the theory and methods of data warehouse modeling are combined with a metadata-driven approach, so that modeling is guided by data, which improves the accuracy and efficiency of modeling.
To achieve the above technical purpose, the application also proposes a data modeling device which, as shown in Fig. 6, comprises:
A first determining module 610, which determines the master table for data modeling according to the metadata of each source table;
A second determining module 620, which determines, according to the business meaning of the master table, the type of the target table to be generated by the data modeling;
A third determining module 630, which determines the slave tables for data modeling according to the metadata of the master table;
A selecting module 640, which selects the fields for data modeling from the master table and the slave tables;
A modeling module 650, which performs data modeling according to the master table, the slave tables and the fields to generate the target table.
In a specific application scenario, the second determining module is specifically configured to:
If the type of the target table is determined to be a fact table according to the business meaning of the master table, determine the specific type of the fact table according to the metadata of the master table, the specific type including: transaction fact table, periodic snapshot fact table and accumulating snapshot fact table;
If the type of the target table is determined to be a dimension table according to the business meaning of the master table, determine, according to the metadata of the master table, whether the dimension table needs to be split and the split mode, the split mode including: horizontal split and vertical split.
In a specific application scenario, the metadata includes downstream usage information, and the third determining module is specifically configured to:
Obtain, according to the downstream usage information, the data tables that are associated with the master table;
Obtain the association information between the master table and each of the data tables, and take as the slave tables the data tables whose association information matches a preset selection strategy.
In a specific application scenario, the selecting module is specifically configured to:
Obtain the field usage information of the master table and of the slave tables respectively according to the metadata;
Choose the fields according to the field usage information;
Wherein the field usage information at least includes: field query count, filter condition count, join count, aggregation count, null-value ratio and enumerated-value ratio.
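As a hedged illustration of how the selecting module might use the field usage information, the following Python sketch keeps a field when it is used often enough downstream and is not mostly null. The `FieldUsage` record, the `choose_fields` function, both thresholds, the way the counts are summed and the sample fields are all assumptions made for this sketch; the application lists the usage indicators but does not state how they are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldUsage:
    """Usage indicators of one field (hypothetical structure)."""
    name: str
    query_count: int
    filter_count: int
    join_count: int
    aggregate_count: int
    null_ratio: float
    enum_ratio: float  # carried along for completeness, not used by the rule below


def choose_fields(usages, min_total_uses=10, max_null_ratio=0.5):
    """Keep fields with enough downstream uses and an acceptable null-value ratio."""
    chosen = []
    for u in usages:
        total = u.query_count + u.filter_count + u.join_count + u.aggregate_count
        if total >= min_total_uses and u.null_ratio <= max_null_ratio:
            chosen.append(u.name)
    return chosen


# Hypothetical sample data, not taken from the application.
SAMPLE = [
    FieldUsage("item_id", 20, 5, 14, 0, 0.0, 0.0),  # heavily used, never null
    FieldUsage("url", 3, 1, 0, 0, 0.1, 0.0),        # rarely used downstream
    FieldUsage("memo", 30, 0, 0, 0, 0.9, 0.0),      # used, but mostly null
]

print(choose_fields(SAMPLE))  # prints ['item_id']
```

A simple sum with two cut-offs keeps the rule easy to inspect; weighting the individual counts differently would be an equally valid reading of the text.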
In a specific application scenario, the device further includes a processing module, wherein:
When the target table is a transaction fact table, the processing module marks the business processes of the master table according to the downstream usage information and determines whether a single-event fact table or a multi-event fact table is generated;
When the target table is an accumulating snapshot fact table, the processing module marks the business processes of the master table according to the transaction fact table, and marks the business processes of the other fact tables currently used for the data modeling;
When the target table is a dimension table and the split mode is the horizontal split, the processing module horizontally splits the master table into multiple dimension tables according to the field usage information of the master table;
When the target table is a dimension table and the split mode is the vertical split, the processing module, according to the association information between the master table and each slave table, generates a core dimension table through the data modeling from the master table and the slave tables whose business change is higher than a preset threshold, and generates a custom dimension table through the data modeling from the slave tables whose business change is not higher than the preset threshold.
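The vertical split described above can be sketched, under stated assumptions, as a partition of the slave tables around the preset threshold. In the Python sketch below, the `vertical_split` function, the numeric "business change" score per slave table, the table names and the threshold value are hypothetical; the application does not specify how business change is measured.

```python
def vertical_split(master_table, business_change, threshold):
    """Partition slave tables for a vertical split of a dimension table.

    business_change maps each slave table name to an assumed numeric score.
    Returns (core_sources, custom_sources): the tables feeding the core
    dimension table and those feeding the custom dimension table.
    """
    # The master table and the slave tables above the threshold form the core dimension table.
    core_sources = [master_table] + [
        table for table, score in business_change.items() if score > threshold
    ]
    # Slave tables not above the threshold form the custom dimension table.
    custom_sources = [
        table for table, score in business_change.items() if score <= threshold
    ]
    return core_sources, custom_sources


# Hypothetical scores and threshold, for illustration only.
core, custom = vertical_split(
    "dim_user", {"user_profile": 0.8, "user_tags": 0.2, "user_level": 0.5}, threshold=0.5
)
print(core)    # prints ['dim_user', 'user_profile']
print(custom)  # prints ['user_tags', 'user_level']
```

A score exactly equal to the threshold goes to the custom dimension table, matching the "not higher than" wording of the text.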
By applying the technical scheme of the application, after the master table for data modeling is determined according to the metadata of each source table and the type of the target table to be generated by the data modeling is determined according to the business meaning of the master table, the slave tables for data modeling are determined according to the metadata of the master table, the fields for data modeling are selected from the master table and the slave tables, and finally data modeling is performed according to the master table, the slave tables and the fields to generate the target table. Data modeling can thus be carried out accurately on the basis of the metadata of the data tables, ensuring the accuracy and efficiency of the data modeling result.
Through the above description of the embodiments, those skilled in the art can clearly understand that the application may be implemented in hardware, or in software together with a necessary general-purpose hardware platform. Based on this understanding, the technical scheme of the application may be embodied in the form of a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive or a portable hard disk) and which includes a number of instructions that cause a computer device (such as a personal computer, a server or a network device) to perform the methods described in the implementation scenarios of the application.
Those skilled in the art will appreciate that the accompanying drawings are schematic diagrams of a preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to implement the application.
Those skilled in the art will appreciate that the modules of the device in an implementation scenario may be distributed in the device as described for that scenario, or may be changed accordingly and located in one or more devices different from that of the implementation scenario. The modules of the above implementation scenarios may be merged into one module or further split into multiple submodules.
The sequence numbers used in the application are for description only and do not indicate the relative merits of the implementation scenarios.
What is disclosed above is only several specific implementation scenarios of the application. The application is not limited to these, however, and any change that those skilled in the art can conceive of shall fall within the protection scope of the application.
Claims (10)
1. A data modeling method, characterized by comprising:
determining, according to the metadata of each source table, a master table for data modeling;
determining, according to the business meaning of the master table, the type of the target table to be generated by the data modeling;
determining, according to the metadata of the master table, slave tables for data modeling;
selecting, from the master table and the slave tables, fields for data modeling;
performing data modeling according to the master table, the slave tables and the fields to generate the target table.
2. The method of claim 1, characterized in that determining, according to the business meaning of the master table, the type of the target table to be generated by the data modeling is specifically:
if the type of the target table is determined to be a fact table according to the business meaning of the master table, determining the specific type of the fact table according to the metadata of the master table, the specific type including: transaction fact table, periodic snapshot fact table and accumulating snapshot fact table;
if the type of the target table is determined to be a dimension table according to the business meaning of the master table, determining, according to the metadata of the master table, whether the dimension table needs to be split and the split mode, the split mode including: horizontal split and vertical split.
3. The method of claim 2, characterized in that the metadata includes downstream usage information, and determining, according to the metadata of the master table, the slave tables for data modeling is specifically:
obtaining, according to the downstream usage information, the data tables that are associated with the master table;
obtaining the association information between the master table and each of the data tables, and taking as the slave tables the data tables whose association information matches a preset selection strategy.
4. The method of claim 1, characterized in that selecting, from the master table and the slave tables, the fields for data modeling is specifically:
obtaining the field usage information of the master table and of the slave tables respectively according to the metadata;
choosing the fields according to the field usage information;
wherein the field usage information at least includes: field query count, filter condition count, join count, aggregation count, null-value ratio and enumerated-value ratio.
5. The method of any one of claims 3 or 4, characterized in that, before performing data modeling according to the master table, the slave tables and the fields, the method further comprises:
when the target table is a transaction fact table, marking the business processes of the master table according to the downstream usage information, and determining whether a single-event fact table or a multi-event fact table is generated;
when the target table is an accumulating snapshot fact table, marking the business processes of the master table according to the transaction fact table, and marking the business processes of the other fact tables currently used for the data modeling;
when the target table is a dimension table and the split mode is the horizontal split, horizontally splitting the master table into multiple dimension tables according to the field usage information of the master table;
when the target table is a dimension table and the split mode is the vertical split, generating, according to the association information between the master table and each slave table, a core dimension table through the data modeling from the master table and the slave tables whose business change is higher than a preset threshold, and generating a custom dimension table through the data modeling from the slave tables whose business change is not higher than the preset threshold.
6. A data modeling device, characterized by comprising:
a first determining module, which determines a master table for data modeling according to the metadata of each source table;
a second determining module, which determines, according to the business meaning of the master table, the type of the target table to be generated by the data modeling;
a third determining module, which determines slave tables for data modeling according to the metadata of the master table;
a selecting module, which selects fields for data modeling from the master table and the slave tables;
a modeling module, which performs data modeling according to the master table, the slave tables and the fields to generate the target table.
7. The device of claim 6, characterized in that the second determining module is specifically configured to:
if the type of the target table is determined to be a fact table according to the business meaning of the master table, determine the specific type of the fact table according to the metadata of the master table, the specific type including: transaction fact table, periodic snapshot fact table and accumulating snapshot fact table;
if the type of the target table is determined to be a dimension table according to the business meaning of the master table, determine, according to the metadata of the master table, whether the dimension table needs to be split and the split mode, the split mode including: horizontal split and vertical split.
8. The device of claim 7, characterized in that the metadata includes downstream usage information, and the third determining module is specifically configured to:
obtain, according to the downstream usage information, the data tables that are associated with the master table;
obtain the association information between the master table and each of the data tables, and take as the slave tables the data tables whose association information matches a preset selection strategy.
9. The device of claim 6, characterized in that the selecting module is specifically configured to:
obtain the field usage information of the master table and of the slave tables respectively according to the metadata;
choose the fields according to the field usage information;
wherein the field usage information at least includes: field query count, filter condition count, join count, aggregation count, null-value ratio and enumerated-value ratio.
10. The device of any one of claims 6 or 9, characterized by further comprising a processing module, wherein:
when the target table is a transaction fact table, the processing module marks the business processes of the master table according to the downstream usage information and determines whether a single-event fact table or a multi-event fact table is generated;
when the target table is an accumulating snapshot fact table, the processing module marks the business processes of the master table according to the transaction fact table, and marks the business processes of the other fact tables currently used for the data modeling;
when the target table is a dimension table and the split mode is the horizontal split, the processing module horizontally splits the master table into multiple dimension tables according to the field usage information of the master table;
when the target table is a dimension table and the split mode is the vertical split, the processing module, according to the association information between the master table and each slave table, generates a core dimension table through the data modeling from the master table and the slave tables whose business change is higher than a preset threshold, and generates a custom dimension table through the data modeling from the slave tables whose business change is not higher than the preset threshold.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510980569.3A CN106909566A (en) | 2015-12-23 | 2015-12-23 | A kind of Data Modeling Method and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106909566A (en) | 2017-06-30 |
Family
ID=59200081
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510980569.3A Pending CN106909566A (en) | 2015-12-23 | 2015-12-23 | A kind of Data Modeling Method and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106909566A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170630 |