CN103902653B - Method and apparatus for building a data warehouse table lineage graph - Google Patents

Publication number
CN103902653B
Authority
CN
China
Prior art keywords
data warehouse
sentence
name
operations
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410072773.0A
Other languages
Chinese (zh)
Other versions
CN103902653A (en)
Inventor
陈武
刘超洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZHUHAI DUOWAN INFORMATION TECHNOLOGY Ltd
Original Assignee
ZHUHAI DUOWAN INFORMATION TECHNOLOGY Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZHUHAI DUOWAN INFORMATION TECHNOLOGY Ltd
Priority to CN201410072773.0A
Publication of CN103902653A
Application granted
Publication of CN103902653B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and apparatus for building a data warehouse table lineage graph, belonging to the computer field. The method includes: parsing each data warehouse operation statement that accesses the data warehouse, to obtain the table name of the data warehouse target table accessed by each statement; storing, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the target table it accesses; obtaining, according to the mapping table, the table name of the data warehouse source table corresponding to each target table in the mapping table; and building the data warehouse table lineage graph according to the table name of each target table and the table name of its corresponding source table. The apparatus includes a parsing module, a first storage module, a first acquisition module, and a building module. With the invention, a server can build the data warehouse table lineage graph automatically.

Description

Method and apparatus for building a data warehouse table lineage graph
Technical field
The present invention relates to the computer field, and in particular to a method and apparatus for building a data warehouse table lineage graph.
Background
A data warehouse stores many kinds of business data, and different business data are stored in different business tables. A data warehouse therefore holds many business tables, and how to build a data warehouse table lineage graph from them is a problem in urgent need of a solution.
At present, data warehouse administrators parse the data warehouse operation statements and build the data warehouse table lineage graph by hand. This manual process is error-prone, and because the volume of business data in a data warehouse is very large, it also imposes a heavy workload on the administrators.
Summary of the invention
To solve the problems of the prior art, the present invention provides a method and apparatus for building a data warehouse table lineage graph. The technical solution is as follows:
In one aspect, a method for building a data warehouse table lineage graph is provided, the method including:
parsing each data warehouse operation statement that accesses the data warehouse, to obtain the table name of the data warehouse target table accessed by each data warehouse operation statement;
storing, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse target table it accesses;
obtaining, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse target table in the mapping table;
building the data warehouse table lineage graph according to the table name of each data warehouse target table and the table name of the data warehouse source table corresponding to each data warehouse target table.
Further, parsing each data warehouse operation statement that accesses the data warehouse, to obtain the table name of the data warehouse target table accessed by each data warehouse operation statement, includes:
parsing each data warehouse operation statement that accesses the data warehouse, to obtain the access mode of each such statement;
obtaining the data warehouse operation statements whose access mode is write mode;
parsing the data warehouse operation statements whose access mode is write mode, to obtain the table names of all data warehouse target tables accessed in write mode.
Further, after parsing each data warehouse operation statement that accesses the data warehouse, to obtain the table name of the data warehouse target table accessed by each data warehouse operation statement, the method also includes:
obtaining the data warehouse operation statement whose task type is the import type, together with its corresponding import path;
obtaining, according to the import path, the data warehouse operation statement whose task type is the analysis type and which has the import path;
binding the data warehouse operation statement whose task type is the import type to the data warehouse operation statement whose task type is the analysis type and which has the import path.
Further, obtaining, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse target table in the mapping table includes:
for each record in the mapping table, obtaining the statement identifier of the data warehouse operation statement and the table name of the data warehouse target table stored in the record;
obtaining the data warehouse operation statement according to the obtained statement identifier;
parsing the obtained data warehouse operation statement, to obtain the table name of the data warehouse source table corresponding to each data warehouse target table.
Further, building the data warehouse table lineage graph according to the table name of each data warehouse target table and the table name of the data warehouse source table corresponding to each data warehouse target table includes:
building, in the data warehouse table lineage graph, a node corresponding to the table name of the data warehouse target table and a node corresponding to the table name of the data warehouse source table corresponding to that target table;
making the node corresponding to the table name of the data warehouse target table a child node of the node corresponding to the table name of its data warehouse source table.
Further, after building the node corresponding to the table name of the data warehouse target table, the method also includes:
storing the data warehouse operation statement that accesses the data warehouse target table in the node corresponding to the table name of that target table;
sending the data warehouse table lineage graph to a terminal, which displays it to the user.
In another aspect, the present invention provides an apparatus for building a data warehouse table lineage graph, the apparatus including:
a parsing module, configured to parse each data warehouse operation statement that accesses the data warehouse, to obtain the table name of the data warehouse target table accessed by each data warehouse operation statement;
a first storage module, configured to store, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse target table it accesses;
a first acquisition module, configured to obtain, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse target table in the mapping table;
a building module, configured to build the data warehouse table lineage graph according to the table name of each data warehouse target table and the table name of the data warehouse source table corresponding to each data warehouse target table.
Further, the parsing module includes:
a first parsing unit, configured to parse each data warehouse operation statement that accesses the data warehouse, to obtain the access mode of each such statement;
an acquisition unit, configured to obtain the data warehouse operation statements whose access mode is write mode;
a second parsing unit, configured to parse the data warehouse operation statements whose access mode is write mode, to obtain the table names of all data warehouse target tables accessed in write mode.
Further, the apparatus also includes:
a second acquisition module, configured to obtain the data warehouse operation statement whose task type is the import type, together with its corresponding import path;
a third acquisition module, configured to obtain, according to the import path, the data warehouse operation statement whose task type is the analysis type and which has the import path;
a binding module, configured to bind the data warehouse operation statement whose task type is the import type to the data warehouse operation statement whose task type is the analysis type and which has the import path.
Further, the first acquisition module includes:
a first acquisition unit, configured to obtain, for each record in the mapping table, the statement identifier of the data warehouse operation statement and the table name of the data warehouse target table stored in the record;
a second acquisition unit, configured to obtain the data warehouse operation statement according to the obtained statement identifier;
a third parsing unit, configured to parse the obtained data warehouse operation statement, to obtain the table name of the data warehouse source table corresponding to each data warehouse target table.
Further, the building module includes:
a construction unit, configured to build, in the data warehouse table lineage graph, a node corresponding to the table name of the data warehouse target table and a node corresponding to the table name of the data warehouse source table corresponding to that target table;
an assigning unit, configured to make the node corresponding to the table name of the data warehouse target table a child node of the node corresponding to the table name of its data warehouse source table.
Further, the apparatus also includes:
a second storage module, configured to store the data warehouse operation statement that accesses the data warehouse target table in the node corresponding to the table name of that target table;
a sending module, configured to send the data warehouse table lineage graph to a terminal, which displays it to the user.
In the embodiments of the present invention, the server parses each data warehouse operation statement that accesses the data warehouse, obtains the table name of the data warehouse target table accessed by each statement and the table name of the data warehouse source table corresponding to each target table, and automatically builds the data warehouse table lineage graph from these table names. This reduces manual workload and improves both the speed and the accuracy of building the data warehouse table lineage graph.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for building a data warehouse table lineage graph provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a method for building a data warehouse table lineage graph provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic structural diagram of an apparatus for building a data warehouse table lineage graph provided by Embodiment 3 of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
Embodiment 1
An embodiment of the present invention provides a method for building a data warehouse table lineage graph. Referring to Fig. 1, the method includes:
Step 101: Parse each data warehouse operation statement that accesses the data warehouse, to obtain the table name of the data warehouse target table accessed by each data warehouse operation statement;
Step 102: Store, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse target table it accesses;
Step 103: Obtain, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse target table in the mapping table;
Step 104: Build the data warehouse table lineage graph according to the table name of each data warehouse target table and the table name of the data warehouse source table corresponding to each data warehouse target table.
In the embodiments of the present invention, the server parses each data warehouse operation statement that accesses the data warehouse, obtains the table name of the data warehouse target table accessed by each statement and the table name of the data warehouse source table corresponding to each target table, and automatically builds the data warehouse table lineage graph from these table names. This reduces manual workload and improves both the speed and the accuracy of building the data warehouse table lineage graph.
Embodiment 2
An embodiment of the present invention provides a method for building a data warehouse table lineage graph. Referring to Fig. 2, the method includes:
Step 201: The server obtains, from the data warehouse, the data warehouse operation statements with which each business point accesses the data warehouse;
Specifically, according to the business type, the server obtains from the data warehouse the data warehouse operation statements with which each business point of the same business type accesses the data warehouse, and assigns a statement identifier to each data warehouse operation statement.
The server stores the correspondence between business types and the data warehouse operation statements that access the data warehouse; given a business type, the statements with which each business point of that business type accesses the data warehouse can be obtained from this correspondence.
It should be noted that all data in the data warehouse are operated on through data warehouse operation statements that access the data warehouse. Such a data warehouse operation statement can be:
insert overwrite table hive_table_b
select *
from hive_table_a
...
Step 202: The server parses each data warehouse operation statement that accesses the data warehouse, to obtain the table name of the data warehouse target table accessed by each data warehouse operation statement;
Specifically, the server parses each data warehouse operation statement that accesses the data warehouse and obtains the access mode of each statement. The access mode is either read mode or write mode, and write mode covers both writing data to the data warehouse and writing data to a local file. The server obtains each data warehouse operation statement whose access mode is write mode, parses it with the first regular expression matching rules, and finds the table name of the data warehouse target table accessed by each such statement.
The first regular expression matching rules are as follows:
(1) a character string, containing no whitespace character and no "(", that follows the characters "insert into" and precedes "(";
(2) a character string, containing no whitespace character and no "(", that follows the characters "replace into" and precedes "(";
(3) a character string, containing no whitespace character, that follows the characters "insert overwrite table";
(4) a character string, containing no whitespace character, that follows the characters "overwrite into table".
It should be noted that the above rules are alternatives: a match under any one rule is sufficient. In addition, in the embodiments of the present invention, matching is not case-sensitive for any character.
The first regular expression code is:
(?i)insert\\s+into\\s+([^\\s\\(]+)\\s*\\(|(?i)replace\\s+into\\s+([^\\s\\(]+)\\s*\\(|(?i)insert\\s+overwrite\\s+table\\s+(\\S+)\\s+|(?i)overwrite\\s+into\\s+table\\s+(\\S+)\\s+
It should be noted that various kinds of business data are stored in the data warehouse. The server loads this massive volume of business data into the hive-layer source tables of the data warehouse and then, according to the different businesses, performs different extraction, cleansing, and transformation on the hive-layer source tables to obtain various hive-layer business tables. Based on these hive-layer business tables, the server then performs many finer-grained splits according to the needs of finer-grained businesses, obtaining more hive-layer business tables that make different business analyses and statistics convenient. The server can also, according to the different businesses, analyze and aggregate the business data in the hive-layer business tables and import the results into mysql-layer business tables for web page display. In some special cases, the business data in the mysql-layer business tables can be further extracted and computed to make web page display more convenient.
Rules (1) and (2) of the regular expression matching rules above find mysql-layer table names; rules (3) and (4) find hive-layer table names.
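As an illustration, the four matching rules can be tried out in a short Python sketch. This is not the patented implementation: the function name find_target_tables and the "mysql" and "hive" layer labels are hypothetical, and the pattern is the first regular expression rewritten with single backslashes for a Python raw string.

```python
import re

# The four alternatives of the first regular expression, one per rule.
# Groups 1 and 2 correspond to rules (1) and (2), groups 3 and 4 to rules (3) and (4).
TARGET_TABLE_RE = re.compile(
    r"insert\s+into\s+([^\s(]+)\s*\("
    r"|replace\s+into\s+([^\s(]+)\s*\("
    r"|insert\s+overwrite\s+table\s+(\S+)\s+"
    r"|overwrite\s+into\s+table\s+(\S+)\s+",
    re.IGNORECASE,  # matching is not case-sensitive, as the description requires
)


def find_target_tables(statement):
    """Return (layer, table_name) pairs for every write target in the statement.

    The layer labels are illustrative: rules (1)/(2) are taken to yield
    mysql-layer names and rules (3)/(4) hive-layer names.
    """
    results = []
    for match in TARGET_TABLE_RE.finditer(statement):
        mysql_name = match.group(1) or match.group(2)
        hive_name = match.group(3) or match.group(4)
        if mysql_name:
            results.append(("mysql", mysql_name))
        else:
            results.append(("hive", hive_name))
    return results
```

A statement that matches none of the rules yields an empty list, which a caller could treat as a statement that writes no table.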
Step 203: The server parses each data warehouse operation statement that accesses the data warehouse, to obtain the task type of each such statement;
The task types are analysis, import, and extraction. Splitting hive-layer business tables and analyzing hive-layer business data belong to the analysis type; importing business data from hive-layer business tables into mysql-layer business tables belongs to the import type; extracting and computing the business data of mysql-layer business tables belongs to the extraction type.
Step 204: The server obtains the data warehouse operation statement whose task type is the import type, together with its corresponding import path;
If the task type is the import type, the server can store the correspondence between the data warehouse operation statement and its import path, so that import-type data warehouse operation statements and their corresponding import paths can be obtained according to the task type.
For example, a data warehouse operation statement whose task type is the import type is:
Insert into mysql_table_a(key1,col1,col2)
values(:key1,:col1,:col2)
ON DUPLICATE KEY UPDATE
col1=values(col1),
col2=values(col2);
The import path corresponding to the above data warehouse operation statement is "/path1/path2/".
Further, the server reads all text content under the directory "/path1/path2/" given by the import path and executes the following operation statement:
Insert into mysql_table_a(key1,col1,col2)
values(:key1,:col1,:col2)
ON DUPLICATE KEY UPDATE
col1=values(col1),
col2=values(col2);
Step 205: The server obtains, according to the import path, the data warehouse operation statement whose task type is the analysis type and which has the import path;
Specifically, the server obtains the data warehouse operation statements whose task type is the analysis type and, from among them, obtains the statement that has the same import path as the import-type data warehouse operation statement of step 204.
The data warehouse target table of an import-type data warehouse operation statement can therefore correspond to at least two data warehouse operation statements: one is the current import-type statement, and the others are the analysis-type statements that have the same import path as that import-type statement.
For example, a data warehouse operation statement whose task type is the analysis type and which has the same import path as the import-type data warehouse operation statement is:
insert overwrite local directory '/path1/path2/path3/'
select count(1)
from hive_table_a
Step 206: The server binds the data warehouse operation statement whose task type is the import type to the data warehouse operation statement whose task type is the analysis type and which has the import path;
By binding the import-type statement to the analysis-type statement that has the import path of step 204, the server establishes the association between mysql-layer table names and hive-layer table names.
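A minimal Python sketch of the binding in steps 204 to 206 follows, under a stated assumption: an analysis-type statement is taken to "have" an import path when the local directory it writes to lies under that path, as in the example where '/path1/path2/path3/' lies under "/path1/path2/". The names LOCAL_DIR_RE and bind_import_to_analysis and the dictionary layout are hypothetical.

```python
import re

# Extracts the directory written by "insert overwrite local directory '<dir>'".
# Both straight and typographic quotes are accepted.
LOCAL_DIR_RE = re.compile(
    r"insert\s+overwrite\s+local\s+directory\s*['\"‘’]([^'\"‘’]+)['\"‘’]",
    re.IGNORECASE,
)


def bind_import_to_analysis(import_paths, analysis_statements):
    """Bind each import-type statement to the analysis-type statements sharing its path.

    import_paths:        {import_statement_id: import_path}
    analysis_statements: {analysis_statement_id: statement_text}
    Returns {import_statement_id: [analysis_statement_id, ...]}.
    """
    bindings = {}
    for import_id, import_path in import_paths.items():
        prefix = import_path.rstrip("/") + "/"
        matched = []
        for analysis_id, text in analysis_statements.items():
            found = LOCAL_DIR_RE.search(text)
            # Assumption: prefix containment decides whether the paths agree.
            if found and (found.group(1).rstrip("/") + "/").startswith(prefix):
                matched.append(analysis_id)
        bindings[import_id] = matched
    return bindings
```

Normalizing both paths to end in "/" keeps "/path1/path2" from accidentally matching a sibling such as "/path1/path22/".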
Step 207: The server stores, in the mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse target table it accesses;
Specifically, the server stores in the mapping table the accessed data warehouse target table together with the statement identifier of the import-type data warehouse operation statement corresponding to that target table and the statement identifier of the analysis-type data warehouse operation statement that has the same import path as the import-type statement.
Because this correspondence is stored in the mapping table, the server can use the statement identifier of a data warehouse operation statement to obtain the data warehouse operation statement in which the table name of the accessed data warehouse target table appears.
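The mapping table of step 207 could be modeled as below. This is only a sketch: the record layout (statement identifier, target table name), the function name build_mapping_table, and the two input dictionaries are assumptions made for illustration. The point it shows is that a bound analysis-type statement is recorded against the target table of its import-type statement.

```python
def build_mapping_table(targets, bindings):
    """Build mapping-table records of (statement_id, target_table_name).

    targets:  {statement_id: target_table_name, or None if no table is written}
    bindings: {import_statement_id: [bound_analysis_statement_id, ...]}
    """
    records = []
    # Every statement that writes a table is recorded against that table.
    for statement_id, table in targets.items():
        if table is not None:
            records.append((statement_id, table))
    # A bound analysis-type statement is also recorded against the target
    # table of the import-type statement it is bound to.
    for import_id, analysis_ids in bindings.items():
        table = targets.get(import_id)
        if table is None:
            continue
        for analysis_id in analysis_ids:
            records.append((analysis_id, table))
    return records
```

With this layout, parsing the bound analysis-type statement later on can reveal the hive-layer source of a mysql-layer target table.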
Step 208: The server obtains, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse target table in the mapping table;
Step 208 may specifically include the following steps (1) to (3):
(1) For each record in the mapping table, the server obtains the statement identifier of the data warehouse operation statement and the table name of the data warehouse target table stored in the record;
Specifically, the server goes through the records of the mapping table in turn and obtains, from each record, the statement identifier of the data warehouse operation statement and the table name of the data warehouse target table stored in it.
(2) The server obtains the data warehouse operation statement according to the obtained statement identifier;
The server stores the correspondence between statement identifiers and data warehouse operation statements; given a statement identifier, the corresponding data warehouse operation statement can be obtained from this correspondence.
(3) The server parses the obtained data warehouse operation statement, to obtain the table name of the data warehouse source table corresponding to each data warehouse target table.
Specifically, the server parses the obtained data warehouse operation statement with the second regular expression matching rule and finds the table name of the data warehouse source table, corresponding to the data warehouse target table, that the obtained statement contains.
The second regular expression code is as follows:
"(?i)\\s+"+table+"(\\s+|$|;)"
In the embodiments of the present invention, the server judges whether the obtained data warehouse operation statement contains the table name of a data warehouse target table; if it does, that table name is taken as the table name of the data warehouse source table corresponding to the target table of the obtained statement.
When the task type of the data warehouse operation statement is the analysis type, the server only needs to match hive-layer table names; when the task type is the import type or the extraction type, the server only needs to match mysql-layer table names.
Further, the server can store the table name of each data warehouse target table and the table name of its corresponding data warehouse source table in a second mapping table; a complete data warehouse table lineage graph can be built from the content of the second mapping table.
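Step 208 and the second mapping table can be sketched in Python as follows. Several details are assumptions rather than part of the description: the candidate table names are passed in explicitly, each name is escaped with re.escape before being spliced into the pattern, a statement's own target table is skipped so that no table becomes its own source, and the names find_source_tables and build_second_mapping are hypothetical.

```python
import re


def find_source_tables(statement, target_table, candidate_tables):
    """Return the candidate table names that appear in the statement."""
    sources = []
    for table in candidate_tables:
        if table == target_table:
            continue  # assumption: a table is not treated as its own source
        # Second regular expression: whitespace, the table name, then
        # whitespace, end of text, or a semicolon.
        pattern = r"\s+" + re.escape(table) + r"(\s+|$|;)"
        if re.search(pattern, statement, re.IGNORECASE):
            sources.append(table)
    return sources


def build_second_mapping(mapping_table, statement_store, candidate_tables):
    """Return {target_table_name: [source_table_name, ...]}.

    mapping_table:   records of (statement_id, target_table_name)
    statement_store: {statement_id: statement_text}
    """
    second = {}
    for statement_id, target in mapping_table:
        statement = statement_store[statement_id]
        bucket = second.setdefault(target, [])
        for name in find_source_tables(statement, target, candidate_tables):
            if name not in bucket:
                bucket.append(name)
    return second
```

The trailing group in the pattern is what stops a name such as hive_table_a from matching inside a longer name such as hive_table_ab.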
Step 209: The server builds the data warehouse table lineage graph according to the table name of each data warehouse target table and the table name of the data warehouse source table corresponding to each data warehouse target table;
Step 209 may specifically include the following steps (1) and (2):
(1) In the data warehouse table lineage graph, the server builds a node corresponding to the table name of the data warehouse target table and a node corresponding to the table name of the data warehouse source table corresponding to that target table;
(2) The server makes the node corresponding to the table name of the data warehouse target table a child node of the node corresponding to the table name of its data warehouse source table.
Step 210: The server stores the data warehouse operation statement that accesses the data warehouse target table in the node corresponding to the table name of that target table;
Because the statement that accesses a data warehouse target table is stored in the node corresponding to that table's name in the data warehouse table lineage graph, the server can obtain the statement directly from that node.
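Steps 209 and 210 can be pictured with the following Python sketch. The TableNode class, its field names, and the input shapes are hypothetical; the description only requires that each target table node be a child of its source table node and that the statements accessing a target table be stored in that table's node.

```python
from dataclasses import dataclass, field


@dataclass
class TableNode:
    name: str
    children: list = field(default_factory=list)    # names of tables derived from this one
    statements: list = field(default_factory=list)  # statements that write this table


def build_lineage_graph(second_mapping, mapping_table, statement_store):
    """Return {table_name: TableNode}.

    second_mapping:  {target_table_name: [source_table_name, ...]}
    mapping_table:   records of (statement_id, target_table_name)
    statement_store: {statement_id: statement_text}
    """
    nodes = {}

    def node(name):
        # Create the node on first use, reuse it afterwards.
        if name not in nodes:
            nodes[name] = TableNode(name)
        return nodes[name]

    # Step 209: one node per table; each target is a child of each of its sources.
    for target, sources in second_mapping.items():
        node(target)
        for source in sources:
            parent = node(source)
            if target not in parent.children:
                parent.children.append(target)

    # Step 210: keep the statements that write a table inside that table's node.
    for statement_id, target in mapping_table:
        node(target).statements.append(statement_store[statement_id])
    return nodes
```

Storing child names rather than object references keeps the structure easy to serialize before it is sent to a terminal for display.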
Step 211: The server sends the data warehouse table lineage graph to the terminal;
Step 212: The terminal receives the data warehouse table lineage graph sent by the server and displays it to the user.
In the embodiments of the present invention, the server parses each data warehouse operation statement that accesses the data warehouse, obtains the table name of the data warehouse target table accessed by each statement and the table name of the data warehouse source table corresponding to each target table, and automatically builds the data warehouse table lineage graph from these table names. This reduces manual workload and improves both the speed and the accuracy of building the data warehouse table lineage graph.
Embodiment 3
An embodiment of the present invention provides an apparatus for building a data warehouse table lineage graph. Referring to Fig. 3, the apparatus includes:
a parsing module 301, configured to parse each data warehouse operation statement that accesses the data warehouse, to obtain the table name of the data warehouse target table accessed by each data warehouse operation statement;
a first storage module 302, configured to store, in a mapping table, the correspondence between the statement identifier of each data warehouse operation statement and the table name of the data warehouse target table it accesses;
a first acquisition module 303, configured to obtain, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse target table in the mapping table;
a building module 304, configured to build the data warehouse table lineage graph according to the table name of each data warehouse target table and the table name of the data warehouse source table corresponding to each data warehouse target table.
Further, the parsing module 301 includes:
First parsing unit, configured to parse each data warehouse operation statement that accesses the data warehouse and obtain the access mode of each statement;
Acquiring unit, configured to obtain the data warehouse operation statements whose access mode is write mode;
Second parsing unit, configured to parse the operation statements whose access mode is write mode and obtain the table names of all destination tables accessed by those statements.
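The three units of the parsing module can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a Hive-like SQL dialect, treats INSERT and CREATE TABLE as the only write-mode statements, and uses regular expressions in place of a real parser. The names `access_mode`, `destination_tables` and the sample statements are hypothetical.

```python
import re

# Assumed write-mode patterns for a Hive-like dialect; the source names no dialect.
_WRITE_PATTERNS = [
    re.compile(r"\binsert\s+(?:into|overwrite)\s+(?:table\s+)?([\w.]+)", re.IGNORECASE),
    re.compile(r"\bcreate\s+table\s+(?:if\s+not\s+exists\s+)?([\w.]+)", re.IGNORECASE),
]


def access_mode(statement: str) -> str:
    """First parsing unit: classify a statement as 'write' or 'read'."""
    return "write" if any(p.search(statement) for p in _WRITE_PATTERNS) else "read"


def destination_tables(statement: str) -> list[str]:
    """Second parsing unit: names of all tables the statement writes to, without duplicates."""
    found = []
    for pattern in _WRITE_PATTERNS:
        for name in pattern.findall(statement):
            if name not in found:
                found.append(name)
    return found


# Hypothetical statements keyed by statement identifier.
statements = {
    "s1": "INSERT OVERWRITE TABLE dw.daily_users SELECT uid FROM ods.login_log",
    "s2": "SELECT count(*) FROM dw.daily_users",
}

# Acquiring unit: keep only the write-mode statements.
write_statements = {sid: sql for sid, sql in statements.items() if access_mode(sql) == "write"}
targets = {sid: destination_tables(sql) for sid, sql in write_statements.items()}
```

A regular expression cannot handle every statement form (comments, quoted identifiers, nested statements), so a production system would more plausibly rely on the warehouse's own SQL parser.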
Further, the device also includes:
Second acquisition module, configured to obtain the data warehouse operation statements whose task type is the import type, together with their corresponding import paths;
Third acquisition module, configured to obtain, according to the import path, the data warehouse operation statements whose task type is the analysis type and which have that import path;
Binding module, configured to bind the data warehouse operation statement whose task type is the import type with the data warehouse operation statement whose task type is the analysis type and which has that import path.
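The binding of import-type and analysis-type statements by import path can be sketched as below. The record layout (`Task` with `statement_id`, `task_type`, `path`) and the sample paths are assumptions made for illustration; the source does not specify how tasks or paths are represented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    statement_id: str  # hypothetical identifier of the operation statement
    task_type: str     # "import" or "analysis" (assumed encoding)
    path: str          # import path associated with the statement (assumed field)


def bind_by_import_path(tasks):
    """Map each import-type statement to the analysis-type statements sharing its import path."""
    analysis_by_path = defaultdict(list)
    for task in tasks:
        if task.task_type == "analysis":
            analysis_by_path[task.path].append(task.statement_id)
    return {
        task.statement_id: analysis_by_path.get(task.path, [])
        for task in tasks
        if task.task_type == "import"
    }


tasks = [
    Task("imp1", "import", "/data/export/day1"),
    Task("ana1", "analysis", "/data/export/day1"),
    Task("ana2", "analysis", "/data/export/day2"),
]
bindings = bind_by_import_path(tasks)
```

Here `imp1` is bound to `ana1` because both carry the same path, while `ana2` stays unbound.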
Further, the first acquisition module 303 includes:
First acquisition unit, configured to obtain, for each record in the mapping table, the statement identifier of the data warehouse operation statement and the table name of the destination table stored in that record;
Second acquisition unit, configured to obtain the data warehouse operation statement according to the obtained statement identifier;
Third parsing unit, configured to parse the obtained data warehouse operation statement and obtain the table name of the data warehouse source table corresponding to each destination table.
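The lookup performed by the first acquisition module can be sketched as follows. The sketch assumes that the mapping table is a list of (statement identifier, destination table name) records, that statements can be fetched from a dictionary keyed by identifier, and that source tables are the names following FROM or JOIN. All names and the sample statement are hypothetical.

```python
import re

# Assumption: source tables are the identifiers that follow FROM or JOIN.
_SOURCE_PATTERN = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def source_tables(statement):
    """Third parsing unit: table names read by the statement, without duplicates."""
    names = []
    for name in _SOURCE_PATTERN.findall(statement):
        if name not in names:
            names.append(name)
    return names


def sources_by_destination(mapping_table, statement_store):
    """Walk the mapping table and collect the source tables of each destination table."""
    result = {}
    for statement_id, destination in mapping_table:   # first acquisition unit
        statement = statement_store[statement_id]     # second acquisition unit
        sources = result.setdefault(destination, [])
        for source in source_tables(statement):
            if source != destination and source not in sources:
                sources.append(source)
    return result


statement_store = {
    "s1": (
        "INSERT OVERWRITE TABLE dw.daily_users "
        "SELECT l.uid FROM ods.login_log l JOIN ods.user_info u ON l.uid = u.uid"
    ),
}
mapping_table = [("s1", "dw.daily_users")]
lineage = sources_by_destination(mapping_table, statement_store)
```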
Further, the building module 304 includes:
Construction unit, configured to build, in the data warehouse table lineage graph, a node corresponding to the table name of the destination table and a node corresponding to the table name of the source table corresponding to that destination table;
Setting unit, configured to set the node corresponding to the table name of the destination table as a child node of the node corresponding to the table name of its source table.
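The graph construction can be sketched as below: one node per table name, with each destination node attached as a child of its source nodes. The `TableNode` class and the input dictionary (destination table name mapped to its source table names) are illustrative assumptions, not structures named in the source.

```python
class TableNode:
    """A node of the lineage graph, identified by a table name."""

    def __init__(self, name):
        self.name = name
        self.children = []  # nodes of tables derived from this table


def build_lineage_graph(sources_by_destination):
    """Return a dict of table name -> TableNode with parent/child links filled in."""
    nodes = {}

    def node_for(name):
        # Construction unit: create the node for a table name on first use.
        if name not in nodes:
            nodes[name] = TableNode(name)
        return nodes[name]

    for destination, sources in sources_by_destination.items():
        child = node_for(destination)
        for source in sources:
            parent = node_for(source)
            # Setting unit: the destination node becomes a child of the source node.
            if child not in parent.children:
                parent.children.append(child)
    return nodes


graph = build_lineage_graph({
    "dw.daily_users": ["ods.login_log"],
    "rpt.user_summary": ["dw.daily_users"],
})
```

Because a table can be both a destination and a source, reusing one node per name is what chains individual statements into a multi-level graph.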
Further, the device also includes:
Second storage module, configured to store the data warehouse operation statement that accesses a destination table in the node corresponding to the table name of that destination table;
Sending module, configured to send the data warehouse table lineage graph to the terminal, which displays it to the user.
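Storing statements in nodes and preparing the graph for transmission can be sketched as follows. The source does not say how the graph is encoded or transported, so JSON serialization and the dictionary-based node layout here are assumptions chosen only to make the example concrete.

```python
import json


def attach_statement(graph, destination, statement):
    """Second storage module: keep the statement in the node of the table it writes to."""
    node = graph.setdefault(destination, {"children": [], "statements": []})
    node["statements"].append(statement)


def to_payload(graph):
    """Serialize the graph so that a sending module could transmit it to a terminal."""
    return json.dumps(graph, sort_keys=True)


# Hypothetical graph: table name -> child table names and stored statements.
graph = {
    "ods.login_log": {"children": ["dw.daily_users"], "statements": []},
    "dw.daily_users": {"children": [], "statements": []},
}
attach_statement(
    graph,
    "dw.daily_users",
    "INSERT OVERWRITE TABLE dw.daily_users SELECT uid FROM ods.login_log",
)
payload = to_payload(graph)
```

A terminal receiving `payload` can decode it with `json.loads` and show, for any table, both its derived tables and the statements that produced it.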
In this embodiment of the present invention, the server parses each data warehouse operation statement that accesses the data warehouse to obtain the table name of the destination table accessed by each statement and the table name of the source table corresponding to each destination table. From these table names the server automatically builds the data warehouse table lineage graph, which reduces manual workload and improves both the speed and the accuracy of building data warehouse table lineage.
It should be noted that, when the device provided by the above embodiment builds a data warehouse table lineage graph, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the device for building a data warehouse table lineage graph provided by the above embodiment and the method embodiments for building a data warehouse table lineage graph belong to the same concept; for its specific implementation, see the method embodiments, which are not repeated here.
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (8)

1. A method for building a data warehouse table lineage graph, characterized in that the method comprises:
Parsing each data warehouse operation statement that accesses a data warehouse to obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement;
Obtaining a data warehouse operation statement whose task type is the import type and the corresponding import path;
Obtaining, according to the import path, a data warehouse operation statement whose task type is the analysis type and which has the import path;
Binding the data warehouse operation statement whose task type is the import type with the data warehouse operation statement whose task type is the analysis type and which has the import path;
Storing, in a mapping table, the correspondence between the statement identifier of the data warehouse operation statement whose task type is the import type, the statement identifier of the data warehouse operation statement whose task type is the analysis type and which has the import path, and the table name of the data warehouse destination table accessed;
For each record in the mapping table, obtaining the statement identifier of the data warehouse operation statement and the table name of the data warehouse destination table stored in the record;
Obtaining the data warehouse operation statement according to the obtained statement identifier;
Parsing the obtained data warehouse operation statement to obtain the table name of the data warehouse source table corresponding to each data warehouse destination table;
Building the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table.
2. The method of claim 1, characterized in that parsing each data warehouse operation statement that accesses the data warehouse to obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement comprises:
Parsing each data warehouse operation statement that accesses the data warehouse to obtain the access mode of each data warehouse operation statement;
Obtaining the data warehouse operation statements whose access mode is write mode;
Parsing the data warehouse operation statements whose access mode is write mode to obtain the table names of all data warehouse destination tables accessed by those statements.
3. The method of claim 1, characterized in that building the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table comprises:
Building, in the data warehouse table lineage graph, a node corresponding to the table name of the data warehouse destination table and a node corresponding to the table name of the data warehouse source table corresponding to the data warehouse destination table;
Setting the node corresponding to the table name of the data warehouse destination table as a child node of the node corresponding to the table name of the data warehouse source table corresponding to the data warehouse destination table.
4. The method of claim 3, characterized in that, after building the node corresponding to the table name of the data warehouse destination table, the method further comprises:
Storing the data warehouse operation statement that accesses the data warehouse destination table in the node corresponding to the table name of the data warehouse destination table;
Sending the data warehouse table lineage graph to a terminal, which displays it to the user.
5. A device for building a data warehouse table lineage graph, characterized in that the device comprises:
A parsing module, configured to parse each data warehouse operation statement that accesses a data warehouse and obtain the table name of the data warehouse destination table accessed by each data warehouse operation statement;
A second acquisition module, configured to obtain a data warehouse operation statement whose task type is the import type and the corresponding import path;
A third acquisition module, configured to obtain, according to the import path, a data warehouse operation statement whose task type is the analysis type and which has the import path;
A binding module, configured to bind the data warehouse operation statement whose task type is the import type with the data warehouse operation statement whose task type is the analysis type and which has the import path;
A first storage module, configured to store, in a mapping table, the correspondence between the statement identifier of the data warehouse operation statement whose task type is the import type, the statement identifier of the data warehouse operation statement whose task type is the analysis type and which has the import path, and the table name of the data warehouse destination table accessed;
A first acquisition module, configured to obtain, according to the mapping table, the table name of the data warehouse source table corresponding to each data warehouse destination table in the mapping table;
A building module, configured to build the data warehouse table lineage graph according to the table name of each data warehouse destination table and the table name of the data warehouse source table corresponding to each data warehouse destination table;
The first acquisition module comprising:
A first acquisition unit, configured to obtain, for each record in the mapping table, the statement identifier of the data warehouse operation statement and the table name of the data warehouse destination table stored in the record;
A second acquisition unit, configured to obtain the data warehouse operation statement according to the obtained statement identifier;
A third parsing unit, configured to parse the obtained data warehouse operation statement and obtain the table name of the data warehouse source table corresponding to each data warehouse destination table.
6. The device of claim 5, characterized in that the parsing module comprises:
A first parsing unit, configured to parse each data warehouse operation statement that accesses the data warehouse and obtain the access mode of each data warehouse operation statement;
An acquiring unit, configured to obtain the data warehouse operation statements whose access mode is write mode;
A second parsing unit, configured to parse the data warehouse operation statements whose access mode is write mode and obtain the table names of all data warehouse destination tables accessed by those statements.
7. The device of claim 5, characterized in that the building module comprises:
A construction unit, configured to build, in the data warehouse table lineage graph, a node corresponding to the table name of the data warehouse destination table and a node corresponding to the table name of the data warehouse source table corresponding to the data warehouse destination table;
A setting unit, configured to set the node corresponding to the table name of the data warehouse destination table as a child node of the node corresponding to the table name of the data warehouse source table corresponding to the data warehouse destination table.
8. The device of claim 7, characterized in that the device further comprises:
A second storage module, configured to store the data warehouse operation statement that accesses the data warehouse destination table in the node corresponding to the table name of the data warehouse destination table;
A sending module, configured to send the data warehouse table lineage graph to a terminal, which displays it to the user.
CN201410072773.0A 2014-02-28 2014-02-28 A kind of method and apparatus for building data warehouse table genetic connection figure Active CN103902653B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410072773.0A CN103902653B (en) 2014-02-28 2014-02-28 A kind of method and apparatus for building data warehouse table genetic connection figure

Publications (2)

Publication Number Publication Date
CN103902653A CN103902653A (en) 2014-07-02
CN103902653B true CN103902653B (en) 2017-08-01

Family

ID=50993976

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410072773.0A Active CN103902653B (en) 2014-02-28 2014-02-28 A kind of method and apparatus for building data warehouse table genetic connection figure

Country Status (1)

Country Link
CN (1) CN103902653B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915390A (en) * 2015-05-25 2015-09-16 广州精点计算机科技有限公司 ETL data lineage query system and query method
CN105868521A (en) * 2015-12-14 2016-08-17 乐视网信息技术(北京)股份有限公司 Data information processing method and apparatus
CN106997369B (en) * 2016-01-26 2020-11-24 阿里巴巴集团控股有限公司 Data cleaning method and device
CN107239458B (en) * 2016-03-28 2021-01-29 阿里巴巴集团控股有限公司 Method and device for calculating development object relationship based on big data
CN108132957B (en) * 2016-12-01 2021-09-10 中国移动通信有限公司研究院 Database processing method and device
CN110019384B (en) * 2017-08-15 2023-06-27 阿里巴巴集团控股有限公司 Method for acquiring blood edge data, method and device for providing blood edge data
US10769165B2 (en) * 2017-12-20 2020-09-08 Sap Se Computing data lineage across a network of heterogeneous systems
CN108038248B (en) * 2017-12-28 2021-11-26 携程计算机技术(上海)有限公司 ETL dependency automatic identification method and system
CN110019315A (en) * 2018-06-19 2019-07-16 杭州数澜科技有限公司 A kind of method and apparatus for the parsing of data blood relationship
CN109614432B (en) * 2018-12-05 2021-01-05 北京百分点信息科技有限公司 System and method for acquiring data blood relationship based on syntactic analysis
CN109669981A (en) * 2018-12-21 2019-04-23 成都四方伟业软件股份有限公司 Data relationship management method, device, data relationship acquisition methods and storage medium
CN109857818B (en) * 2019-02-03 2021-09-14 北京字节跳动网络技术有限公司 Method and device for determining production relation, storage medium and electronic equipment
CN110008291B (en) * 2019-04-10 2022-03-11 北京字节跳动网络技术有限公司 Data early warning method and device, storage medium and electronic equipment
CN110232056B (en) * 2019-05-21 2022-02-25 苏宁云计算有限公司 Blood margin analysis method and tool of structured query language
CN110795509B (en) * 2019-09-29 2024-02-09 北京淇瑀信息科技有限公司 Method and device for constructing index blood-margin relation graph of data warehouse and electronic equipment
CN111125229A (en) * 2019-12-24 2020-05-08 杭州数梦工场科技有限公司 Data blood margin generation method and device and electronic equipment
CN111639143B (en) * 2020-06-05 2020-12-22 广州市玄武无线科技股份有限公司 Data blood relationship display method and device of data warehouse and electronic equipment
CN111782738B (en) * 2020-08-14 2021-08-17 北京斗米优聘科技发展有限公司 Method and device for constructing database table level blood relationship
CN112231203A (en) * 2020-09-28 2021-01-15 四川新网银行股份有限公司 Data warehouse test analysis method based on blood relationship
CN112434042A (en) * 2020-12-03 2021-03-02 深圳市欢太科技有限公司 Data relationship construction method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101609473A (en) * 2009-07-30 2009-12-23 金蝶软件(中国)有限公司 A kind of method of Structured Query Language (SQL) of reconstruct report query and device
CN101859303A (en) * 2009-04-07 2010-10-13 中国移动通信集团湖北有限公司 Metadata management method and management system
CN102239458A (en) * 2008-12-02 2011-11-09 起元技术有限责任公司 Visualizing relationships between data elements
US8468120B2 (en) * 2010-08-24 2013-06-18 International Business Machines Corporation Systems and methods for tracking and reporting provenance of data used in a massively distributed analytics cloud
CN103186541A (en) * 2011-12-27 2013-07-03 阿里巴巴集团控股有限公司 Generation method and device for mapping relationship

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"数据仓库元数据的管理与运用";杨玢玢;《中国优秀硕士学位论文全文数据库 信息科技辑》;20111215;全文 *
"面向疑点核实的数据路径追踪技术研究";衡铁刚;《中国优秀硕士学位论文全文数据库 信息科技辑》;20120515;全文 *

Also Published As

Publication number Publication date
CN103902653A (en) 2014-07-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP02 Change in the address of a patent holder
CP02 Change in the address of a patent holder

Address after: 519000 High-tech Zone, Zhuhai City, Guangdong Province, Unit 1, Fourth Floor C, Building A, Headquarters Base No. 1, Qianwan Third Road, Tangjiawan Town

Patentee after: ZHUHAI DUOWAN INFORMATION TECHNOLOGY Ltd.

Address before: 519080 Zone B, 1st Floor, Convention Center, No. 1, Software Park Road, Tangjiawan Town, Zhuhai, Guangdong

Patentee before: ZHUHAI DUOWAN INFORMATION TECHNOLOGY Ltd.