CN109325078A - Method and device for determining data lineage based on structured data

Info

Publication number
CN109325078A
Authority
CN
China
Prior art keywords
source
field
inventory
information
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811090154.9A
Other languages
Chinese (zh)
Inventor
梁福坤
张传凯
刘海宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rajax Network Technology Co Ltd
Lazhasi Network Technology Shanghai Co Ltd
Original Assignee
Lazhasi Network Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lazhasi Network Technology Shanghai Co Ltd
Priority to CN201811090154.9A
Publication of CN109325078A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique

Abstract

The disclosure provides a method and device for determining data lineage based on structured data. The method includes: parsing a select statement to obtain a source abstract syntax tree, and organizing the table information and field information obtained by traversing the source abstract syntax tree into a source inventory layer by layer; parsing an insert statement to obtain a target abstract syntax tree, and organizing the table information and field information obtained by traversing the target abstract syntax tree into a target inventory layer by layer; traversing the source inventory to obtain source table information and traversing the target inventory to obtain target table information, thereby obtaining the table-level data lineage; and taking the target field information of a target table from the target inventory and, starting from the first layer of the source inventory, searching layer by layer for the source field in a source table that has the same name as the target field information, until the source table to which the source field belongs no longer derives from a subquery, at which point that source field is determined to be the source field that has a lineage relationship with the target field information. The scheme enables field-level parsing of data lineage based on structured data.

Description

Method and device for determining data lineage based on structured data
Technical field
The present disclosure relates to the technical field of data processing, and in particular to a method, a device, an electronic device and a computer-readable storage medium for determining data lineage based on structured data.
Background
Data lineage currently has no unified definition; it can generally be understood as the chain along which data is generated. Data lineage describes which tables a given table depends on and how the fields in that table are produced, and can further describe which fields of other tables those fields depend on. Through data lineage, the upstream and downstream dependencies of data production can be known. Data lineage is mainly applied in the big data field, so as background, the overall production flow of big data is outlined first. The overall production flow of big data is generally divided into four layers: data source, production, warehouse and data application. Data sources are mainly MySQL business databases, followed by files on HDFS or FTP, Kafka, MQ and the like; the production layer is mainly based on ETL systems. Routine data production starts from the underlying fact tables and dimension tables, produces intermediate tables from the facts and dimensions, and then generates aggregate tables. When the business volume is very large, the whole system may use thousands or tens of thousands of tables, and extremely complex dependencies form between tables.
Data lineage is mainly used to solve the data interpretability problem in the big data field. Data interpretability is a problem every big data team has to face, and it mainly covers two aspects: data caliber (how a metric is defined and computed) and data dependency. ETL developers often have to explain to data users how their data was produced. The production dependencies commonly recorded in the industry are dependencies between data tables and production tasks; which fields of the upstream data tables a production task actually uses is expressed only in the coding logic and cannot be exposed to data users. In other words, data users see only the result of data production, while the production process remains a black box to them. One solution is to extract the SQL statements used in data production (usually select and insert statements) from the code and turn them into configurable items. The upstream and downstream production dependencies of the data, that is, the data lineage, can then be obtained by parsing the configuration file (mainly the SQL statements). This makes the data and its production process interpretable and greatly reduces the cost of explaining data.
Existing data lineage implementations generally achieve only table-level parsing; that is, a target table can be traced back to the one or several tables it derives from. The prior art can be illustrated by a specific example. Consider a food delivery scenario in which the monthly order volume of each merchant must be computed every month. Typically a join query is run over the order detail table (t_order) and the delivery information table (t_order_logistics_info), and the result is stored in a table called the merchant monthly order table (t_order_shop_all_daily). A join query is needed because, at the system design stage, the business is split across databases and tables to ease development and maintenance; for example, order-related business data may be split and stored in an order table, a user table, a merchant table, a commodity table, a logistics delivery table and so on. To obtain the monthly order volume of a merchant, the order table and the delivery information table must therefore both be queried, and the resulting merchant monthly order table has an upstream-downstream dependency on the order table and the delivery information table. Existing data lineage implementations can parse this dependency: parsing the select statement yields the source table names t_order and t_order_logistics_info, and parsing the insert statement yields the target table name t_order_shop_all_daily, which gives the dependency between the tables. However, if one wants to know exactly which fields of the order table and the delivery information table the merchant monthly order table depends on, the existing technical solutions cannot answer.
Consider the following usage scenario. A table T can no longer meet current requirements because of an unreasonable original design, so its structure needs to be adjusted, possibly by deleting field c. The table has been running in the production environment for some time and is an upstream dependency of many tables, so it must be determined which fields of which tables depend on the field c of table T that is to be deleted. Existing data lineage implementations yield only table-level dependencies and are of no help with field-level dependencies, so the screening and searching must be done manually. If the system contains thousands of tables, this consumes a great deal of manual effort, and statistical errors and omissions cannot be avoided.
Consider another scenario. Analysts query the big data platform with SQL every day to obtain various metrics for analysis. If the system responds to every query within seconds, the user experience is very good, but some queries inevitably take a long time. To improve data output efficiency, the table structures of the system need to be optimized according to users' usage habits. This requires statistics on the tables and fields in the SQL that users run, so as to obtain table heat and field heat information; the tables and fields that users use most frequently are the first to be examined and optimized. The prior art cannot meet this requirement.
With the rapid development of the Internet, the data generated by network applications is growing explosively. How to manage big data production effectively and make data interpretable has become an urgent problem. However, existing data lineage implementations for data production achieve only table-level parsing, which does not allow fine-grained data management. Therefore, a scheme that can achieve field-level parsing of data lineage based on structured data is urgently needed.
Summary of the invention
Embodiments of the present disclosure provide a method, a device, an electronic device and a computer-readable storage medium for determining data lineage based on structured data, so as to achieve field-level parsing of data lineage based on structured data.
In a first aspect, an embodiment of the present disclosure provides a method for determining data lineage based on structured data.
Specifically, the method for determining data lineage based on structured data comprises:
parsing a select statement in the structured data to obtain a source abstract syntax tree, and organizing the table information and field information obtained by traversing the source abstract syntax tree into a source inventory layer by layer, the tables in the source inventory being referred to as source tables;
parsing an insert statement in the structured data to obtain a target abstract syntax tree, and organizing the table information and field information obtained by traversing the target abstract syntax tree into a target inventory layer by layer, the tables in the target inventory being referred to as target tables;
traversing the source inventory to obtain source table information and traversing the target inventory to obtain target table information, so as to obtain the table-level data lineage; and taking the target field information of a target table from the target inventory and, starting from the first layer of the source inventory, searching layer by layer for the source field in a source table that has the same name as the target field information of the target table, until the source table to which the source field belongs no longer derives from a subquery, at which point the corresponding source field is determined to be the source field that has a lineage relationship with the target field information; there being at least one piece of target field information.
With reference to the first aspect, in a first implementation of the first aspect, the source inventory and the target inventory each include at least one layer, each layer includes at least one table, and each table includes at least one field; the number of layers of the structured data in the source inventory or the target inventory is the sum of the number of nested subquery levels and a preset threshold; and the tables of join queries and union queries are in the same layer as the main table.
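The layer, table and field hierarchy described above can be pictured with a small data structure. The following Python sketch is illustrative only: the class names `Table`, `Layer` and `Inventory`, the `from_subquery` flag and the field names `order_id`, `shop_id` and `status` are assumptions introduced here, not names taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Table:
    name: str
    fields: List[str]
    # Assumed flag: True when the table is the result of a subquery
    # rather than a physical table.
    from_subquery: bool = False


@dataclass
class Layer:
    tables: List[Table] = field(default_factory=list)


@dataclass
class Inventory:
    layers: List[Layer] = field(default_factory=list)

    def validate(self) -> None:
        """Check the minimum sizes stated in the text: at least one layer,
        one table per layer and one field per table."""
        if not self.layers:
            raise ValueError("an inventory needs at least one layer")
        for layer in self.layers:
            if not layer.tables:
                raise ValueError("each layer needs at least one table")
            for table in layer.tables:
                if not table.fields:
                    raise ValueError(f"table {table.name!r} needs at least one field")


# A join places the joined table in the same (first) layer as the main table.
source = Inventory(layers=[
    Layer(tables=[
        Table("t_order", ["order_id", "shop_id"]),
        Table("t_order_logistics_info", ["order_id", "status"]),
    ])
])
source.validate()
print(len(source.layers), [t.name for t in source.layers[0].tables])
```

Keeping the subquery flag on the table lets a later lookup decide, per table, whether the search has to continue into a deeper layer.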
With reference to the first aspect and the first implementation of the first aspect, in a second implementation of the first aspect, searching layer by layer, starting from the first layer of the source inventory, for the source field in a source table that has the same name as the target field information of the target table, and determining the corresponding source field to be the source field that has a lineage relationship with the target field information once the source table to which the source field belongs no longer derives from a subquery, comprises:
matching the target field information of the target table against the fields in the source tables of the first layer of the source inventory to find a source field of the same name;
judging whether the source table to which the same-name source field belongs derives from a subquery;
if the source table to which the same-name source field belongs does not derive from a subquery, determining the same-name source field to be the source field that has a lineage relationship with the target field information;
if the source table to which the same-name source field belongs derives from a subquery, matching the same-name source field against the source fields in the source tables of the source inventory layer by layer, starting from the second layer, to find another source field of the same name, until the source table to which that other source field belongs no longer derives from a subquery, at which point the corresponding source field is determined to be the source field that has a lineage relationship with the target field information.
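The layer-by-layer matching steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: each layer is modelled as a list of dictionaries with hypothetical keys `name`, `fields` and `from_subquery`, matching is by exact field name only (aliases and expressions are not handled), and the subquery alias `sub_a` and the field names are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

Layer = List[Dict]


def trace_field(target_field: str, source_layers: List[Layer]) -> Optional[Tuple[str, str, int]]:
    """Return (source table, source field, layer number) for a target field,
    or None when no same-name field can be traced to a non-subquery table."""
    for depth, layer in enumerate(source_layers, start=1):
        # Find the first table in this layer that has a field of the same name.
        match = next((t for t in layer if target_field in t["fields"]), None)
        if match is None:
            return None
        if not match["from_subquery"]:
            # The table is not derived from a subquery: lineage is resolved.
            return (match["name"], target_field, depth)
        # Otherwise continue the search one layer further down.
    return None


source_layers = [
    [{"name": "sub_a", "fields": ["shop_id", "order_cnt"], "from_subquery": True}],
    [
        {"name": "t_order", "fields": ["shop_id", "order_id"], "from_subquery": False},
        {"name": "t_order_logistics_info", "fields": ["order_id", "status"], "from_subquery": False},
    ],
]

print(trace_field("shop_id", source_layers))
print(trace_field("order_cnt", source_layers))
```

In this example `shop_id` resolves to `t_order` in the second layer, while `order_cnt` cannot be resolved by name matching alone because it is computed inside the subquery; a fuller implementation would need to follow expressions and aliases.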
With reference to the first aspect, the first implementation of the first aspect and the second implementation of the first aspect, in a third implementation of the first aspect, the name of the table in each layer of the source inventory and the target inventory includes the name information of the table to which it belongs.
With reference to the first aspect and the first, second and third implementations of the first aspect, in a fourth implementation of the first aspect, parsing the select statement in the structured data to obtain the source abstract syntax tree comprises:
if the select statement is a statement of a first type, generating a first abstract syntax tree using a parser associated with statements of the first type;
if the select statement is a statement of a second type, generating a second abstract syntax tree using a parser associated with statements of the second type;
if the select statement is a statement of a third type, generating a third abstract syntax tree using a parser associated with statements of the third type;
wherein the source abstract syntax tree includes the first abstract syntax tree, the second abstract syntax tree and the third abstract syntax tree.
With reference to the first aspect and the first, second, third and fourth implementations of the first aspect, in a fifth implementation of the first aspect, parsing the insert statement in the structured data to obtain the target abstract syntax tree comprises:
if the insert statement is a statement of the first type, generating a fourth abstract syntax tree using a parser associated with statements of the first type;
if the insert statement is a statement of the second type, generating a fifth abstract syntax tree using a parser associated with statements of the second type;
if the insert statement is a statement of the third type, generating a sixth abstract syntax tree using a parser associated with statements of the third type;
wherein the target abstract syntax tree includes the fourth abstract syntax tree, the fifth abstract syntax tree and the sixth abstract syntax tree.
In a second aspect, an embodiment of the present disclosure provides a device for determining data lineage based on structured data.
Specifically, the device for determining data lineage based on structured data comprises:
a source inventory generation module configured to parse a select statement in the structured data to obtain a source abstract syntax tree, and to organize the table information and field information obtained by traversing the source abstract syntax tree into a source inventory layer by layer, the tables in the source inventory being referred to as source tables;
a target inventory generation module configured to parse an insert statement in the structured data to obtain a target abstract syntax tree, and to organize the table information and field information obtained by traversing the target abstract syntax tree into a target inventory layer by layer, the tables in the target inventory being referred to as target tables;
a data lineage determining module configured to traverse the source inventory to obtain source table information and traverse the target inventory to obtain target table information, so as to obtain the table-level data lineage; and to take the target field information of a target table from the target inventory and, starting from the first layer of the source inventory, search layer by layer for the source field in a source table that has the same name as the target field information of the target table, until the source table to which the source field belongs no longer derives from a subquery, at which point the corresponding source field is determined to be the source field that has a lineage relationship with the target field information; there being at least one piece of target field information.
With reference to the second aspect, in a first implementation of the second aspect, the source inventory and the target inventory each include at least one layer, each layer includes at least one table, and each table includes at least one field; the number of layers of the structured data in the source inventory or the target inventory is the sum of the number of nested subquery levels and a preset threshold; and the tables of join queries and union queries are in the same layer as the main table.
With reference to the second aspect and the first implementation of the second aspect, in a second implementation of the second aspect, the data lineage determining module is configured to:
match the target field information of the target table against the fields in the source tables of the first layer of the source inventory to find a source field of the same name;
judge whether the source table to which the same-name source field belongs derives from a subquery;
if the source table to which the same-name source field belongs does not derive from a subquery, determine the same-name source field to be the source field that has a lineage relationship with the target field information;
if the source table to which the same-name source field belongs derives from a subquery, match the same-name source field against the source fields in the source tables of the source inventory layer by layer, starting from the second layer, to find another source field of the same name, until the source table to which that other source field belongs no longer derives from a subquery, at which point the corresponding source field is determined to be the source field that has a lineage relationship with the target field information.
With reference to the second aspect and the first implementation of the second aspect, in a third implementation of the second aspect, the name of the table in each layer of the source inventory and the target inventory includes the name information of the table to which it belongs.
With reference to the second aspect and the first implementation of the second aspect, in a fourth implementation of the second aspect, the source inventory generation module is configured to:
if the select statement is a statement of the first type, generate a first abstract syntax tree using a parser associated with statements of the first type;
if the select statement is a statement of the second type, generate a second abstract syntax tree using a parser associated with statements of the second type;
if the select statement is a statement of the third type, generate a third abstract syntax tree using a parser associated with statements of the third type;
combine the first abstract syntax tree, the second abstract syntax tree and the third abstract syntax tree into the source abstract syntax tree;
organize the table information and field information obtained by traversing the source abstract syntax tree into the source inventory layer by layer.
With reference to the second aspect and the first implementation of the second aspect, in a fifth implementation of the second aspect, the target inventory generation module is configured to:
if the insert statement is a statement of the first type, generate a fourth abstract syntax tree using a parser associated with statements of the first type;
if the insert statement is a statement of the second type, generate a fifth abstract syntax tree using a parser associated with statements of the second type;
if the insert statement is a statement of the third type, generate a sixth abstract syntax tree using a parser associated with statements of the third type;
combine the fourth abstract syntax tree, the fifth abstract syntax tree and the sixth abstract syntax tree into the target abstract syntax tree;
organize the table information and field information obtained by traversing the target abstract syntax tree into the target inventory layer by layer.
In a third aspect, an embodiment of the present disclosure provides an electronic device including a memory and a processor. The memory is used to store one or more computer instructions that support the device for determining data lineage based on structured data in executing the method for determining data lineage based on structured data of the first aspect, and the processor is configured to execute the computer instructions stored in the memory. The device for determining data lineage based on structured data may further include a communication interface for communication between the device and other equipment or a communication network.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium for storing computer instructions used by the device for determining data lineage based on structured data, including the computer instructions involved in executing the method for determining data lineage based on structured data of the first aspect.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects:
In the above technical solution, table information and field information are managed in layers in the source inventory and the target inventory, so that dependencies between fields can be looked up from structured data, achieving field-level parsing of data lineage based on structured data.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
Other features, objects and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments taken in conjunction with the accompanying drawings. In the drawings:
Fig. 1 shows a flowchart of a method for determining data lineage based on structured data according to an embodiment of the present disclosure;
Fig. 2 shows a schematic diagram of the hierarchical relationship of the data structure of the source inventory and the target inventory in the method for determining data lineage based on structured data according to the embodiment shown in Fig. 1;
Fig. 3 shows a flowchart of a method for determining data lineage based on structured data according to another embodiment of the present disclosure;
Fig. 4 shows a schematic diagram of structured data to be analyzed according to an embodiment of the present disclosure;
Fig. 5 shows a schematic diagram of the layered inventory established for the structured data shown in Fig. 4 by the method for determining data lineage based on structured data according to an embodiment of the present disclosure;
Fig. 6 shows the inventory corresponding to structured data, as determined by a method for determining data lineage based on structured data according to an embodiment of the present disclosure;
Fig. 7 shows the inventory obtained after the method for determining data lineage based on structured data according to an embodiment of the present disclosure adjusts the table names of the inventory shown in Fig. 6;
Fig. 8 shows a select statement in structured data to be analyzed according to another embodiment of the present disclosure;
Fig. 9 shows the inventory obtained after the method for determining data lineage based on structured data according to an embodiment of the present disclosure processes Fig. 8;
Fig. 10 shows an insert statement in structured data to be analyzed according to another embodiment of the present disclosure;
Fig. 11 shows the inventory obtained after the method for determining data lineage based on structured data according to an embodiment of the present disclosure processes Fig. 10;
Fig. 12 shows the table-level data lineage obtained by the method for determining data lineage based on structured data according to an embodiment of the present disclosure from the inventory resulting from processing Fig. 10 and the inventory resulting from processing Fig. 8;
Fig. 13 shows the field-level data lineage obtained by the method for determining data lineage based on structured data according to an embodiment of the present disclosure from the inventory resulting from processing Fig. 10 and the inventory resulting from processing Fig. 8;
Fig. 14 shows a structural block diagram of a device for determining data lineage based on structured data according to an embodiment of the present disclosure;
Fig. 15 shows a structural block diagram of an electronic device according to an embodiment of the present disclosure;
Fig. 16 is a schematic structural diagram of a computer system suitable for implementing the method for determining data lineage based on structured data according to an embodiment of the present disclosure.
Detailed description of embodiments
Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings, so that those skilled in the art can easily implement them. In addition, for the sake of clarity, parts unrelated to the description of the exemplary embodiments are omitted from the drawings.
In the present disclosure, it should be understood that terms such as "include" or "have" are intended to indicate the presence of the features, numbers, steps, acts, components, parts or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, parts or combinations thereof exist or are added.
It should also be noted that, provided there is no conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other. The present disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
In the technical solution provided by the embodiments of the present disclosure, for the complex structured data (for example, SQL statements) containing subqueries, join queries and union queries that data lineage parsing commonly has to handle, the AST information produced by the parser is reorganized using a layering approach, and source fields are then matched and searched layer by layer downward by field name, starting from the target field, thereby achieving field-level data lineage parsing. When a user needs to know the production chain of a certain field, a precise explanation can be provided, which solves the data interpretability problem. This also provides a finer-grained means of management and control for the big data production process and has great practical value.
Fig. 1 shows a flowchart of a method for determining data lineage based on structured data according to an embodiment of the present disclosure. The structured data may include various types of structured data, structured languages, structured representations and the like. For clarity, the present application uses Structured Query Language (SQL) statements as an example of structured data, but those skilled in the art will understand that the application is not limited to SQL. In addition, the application uses the MySQL, Hive and Impala tools and the corresponding parsers as specific examples; likewise, those skilled in the art will understand that the application is not limited to MySQL, Hive and Impala and their parsers, and that any data processing tool and corresponding parser may be used.
As shown in Fig. 1, the method for determining data lineage based on structured data includes the following steps S101-S103:
In step S101, a select statement in the structured data is parsed to obtain a source abstract syntax tree, and the table information and field information obtained by traversing the source abstract syntax tree are organized into a source inventory layer by layer; the tables in the source inventory are referred to as source tables;
In step S102, an insert statement in the structured data is parsed to obtain a target abstract syntax tree, and the table information and field information obtained by traversing the target abstract syntax tree are organized into a target inventory layer by layer; the tables in the target inventory are referred to as target tables;
In step S103, the source inventory is traversed to obtain source table information and the target inventory is traversed to obtain target table information, so as to obtain the table-level data lineage; the target field information of a target table is taken from the target inventory and, starting from the first layer of the source inventory, the source field in a source table that has the same name as the target field information of the target table is searched for layer by layer, until the source table to which the source field belongs no longer derives from a subquery, at which point the corresponding source field is determined to be the source field that has a lineage relationship with the target field information; there is at least one piece of target field information.
The method adopts a layering approach: table information and field information are managed in layers within the source inventory and the target inventory, which makes it possible to look up dependencies between fields and also effectively solves the problem of collecting field popularity statistics.
This embodiment uses a layered data structure whose hierarchy is shown in Fig. 2. The source inventory and the target inventory each include at least one layer, each layer includes at least one table, and each table includes at least one field. The number of layers that the structured data (for example, an SQL statement) occupies in the source inventory or the target inventory is the sum of the number of nested subquery levels and a preset threshold (which may be, but is not limited to, 1), and the tables of join queries and union queries are in the same layer as the main table. Specifically, as shown in Fig. 2, the hierarchy from top to bottom is layer, table, field: each layer may contain one or more tables, and each table may contain one or more fields. For a given SQL statement, if it contains no subquery, it has one layer; if it nests one level of subquery, it has two layers; and so on. A table brought in by a join is in the same layer as the main table, and a table brought in by a union is also in the same layer as the main table.
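The layering rule can be sketched in Python as follows. This is an illustration under assumptions, not the disclosed implementation: the `Query` and `TableRef` classes and the `build_layers` function are hypothetical, a query is modelled only by the tables it reads, and the preset threshold is taken to be 1. Joined or unioned tables sit in the same `tables` list as the main table, so they land in the same layer.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Query:
    # Main table plus any joined or unioned tables of this query level.
    tables: List["TableRef"] = field(default_factory=list)


@dataclass
class TableRef:
    name: str
    subquery: Optional[Query] = None  # set when the table is produced by a nested query


def build_layers(query: Query) -> List[List[str]]:
    """Return the table names layer by layer, walking down one nesting level at a time."""
    layers: List[List[str]] = []
    current = [query]
    while current:
        layer_tables: List[str] = []
        next_queries: List[Query] = []
        for q in current:
            for ref in q.tables:
                layer_tables.append(ref.name)
                if ref.subquery is not None:
                    next_queries.append(ref.subquery)
        layers.append(layer_tables)
        current = next_queries
    return layers


# "a left join b", where both a and b are subqueries (t_shop is a hypothetical table name).
query = Query([
    TableRef("a", Query([TableRef("t_business")])),
    TableRef("b", Query([TableRef("t_shop")])),
])
print(build_layers(query))
```

With one level of nesting and a threshold of 1, the sketch produces two layers, which matches the counting rule stated above.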
In an optional implementation of this embodiment, as shown in Fig. 3, step S101, i.e., the step of parsing the select statement in the SQL statement to obtain the source abstract syntax tree, includes the following steps:
(1) determining whether the select statement is a MySQL statement;
(2) if the select statement is a MySQL statement, generating a first abstract syntax tree with the Druid parser;
(3) determining whether the select statement is a Hive statement;
(4) if the select statement is a Hive statement, generating a second abstract syntax tree with the Hive parser;
(5) determining whether the select statement is an Impala statement;
(6) if the select statement is an Impala statement, generating a third abstract syntax tree with the Impala parser.
Continuing with Fig. 3, step S102, i.e., the step of parsing the insert statement in the SQL statement to obtain the target abstract syntax tree, includes the following steps:
(1) determining whether the insert statement is a MySQL statement;
(2) if the insert statement is a MySQL statement, generating a first abstract syntax tree with the Druid parser;
(3) determining whether the insert statement is a Hive statement;
(4) if the insert statement is a Hive statement, generating a second abstract syntax tree with the Hive parser;
(5) determining whether the insert statement is an Impala statement;
(6) if the insert statement is an Impala statement, generating a third abstract syntax tree with the Impala parser.
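The dialect checks in the two lists above amount to a dispatch from statement type to parser. The Python sketch below shows only that dispatch. It assumes the dialect is already known and passed in as a string, since the text does not say how the type is detected, and the three `parse_*` functions are hypothetical stand-ins that return a placeholder dictionary rather than calling the real Druid, Hive, or Impala parsers.

```python
from typing import Callable, Dict

Ast = Dict[str, str]  # placeholder for whatever tree type a real parser returns


def parse_mysql(sql: str) -> Ast:
    # Stand-in for the Druid parser named in the text; no real parsing happens here.
    return {"parser": "druid", "sql": sql}


def parse_hive(sql: str) -> Ast:
    # Stand-in for the Hive parser.
    return {"parser": "hive", "sql": sql}


def parse_impala(sql: str) -> Ast:
    # Stand-in for the Impala parser.
    return {"parser": "impala", "sql": sql}


PARSERS: Dict[str, Callable[[str], Ast]] = {
    "mysql": parse_mysql,
    "hive": parse_hive,
    "impala": parse_impala,
}


def parse_statement(sql: str, dialect: str) -> Ast:
    """Pick the parser for the given dialect and return its (placeholder) syntax tree."""
    parser = PARSERS.get(dialect.lower())
    if parser is None:
        raise ValueError(f"unsupported SQL dialect: {dialect!r}")
    return parser(sql)


print(parse_statement("select 1", "Hive")["parser"])
```

A lookup table keeps the select-statement and insert-statement paths identical, so supporting a further dialect would only require registering one more parser.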
The part of step S103 that takes the target field information of a target table from the target inventory and, starting from the first layer of the source inventory, searches the source tables layer by layer for the source field with the same name as the target field information, until the source table to which the source field belongs no longer derives from a subquery, at which point that source field is determined to be the source field that has a lineage relationship with the target field information (there being at least one item of target field information), includes the following steps:
matching the target field information of the target table against the fields in the source tables of the first layer of the source inventory to find a source field with the same name;
determining whether the source table to which the same-named source field belongs derives from a subquery;
if the source table to which the same-named source field belongs does not derive from a subquery, determining the same-named source field to be the source field that has a lineage relationship with the target field information;
if the source table to which the same-named source field belongs derives from a subquery, matching the same-named source field, layer by layer starting from the second layer of the source inventory, against the source fields in the source tables to find another same-named source field, until the source table to which that other source field belongs no longer derives from a subquery, at which point the corresponding source field is determined to be the source field that has a lineage relationship with the target field information.
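The matching steps above can be sketched as a short recursive search. This is a simplified illustration, not the disclosed code: the attribute names `is_subquery` and `parent` mirror the `isSubQuery` and `parent` attributes mentioned in the description, but the `fields` mapping (output field name to the columns used to produce it), the `resolve_field` function, and the use of a plain table name as `parent` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class SqlTable:
    name: str
    is_subquery: bool = False
    parent: Optional[str] = None  # simplified: name of the enclosing table one layer up
    # Assumed shape: output field name -> columns of this table used to produce it.
    fields: Dict[str, List[str]] = field(default_factory=dict)


Inventory = List[List[SqlTable]]


def resolve_field(inventory: Inventory, name: str, layer: int = 0,
                  parent: Optional[str] = None) -> Set[Tuple[str, str]]:
    """Follow a field down the layers until its table no longer derives from a subquery."""
    if layer >= len(inventory):
        return set()
    found: Set[Tuple[str, str]] = set()
    for table in inventory[layer]:
        if layer > 0 and table.parent != parent:
            continue  # only look inside the subquery we came from
        for column in table.fields.get(name, []):
            if table.is_subquery:
                # The matched column becomes the field to look for one layer down.
                found |= resolve_field(inventory, column, layer + 1, table.name)
            else:
                found.add((table.name, column))
    return found


source_inventory = [
    [
        SqlTable("a", is_subquery=True, fields={"business_num": ["business_num"]}),
        SqlTable("b", is_subquery=True),  # fields omitted in this sketch
    ],
    [SqlTable("t_business", parent="a", fields={"business_num": ["flag", "shop_id"]})],
]
print(sorted(resolve_field(source_inventory, "business_num")))
```

The recursion stops as soon as a matching field belongs to a table that is not a subquery, which is the termination condition stated in the steps.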
In this embodiment, before an SQL statement is parsed, its type is determined, and MySQL, Hive, and Impala statements are each parsed with a different SQL parser. Although the three parsers generate different AST structures, the table and field information is ultimately organized uniformly into SqlLayer, so the code for table-level and field-level lineage dependency analysis and for table and field popularity statistics only needs to be written against SqlLayer, which reduces the difficulty of implementing the program.
The layered implementation is explained below using a complex SQL statement with multi-level nested queries and join queries as an example. The SQL statement is shown in Fig. 4; using the layering idea above, it can be represented abstractly in the form shown in Fig. 5. Fig. 5 makes it clear that the complex SQL statement of Fig. 4 can be divided into the 3 layers shown in the figure: the first layer contains 1 table, t; the second layer contains 3 tables; and the third layer contains 6 tables. The second-layer tables s1, s2, and s3 are related by left join, which the figure indicates with a green circle containing a plus sign. A downward arrow indicates that the table has a further subquery. After layering, the organizational structure of the SQL statement can be seen clearly, which facilitates the subsequent lineage analysis.
Before field-level dependency analysis can actually begin, one more problem must be solved; it is illustrated in Fig. 6. As Fig. 6 shows, this SQL statement is also divided into 3 layers, and its problem is that tables with the same name exist in different layers. For example, a table named a exists in all 3 layers, which causes confusion during field-level dependency parsing. Suppose a field derives from table a in the third layer: if the three a tables are not distinguished and the lookup is done by table name alone, the field may be resolved to table a in the second layer (if the two different a tables happen to contain a field with the same name). The solution is shown in Fig. 7: a parent attribute is defined in the SqlTable class to store the path name of the upper layer to which the table belongs. The parent of table a in the second layer is b, i.e., table b in the first layer; the parent of table a in the third layer is e.t, i.e., table t under table e of the first layer. Because tables with the same name are not joined within the same layer, parent plus table name uniquely identifies a table. As shown in Fig. 7, table a in the first layer is a, table a in the second layer is b.a, and table a in the third layer is e.t.a, so same-named tables can be uniquely distinguished.
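The parent-plus-name identifier can be illustrated with a few lines of Python. The class name `SqlTable` and the attribute `parent` follow the description; the `qualified_name` property and the dot-joined format are assumptions inferred from the identifiers a, b.a, and e.t.a given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SqlTable:
    name: str
    parent: Optional[str] = None  # path of the enclosing tables in the layers above

    @property
    def qualified_name(self) -> str:
        """Parent path plus table name, used as the unique identifier of the table."""
        return f"{self.parent}.{self.name}" if self.parent else self.name


first_layer_a = SqlTable("a")
second_layer_a = SqlTable("a", parent="b")
third_layer_a = SqlTable("a", parent="e.t")

for table in (first_layer_a, second_layer_a, third_layer_a):
    print(table.qualified_name)
```

Keying lookups on `qualified_name` instead of `name` keeps the three tables named a apart even though they share a table name.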
To illustrate how field-level lineage dependencies are parsed, consider another example with two SQL statements: one select statement and one insert statement. The select statement queries the required fields from the source tables; to show how the layer-by-layer search proceeds, a select statement with a two-layer structure that is not too complex is used. The insert statement inserts fields into the target table. The goal is to find out which tables in the select statement the fields inserted by the insert statement come from. The select statement is shown in Fig. 8, and with the layering idea it can be represented as the structure shown in Fig. 9.
The hierarchy diagram includes the select fields under each table. The first layer consists of a left join b; table a contains a subquery and table b also contains a subquery, and both subqueries are in the second layer. Next, consider the insert statement, shown in Fig. 10.
The insert statement is fairly simple; its hierarchy diagram is shown in Fig. 11. The insert statement has only one layer, which contains a single table with 3 fields. The following uses the business_num field as an example to show how the lineage lookup is carried out.
First, the search for a field with the same name as business_num starts from the first layer of the select statement; the query result is:
sum(a.business_num) as business_num
That is, business_num derives from business_num of table a. The key to the layer-by-layer search is to check the type of the table once a matching field is found, i.e., to determine whether table a contains a subquery. The SqlTable class has an attribute isSubQuery that indicates whether the table contains a subquery. If it does not, the search stops and business_num of table a is taken to be the source field of business_num in the insert statement; if it does, the search must continue in the next layer. Because table a here contains a subquery, the search continues: among the second-layer tables whose parent is a, look for the field that matches business_num of table a. Note that the field to be matched has been replaced: it is no longer business_num of table t_district_result but business_num of table a. In other words, the source field is replaced each time the search moves down a layer. The result of the second-layer search is:
count(if(flag=3, shop_id, null)) as business_num
That is, business_num derives from the flag and shop_id fields of table t_business. Note that the flag field in the if condition is also counted as a source field of business_num, because flag is indeed used when business_num is produced. Next, it is determined whether table t_business contains a subquery. Because t_business is a physical table, the lineage parsing of business_num is complete, and its source fields are the flag and shop_id fields of table t_business.
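The observation that both flag and shop_id count as source fields can be illustrated by listing every column referenced in a select expression. The Python sketch below is a deliberately naive, regex-based approximation introduced only for illustration: the function `referenced_columns` and the keyword list are hypothetical, string literals and quoted identifiers are not handled, and a real implementation would presumably walk the abstract syntax tree instead.

```python
import re
from typing import List

# Small, non-exhaustive set of words that are never column names in this sketch.
KEYWORDS = {"null", "as", "and", "or", "not", "case", "when", "then", "else", "end", "distinct"}


def referenced_columns(expression: str) -> List[str]:
    """List the column names used in a select expression, ignoring functions and the alias."""
    # Drop a trailing "as alias" so the alias is not mistaken for a column.
    body = re.sub(r"\s*\bas\s+\w+\s*$", "", expression.strip(), flags=re.IGNORECASE)
    columns: List[str] = []
    for match in re.finditer(r"[A-Za-z_][\w.]*", body):
        token = match.group(0)
        if body[match.end():].lstrip().startswith("("):
            continue  # an identifier followed by "(" is a function name
        if token.lower() in KEYWORDS:
            continue
        if token not in columns:
            columns.append(token)
    return columns


print(referenced_columns("count(if(flag=3, shop_id, null)) as business_num"))
print(referenced_columns("sum(a.business_num) as business_num"))
```

Because the condition of the if function is scanned like any other argument, flag is reported alongside shop_id, in line with the reasoning in the example.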
The district_id and business_rate fields of table t_district_result can be parsed in the same way, finally yielding the table-level data dependencies shown in Fig. 12 and the field-level data dependencies shown in Fig. 13.
The following are device embodiments of the present disclosure, which can be used to carry out the method embodiments of the present disclosure.
Fig. 14 shows a structural block diagram of a data lineage determining device based on structured data according to an embodiment of the present disclosure. The device may be implemented as part or all of an electronic device by software, hardware, or a combination of both. As shown in Fig. 14, the data lineage determining device based on structured data includes:
a source inventory generation module 1401, configured to parse the select statement in the structured data to obtain a source abstract syntax tree, and to organize the table information and field information obtained by traversing the source abstract syntax tree layer by layer into a source inventory; the tables in the source inventory are called source tables;
a target inventory generation module 1402, configured to parse the insert statement in the structured data to obtain a target abstract syntax tree, and to organize the table information and field information obtained by traversing the target abstract syntax tree layer by layer into a target inventory; the tables in the target inventory are called target tables;
a data lineage determining module 1403, configured to traverse the source inventory to obtain source table information and traverse the target inventory to obtain target table information, yielding the data lineage at table granularity; and to take the target field information of a target table from the target inventory and, starting from the first layer of the source inventory, search the source tables layer by layer for the source field with the same name as the target field information, until the source table to which the source field belongs no longer derives from a subquery, at which point that source field is determined to be the source field that has a lineage relationship with the target field information; there is at least one item of target field information.
It should be noted that the data lineage determining device based on structured data (for example, SQL statements) provided by the above embodiment can implement the technical solutions described in the above method embodiments. For the specific implementation principles of each module or submodule, refer to the corresponding content of the method embodiments, which is not repeated here.
The present disclosure also discloses an electronic device. Fig. 15 shows a structural block diagram of the electronic device according to an embodiment of the present disclosure. As shown in Fig. 15, the electronic device 1500 includes a memory 1501 and a processor 1502, wherein
the memory 1501 is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor 1502 to implement any of the method steps described above.
Fig. 16 is a schematic structural diagram of a computer system suitable for implementing the method for determining data lineage based on structured data according to an embodiment of the present disclosure.
As shown in Fig. 16, the computer system 1600 includes a central processing unit (CPU) 1601, which can perform the various processes of the above embodiments according to a program stored in read-only memory (ROM) 1602 or a program loaded from a storage section 1608 into random access memory (RAM) 1603. The RAM 1603 also stores the programs and data required for the operation of the system 1600. The CPU 1601, the ROM 1602, and the RAM 1603 are connected to one another through a bus 1604. An input/output (I/O) interface 1605 is also connected to the bus 1604.
The following components are connected to the I/O interface 1605: an input section 1606 including a keyboard, a mouse, and the like; an output section 1607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 1608 including a hard disk and the like; and a communication section 1609 including a network interface card such as a LAN card or a modem. The communication section 1609 performs communication processing via a network such as the Internet. A drive 1610 is also connected to the I/O interface 1605 as needed. A removable medium 1611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1610 as needed, so that a computer program read from it can be installed into the storage section 1608 as needed.
In particular, according to embodiments of the present disclosure, the method described above may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for executing the method for determining the data lineage of the SQL statement. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1609, and/or installed from the removable medium 1611.
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and any combination of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units or modules may also be provided in a processor, and in some cases the names of these units or modules do not constitute a limitation on the units or modules themselves.
In another aspect, the present disclosure also provides a computer-readable storage medium. The computer-readable storage medium may be the one included in the device described in the above embodiments, or it may exist separately without being assembled into a device. The computer-readable storage medium stores one or more programs, which are used by one or more processors to execute the methods described in the present disclosure.
The above description covers only preferred embodiments of the present disclosure and explains the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features; it also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.

Claims (10)

1. A method for determining data lineage based on structured data, characterized by comprising:
parsing the select statement in the structured data to obtain a source abstract syntax tree, and organizing the table information and field information obtained by traversing the source abstract syntax tree layer by layer into a source inventory, the tables in the source inventory being called source tables;
parsing the insert statement in the structured data to obtain a target abstract syntax tree, and organizing the table information and field information obtained by traversing the target abstract syntax tree layer by layer into a target inventory, the tables in the target inventory being called target tables;
traversing the source inventory to obtain source table information and traversing the target inventory to obtain target table information, to obtain the data lineage at table granularity; and taking the target field information of a target table from the target inventory and, starting from the first layer of the source inventory, searching the source tables layer by layer for the source field with the same name as the target field information, until the source table to which the source field belongs no longer derives from a subquery, at which point that source field is determined to be the source field that has a lineage relationship with the target field information; wherein there is at least one item of target field information.
2. The method according to claim 1, characterized in that the source inventory and the target inventory each include at least one layer, each layer includes at least one table, and each table includes at least one field; the number of layers of the structured data in the source inventory or the target inventory is the sum of the number of nested subquery levels and a preset threshold; and the tables of join queries and union queries are in the same layer as the main table.
3. The method according to claim 2, characterized in that searching the source tables layer by layer, starting from the first layer of the source inventory, for the source field with the same name as the target field information of the target table, until the source table to which the source field belongs no longer derives from a subquery, and then determining that source field to be the source field that has a lineage relationship with the target field information, comprises:
matching the target field information of the target table against the fields in the source tables of the first layer of the source inventory to find a source field with the same name;
determining whether the source table to which the same-named source field belongs derives from a subquery;
if the source table to which the same-named source field belongs does not derive from a subquery, determining the same-named source field to be the source field that has a lineage relationship with the target field information;
if the source table to which the same-named source field belongs derives from a subquery, matching the same-named source field, layer by layer starting from the second layer of the source inventory, against the source fields in the source tables to find another same-named source field, until the source table to which that other source field belongs no longer derives from a subquery, at which point the corresponding source field is determined to be the source field that has a lineage relationship with the target field information.
4. The method according to any one of claims 1-3, characterized in that the name of each table in each layer of the source inventory and the target inventory includes the name information of the table to which it belongs.
5. A data lineage determining device based on structured data, characterized by comprising:
a source inventory generation module, configured to parse the select statement in the structured data to obtain a source abstract syntax tree, and to organize the table information and field information obtained by traversing the source abstract syntax tree layer by layer into a source inventory, the tables in the source inventory being called source tables;
a target inventory generation module, configured to parse the insert statement in the structured data to obtain a target abstract syntax tree, and to organize the table information and field information obtained by traversing the target abstract syntax tree layer by layer into a target inventory, the tables in the target inventory being called target tables;
a data lineage determining module, configured to traverse the source inventory to obtain source table information and traverse the target inventory to obtain target table information, to obtain the data lineage at table granularity; and to take the target field information of a target table from the target inventory and, starting from the first layer of the source inventory, search the source tables layer by layer for the source field with the same name as the target field information, until the source table to which the source field belongs no longer derives from a subquery, at which point that source field is determined to be the source field that has a lineage relationship with the target field information; wherein there is at least one item of target field information.
6. The device according to claim 5, characterized in that the source inventory and the target inventory each include at least one layer, each layer includes at least one table, and each table includes at least one field; the number of layers of the structured data in the source inventory or the target inventory is the sum of the number of nested subquery levels and a preset threshold; and the tables of join queries and union queries are in the same layer as the main table.
7. The device according to claim 6, characterized in that the data lineage determining module is further configured to:
match the target field information of the target table against the fields in the source tables of the first layer of the source inventory to find a source field with the same name;
determine whether the source table to which the same-named source field belongs derives from a subquery;
if the source table to which the same-named source field belongs does not derive from a subquery, determine the same-named source field to be the source field that has a lineage relationship with the target field information;
if the source table to which the same-named source field belongs derives from a subquery, match the same-named source field, layer by layer starting from the second layer of the source inventory, against the source fields in the source tables to find another same-named source field, until the source table to which that other source field belongs no longer derives from a subquery, at which point the corresponding source field is determined to be the source field that has a lineage relationship with the target field information.
8. The device according to any one of claims 5-7, characterized in that the name of each table in each layer of the source inventory and the target inventory includes the name information of the table to which it belongs.
9. An electronic device, characterized by comprising a memory and a processor, wherein
the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method steps of any one of claims 1-4.
10. A computer-readable storage medium storing computer instructions, characterized in that the computer instructions, when executed by a processor, implement the method steps of any one of claims 1-4.
CN201811090154.9A 2018-09-18 2018-09-18 Method and device is determined based on the data blood relationship of structured data Pending CN109325078A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811090154.9A CN109325078A (en) 2018-09-18 2018-09-18 Method and device is determined based on the data blood relationship of structured data

Publications (1)

Publication Number Publication Date
CN109325078A true CN109325078A (en) 2019-02-12

Family

ID=65266334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811090154.9A Pending CN109325078A (en) 2018-09-18 2018-09-18 Method and device is determined based on the data blood relationship of structured data

Country Status (1)

Country Link
CN (1) CN109325078A (en)

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008231A (en) * 2019-03-19 2019-07-12 福建省天奕网络科技有限公司 MySQL data retrogressive method, storage medium
CN110096513A (en) * 2019-04-10 2019-08-06 阿里巴巴集团控股有限公司 A kind of data query, fund checking method and device
CN110083639A (en) * 2019-04-25 2019-08-02 中电科嘉兴新型智慧城市科技发展有限公司 A kind of method and device that the data blood relationship based on clustering is intelligently traced to the source
CN110083639B (en) * 2019-04-25 2023-03-10 中电科嘉兴新型智慧城市科技发展有限公司 Intelligent data blood source tracing method and device based on cluster analysis
CN110442604B (en) * 2019-07-11 2022-03-11 新华三大数据技术有限公司 Data flow direction query method, data flow direction extraction method, data flow direction processing method and related devices
CN110442604A (en) * 2019-07-11 2019-11-12 新华三大数据技术有限公司 Data flow querying method, abstracting method, processing method and relevant apparatus
CN110362579A (en) * 2019-07-19 2019-10-22 联想(北京)有限公司 A kind of information processing method and electronic equipment
CN110633333A (en) * 2019-09-25 2019-12-31 京东数字科技控股有限公司 Data blood relationship processing method and system, computing device and medium
CN110908997A (en) * 2019-10-09 2020-03-24 支付宝(杭州)信息技术有限公司 Data blood margin construction method and device, server and readable storage medium
CN110889286A (en) * 2019-10-12 2020-03-17 平安科技(深圳)有限公司 Dependency relationship identification method and device based on data table and computer equipment
CN110889286B (en) * 2019-10-12 2022-04-12 平安科技(深圳)有限公司 Dependency relationship identification method and device based on data table and computer equipment
CN111046242B (en) * 2019-11-27 2023-09-26 支付宝(杭州)信息技术有限公司 Data processing method, device, equipment and medium
CN111046242A (en) * 2019-11-27 2020-04-21 支付宝(杭州)信息技术有限公司 Data processing method, device, equipment and medium
CN111078729B (en) * 2019-12-19 2023-04-28 医渡云(北京)技术有限公司 Medical data tracing method, device, system, storage medium and electronic equipment
CN111078729A (en) * 2019-12-19 2020-04-28 医渡云(北京)技术有限公司 Medical data tracing method, device, system, storage medium and electronic equipment
CN113127478A (en) * 2019-12-31 2021-07-16 奇安信科技集团股份有限公司 Method and device for analyzing blood genesis relationship in data and computer equipment
CN111338966A (en) * 2020-03-05 2020-06-26 中国银行股份有限公司 Big data processing detection method and device of data source table
CN111338966B (en) * 2020-03-05 2023-09-19 中国银行股份有限公司 Big data processing detection method and device of data source table
CN111538743B (en) * 2020-04-22 2023-08-18 电子科技大学 SQL-based data blood relationship analysis method and system
CN111538743A (en) * 2020-04-22 2020-08-14 电子科技大学 SQL-based data blood relationship analysis method and system
CN111666326A (en) * 2020-05-29 2020-09-15 中国工商银行股份有限公司 ETL scheduling method and device
CN111639143A (en) * 2020-06-05 2020-09-08 广州市玄武无线科技股份有限公司 Data blood relationship display method and device of data warehouse and electronic equipment
CN111639143B (en) * 2020-06-05 2020-12-22 广州市玄武无线科技股份有限公司 Data blood relationship display method and device of data warehouse and electronic equipment
CN111782265A (en) * 2020-06-28 2020-10-16 中国工商银行股份有限公司 Software resource system based on field level blood relationship and establishment method thereof
CN111782265B (en) * 2020-06-28 2024-02-02 中国工商银行股份有限公司 Software resource system based on field-level blood-relation and establishment method thereof
CN111538744A (en) * 2020-07-08 2020-08-14 浙江大华技术股份有限公司 Method and device for processing data blood margin
WO2022017465A1 (en) * 2020-07-24 2022-01-27 华为技术有限公司 Data lineage presentation method, apparatus, and system
CN112035508A (en) * 2020-08-27 2020-12-04 深圳天源迪科信息技术股份有限公司 SQL (structured query language) -based online metadata analysis method, system and equipment
CN112231203A (en) * 2020-09-28 2021-01-15 四川新网银行股份有限公司 Data warehouse test analysis method based on blood relationship
WO2021179722A1 (en) * 2020-10-21 2021-09-16 平安科技(深圳)有限公司 Sql statement parsing method and system, and computer device and storage medium
WO2021174945A1 (en) * 2020-10-21 2021-09-10 平安科技(深圳)有限公司 Data cost calculation method, system, computer device, and storage medium
CN112256721A (en) * 2020-10-21 2021-01-22 平安科技(深圳)有限公司 SQL statement parsing method, system, computer device and storage medium
CN112328599A (en) * 2020-11-12 2021-02-05 杭州数梦工场科技有限公司 Metadata-based field blood relationship analysis method and device
CN112783857B (en) * 2020-12-31 2023-10-20 北京知因智慧科技有限公司 Data blood-margin management method and device, electronic equipment and storage medium
CN112783857A (en) * 2020-12-31 2021-05-11 北京知因智慧科技有限公司 Data blood reason management method and device, electronic equipment and storage medium
CN112860811A (en) * 2021-02-05 2021-05-28 北京百度网讯科技有限公司 Method and device for determining data blood relationship, electronic equipment and storage medium
CN112860811B (en) * 2021-02-05 2023-07-18 北京百度网讯科技有限公司 Method and device for determining data blood relationship, electronic equipment and storage medium
CN112860812B (en) * 2021-02-09 2023-07-11 北京百度网讯科技有限公司 Method and device for non-invasively determining data field level association relation in big data
CN112860812A (en) * 2021-02-09 2021-05-28 北京百度网讯科技有限公司 Information processing method, apparatus, device, storage medium, and program product
CN113138990B (en) * 2021-05-17 2023-04-18 青岛海信网络科技股份有限公司 Data blood margin construction and tracing method, device and equipment
CN113138990A (en) * 2021-05-17 2021-07-20 青岛海信网络科技股份有限公司 Data blood margin construction and tracing method, device and equipment
CN114185958A (en) * 2021-11-18 2022-03-15 招联消费金融有限公司 Blood relationship generation method and device, computer equipment and storage medium
CN114185958B (en) * 2021-11-18 2024-04-02 招联消费金融股份有限公司 Blood relationship generation method, device, computer equipment and storage medium
CN114676678A (en) * 2022-04-08 2022-06-28 北京百度网讯科技有限公司 Structured query language data parsing method and device and electronic equipment
CN114676678B (en) * 2022-04-08 2023-10-27 北京百度网讯科技有限公司 Method and device for analyzing structured query language data and electronic equipment
CN115062049B (en) * 2022-07-28 2022-11-18 浙江城云数字科技有限公司 Data blood margin analysis method and device
CN115062049A (en) * 2022-07-28 2022-09-16 浙江城云数字科技有限公司 Data blood margin analysis method and device

Similar Documents

Publication Publication Date Title
CN109325078A (en) Method and device is determined based on the data blood relationship of structured data
US10860548B2 (en) Generating and reusing transformations for evolving schema mapping
US7464084B2 (en) Method for performing an inexact query transformation in a heterogeneous environment
Morton et al. Dynamic workload driven data integration in tableau
US8326857B2 (en) Systems and methods for providing value hierarchies, ragged hierarchies and skip-level hierarchies in a business intelligence server
Romero et al. GEM: Requirement-driven generation of ETL and multidimensional conceptual designs
US9075859B2 (en) Parameterized database drill-through
US11853363B2 (en) Data preparation using semantic roles
US11100098B2 (en) Systems and methods for providing multilingual support for data used with a business intelligence server
US10019507B2 (en) Detection and creation of appropriate row concept during automated model generation
US20070282805A1 (en) Apparatus and method for comparing metadata structures
WO2008042560A2 (en) Apparatus and method for receiving a report
US8862543B2 (en) Synchronizing primary and secondary repositories
WO2016138566A1 (en) A system and method for federated enterprise analysis
US20140365498A1 (en) Finding A Data Item Of A Plurality Of Data Items Stored In A Digital Data Storage
CN111368097A (en) Knowledge graph extraction method and device
US20160364426A1 (en) Maintenance of tags assigned to artifacts
US20220156228A1 (en) Data Tagging And Synchronisation System
KR101062655B1 (en) Metadata Management System Using Tag and Its Method
US9990415B2 (en) Data structure for representing information using expressions
Ruiz et al. Supporting organisational evolution by means of model-driven reengineering frameworks
Grander et al. Relationship Between Big Data and Decision Support Systems
Galliano The importance of data visualization tools in modern enterprises. Cost-effective solutions and empowering of an open source project.
Ren Constructing a business intelligence solution with Microsoft SQL Server 2005
Gogineni et al. Systematic design and implementation of a semantic assistance system for aero-engine design and manufacturing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190212