CN109614432A - System and method for acquiring data lineage based on syntactic analysis - Google Patents

Publication number
CN109614432A
Authority
CN
China
Prior art keywords
data
information
analysis
genetic connection
primitive operation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811483550.8A
Other languages
Chinese (zh)
Other versions
CN109614432B (en)
Inventor
苏萌
刘钰
张凯
姜楠
赵群
赵丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Percent Technology Group Co ltd
Original Assignee
Beijing Baifendian Information Science & Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baifendian Information Science & Technology Co Ltd
Priority to CN201811483550.8A
Publication of CN109614432A
Application granted
Publication of CN109614432B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a system and method for acquiring data lineage based on syntactic analysis, comprising a data lineage analysis server. The data lineage analysis server mainly consists of an original operation information input module, a framework analysis module, a lexical analysis module, a syntax analysis module, an intermediate result information generation module, a data lineage logic analysis module and a query interface. The system further comprises a lineage agent plug-in. The system and method of the invention are highly extensible and more efficient.

Description

System and method for acquiring data lineage based on syntactic analysis
Technical field
The present invention relates to the field of data processing technology for big data, and in particular to a system and method for acquiring data lineage based on syntactic analysis.
Background
Data governance is the process of moving from scattered data to unified master data, from little or no organizational and process governance to integrated, enterprise-wide data governance, and from trying to cope with chaotic master data to well-ordered master data. Metadata management is the key to successful data governance, and data lineage, as a part of metadata management, is a very important link.
In the data governance field, data lineage analysis can be discussed at two levels: relational data and big data. In the relational data field, some enterprises in traditional industries analyze data lineage manually during the data governance stage and record the results by hand in Excel spreadsheets. This approach is inefficient and does not scale. Some IT companies have developed related software: IBM has developed the data governance software InfoSphere DataStage, which includes a data lineage function, and Oracle has developed a BI software suite that also includes a data lineage function. However, these are all client software that must be installed and deployed on a PC, and they are relatively heavyweight.
For the relational data field, current lineage analysis software is client-based, so it must be installed and deployed on a PC, and the lineage analysis data is stored on the PC, which makes it hard to extend and impossible to trace. For the big data field, data lineage analysis is added to the cluster side in the form of a plug-in. However, embedding the lineage analysis logic in the cluster affects the cluster's normal performance, and the lineage analysis logic becomes coupled to the cluster modules at the business level, which hinders later extension.
The biggest problem in lineage analysis is how to combine accurate analysis with fine lineage granularity, that is, lineage that is accurate at table level and at field level. Existing industry solutions either limit the granularity of lineage analysis to table level or are not accurate enough, and they cannot analyze lineage for complex SQL relationships. How to achieve lineage analysis at field granularity, and how to ensure that the analysis covers the data lineage completely, are the main technical problems.
In the industry today, software such as Apache Atlas controls the granularity of data lineage analysis, but it depends heavily on the functional modules of the cluster, so its lineage analysis coverage is low and the ability to control lineage analysis independently is limited.
Summary of the invention
In view of the deficiencies of the prior art, the present invention aims to provide a system and method for acquiring data lineage based on syntactic analysis that is highly extensible and more efficient.
To achieve the above object, the present invention adopts the following technical solution:
A system for acquiring data lineage based on syntactic analysis, comprising a data lineage analysis server;
The data lineage analysis server mainly consists of an original operation information input module, a framework analysis module, a lexical analysis module, a syntax analysis module, an intermediate result information generation module, a data lineage logic analysis module and a query interface;
The original operation information input module is used to obtain the original operation information produced when a big data cluster performs ETL processing on data, the original operation information being in the form of a data interaction language;
The framework analysis module is used to perform framework analysis on the original operation information, including cutting the original operation information into units, classifying the units and grouping them, to generate framework analysis information;
The lexical analysis module is used to perform lexical analysis on the framework analysis information and generate the lexical unit information;
The syntax analysis module is used to perform syntactic analysis on the lexical unit information and the framework analysis information and output abstract syntax tree information;
The intermediate result information generation module is used to traverse each information node of the abstract syntax tree, identify and analyze the type of each information node, and extract key information including databases, data tables and data fields, thereby generating intermediate result information;
The data lineage logic analysis module is used to combine the lexical unit information, the abstract syntax tree information and the intermediate result information, and to perform the corresponding lineage logic analysis for each type of information node to obtain data lineage information, including data lineage information between data tables and between data fields;
The query interface is used by external terminals to access and query the data lineage information.
Further, the data interaction language includes standard SQL, the Oracle SQL dialect, the Spark SQL dialect and the Phoenix SQL dialect.
Further, the lexical unit information is stored in a lexical dictionary structure of internal data maintained inside the data lineage analysis server.
Further, the intermediate result information is stored inside the data lineage analysis server in the form of a dictionary.
Further, the data lineage information obtained is stored inside the data lineage analysis server or in an external storage system.
Further, the system also comprises a lineage agent plug-in. The lineage agent plug-in is deployed on the big data cluster side and is used to dynamically obtain the original operation information produced when the big data cluster performs ETL processing on data, and to send it, in the form of a data interaction language, to the original operation information input module of the data lineage analysis server.
Further, the lineage agent plug-in sends the original operation information to the original operation information input module in a real-time, asynchronous manner.
A method using the above system for acquiring data lineage based on syntactic analysis comprises the following steps:
S1. The original operation information input module obtains the original operation information produced when the big data cluster performs ETL processing on data, the original operation information being in the form of a data interaction language;
S2. The framework analysis module performs framework analysis on the original operation information, including cutting the original operation information into units, classifying the units and grouping them, to generate framework analysis information;
S3. The lexical analysis module performs lexical analysis on the framework analysis information and generates the lexical unit information;
S4. The syntax analysis module performs syntactic analysis on the lexical unit information and the framework analysis information and outputs abstract syntax tree information;
S5. The intermediate result information generation module traverses each information node of the abstract syntax tree, identifies and analyzes the type of each information node, and extracts key information including databases, data tables and data fields, thereby generating intermediate result information;
S6. The data lineage logic analysis module combines the lexical unit information, the abstract syntax tree information and the intermediate result information, and performs the corresponding lineage logic analysis for each type of information node to obtain data lineage information, including data lineage information between data tables and between data fields.
Further, in step S1, the lineage agent plug-in deployed on the big data cluster side dynamically obtains the original operation information produced when the big data cluster performs ETL processing on data, and sends it, in the form of a data interaction language, to the original operation information input module of the data lineage analysis server.
Further, the lineage agent plug-in sends the original operation information to the original operation information input module in a real-time, asynchronous manner.
The beneficial effects of the present invention are:
1. Because the system and method generate data lineage through framework analysis, lexical processing, syntax processing and abstract syntax tree processing, and are based on the data interaction language, they are highly extensible: they support not only standard SQL but also various dialects (such as the Oracle SQL dialect, the Spark SQL dialect and the Phoenix SQL dialect);
2. The system and method are fully decoupled from the cluster environment and have no effect on cluster performance, and their analysis of cluster operations is real-time and asynchronous, and therefore more efficient.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system structure of Embodiment 1 of the present invention;
Fig. 2 is a schematic flow chart of the method of Embodiment 2 of the present invention.
Detailed description of the embodiments
The invention is further described below with reference to the drawings. It should be noted that the embodiments are based on this technical solution and give detailed implementations and specific operating procedures, but the protection scope of the invention is not limited to these embodiments.
The technical terms used in the embodiments are first briefly explained.
Lineage (literally "blood relationship"): refers specifically to the lineage of data, which describes how data flows and how data items relate to each other during data governance. Lineage exists at table-level granularity and at field-level granularity.
Embodiment 1
This embodiment provides a system for acquiring data lineage based on syntactic analysis which, as shown in Fig. 1, comprises a data lineage analysis server;
The data lineage analysis server mainly consists of an original operation information input module, a framework analysis module, a lexical analysis module, a syntax analysis module, an intermediate result information generation module, a data lineage logic analysis module and a query interface;
The original operation information input module is used to obtain the original operation information produced when a big data cluster (such as Hive) performs ETL processing on data, the original operation information being in the form of a data interaction language;
Specifically, besides standard SQL, the data interaction language can be any of various dialects, such as the Oracle SQL dialect, the Spark SQL dialect and the Phoenix SQL dialect.
The framework analysis module is used to perform framework analysis on the original operation information, including cutting the original operation information into units, classifying the units and grouping them, to generate framework analysis information;
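The cutting, classifying and grouping described for this module can be sketched as follows. This is a minimal illustration only, assuming that a "unit" is one SQL statement and that statements are classified by their leading keyword; the patent does not specify either, and the names `split_statements`, `classify`, `frame_analysis` and the `CATEGORIES` table are hypothetical.

```python
from collections import defaultdict

# Hypothetical statement categories keyed by leading keyword (an assumption,
# not taken from the patent).
CATEGORIES = {
    "CREATE": "create_table",
    "INSERT": "insert",
    "SELECT": "query",
}


def split_statements(script):
    """Cut a script into statements on semicolons outside single-quoted strings."""
    statements, current, in_string = [], [], False
    for ch in script:
        if ch == "'":
            in_string = not in_string
        if ch == ";" and not in_string:
            text = "".join(current).strip()
            if text:
                statements.append(text)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


def classify(statement):
    """Classify one statement by its first keyword."""
    keyword = statement.split(None, 1)[0].upper()
    return CATEGORIES.get(keyword, "other")


def frame_analysis(script):
    """Group the statements of a script by category."""
    groups = defaultdict(list)
    for statement in split_statements(script):
        groups[classify(statement)].append(statement)
    return dict(groups)
```

Splitting on semicolons by hand rather than with `str.split(";")` keeps a literal such as `'x;y'` inside one statement; comments and dialect-specific delimiters are not handled in this sketch.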
The lexical analysis module is used to perform lexical analysis on the framework analysis information and generate the lexical unit information;
Specifically, the lexical unit information can be stored in a lexical dictionary structure of internal data maintained inside the data lineage analysis server, thereby forming a correspondence between each piece of original operation information and its lexical unit information.
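A lexical analysis step of this kind, together with a dictionary that maps each original statement to its lexical units, might look like the sketch below. The token kinds, the keyword set and the names `tokenize`, `analyse` and `lexical_dictionary` are assumptions made for illustration; a real implementation would need a much larger, dialect-specific token set.

```python
import re

# Hypothetical token specification covering a small SQL subset.
TOKEN_SPEC = [
    ("STRING", r"'(?:[^']|'')*'"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("SYMBOL", r"[(),.*=<>;]"),
    ("SPACE", r"\s+"),
]
KEYWORDS = {"SELECT", "FROM", "WHERE", "INSERT", "INTO",
            "CREATE", "TABLE", "AS", "JOIN", "ON"}
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(statement):
    """Return a list of (kind, text) lexical units for one statement."""
    tokens, pos = [], 0
    while pos < len(statement):
        match = TOKEN_RE.match(statement, pos)
        if match is None:
            raise ValueError(f"unexpected character {statement[pos]!r} at {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SPACE":
            continue  # whitespace carries no lexical information
        if kind == "IDENT" and text.upper() in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


# Lexical dictionary: original statement -> its lexical units.
lexical_dictionary = {}


def analyse(statement):
    """Tokenize a statement and record the result in the lexical dictionary."""
    lexical_dictionary[statement] = tokenize(statement)
    return lexical_dictionary[statement]
```

Keying the dictionary by the original statement mirrors the correspondence between original operation information and lexical unit information described above.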
The syntax analysis module is used to perform syntactic analysis on the lexical unit information and the framework analysis information and output abstract syntax tree information;
The abstract syntax tree information is a tree-structured description of the original operation information. After syntactic analysis, the data lineage analysis server already holds the basic content of the data flow information; although the data lineage result cannot yet be generated directly, the basic data needed for analysis is available.
The intermediate result information generation module is used to traverse each information node of the abstract syntax tree, identify and analyze the type of each information node, and extract key information including databases, data tables and data fields, thereby generating intermediate result information;
Specifically, the intermediate result information can be stored inside the data lineage analysis server in the form of a dictionary.
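The traversal that produces the intermediate result can be illustrated with a small sketch. The `Node` class, its `kind` values ("table", "column") and the shape of the resulting dictionary are assumptions; the patent only states that node types are identified and that databases, tables and fields are collected into a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical abstract syntax tree node."""
    kind: str                      # e.g. "insert", "select", "table", "column"
    value: str = ""
    children: list["Node"] = field(default_factory=list)


def collect_key_info(root):
    """Walk every node and gather databases, tables and fields into a dictionary."""
    info = {"databases": set(), "tables": set(), "fields": set()}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.kind == "table":
            # A qualified name such as "ods.orders" also reveals the database.
            database, _, _table = node.value.rpartition(".")
            if database:
                info["databases"].add(database)
            info["tables"].add(node.value)
        elif node.kind == "column":
            info["fields"].add(node.value)
        stack.extend(node.children)
    return info


# Tree for: INSERT INTO dw.order_sum SELECT user_id FROM ods.orders
tree = Node("insert", children=[
    Node("table", "dw.order_sum"),
    Node("select", children=[
        Node("column", "user_id"),
        Node("table", "ods.orders"),
    ]),
])
intermediate_result = collect_key_info(tree)
```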
The data lineage logic analysis module is used to combine the lexical unit information, the abstract syntax tree information and the intermediate result information, and to perform the corresponding lineage logic analysis for each type of information node to obtain data lineage information, including data lineage information between data tables and between data fields.
Specifically, the data lineage information obtained can be stored inside the data lineage analysis server and can also be stored in an external storage system, so the output is highly compatible.
The query interface is used by external terminals to access and query the data lineage information.
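As one possible reading of what a query against stored lineage could return, the sketch below walks table-level lineage edges to list everything upstream of a given table. The edge representation `(source, target)` and the function name `upstream_of` are assumptions; the patent does not describe the query interface's data model or API.

```python
from collections import deque


def upstream_of(edges, target):
    """Return all tables that feed `target`, nearest first (breadth-first)."""
    parents = {}
    for source, destination in edges:
        parents.setdefault(destination, []).append(source)
    seen, order, pending = {target}, [], deque([target])
    while pending:
        current = pending.popleft()
        for source in parents.get(current, []):
            if source not in seen:  # guard against cycles and duplicates
                seen.add(source)
                order.append(source)
                pending.append(source)
    return order


# Hypothetical stored lineage edges: (source table, target table).
edges = [
    ("ods.orders", "dw.order_detail"),
    ("ods.users", "dw.order_detail"),
    ("dw.order_detail", "ads.report"),
]
```

Calling `upstream_of(edges, "ads.report")` lists `dw.order_detail` first and then the two `ods` tables; the same traversal over reversed edges would give the downstream view.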
Further, the system also comprises a lineage agent plug-in. The lineage agent plug-in is deployed on the big data cluster side and is used to dynamically obtain the original operation information (usually operation statements) produced when the big data cluster performs ETL processing on data, and to send it, in the form of a data interaction language, to the original operation information input module of the data lineage analysis server.
Further, the lineage agent plug-in sends the original operation information to the original operation information input module in a real-time, asynchronous manner.
Specifically, lineage agent plug-ins for standard SQL or for other dialects can be developed as needed to extend language support.
It should be noted that the lineage agent plug-in is a pluggable service program that runs on the big data cluster side. It can be extended and adapted to different types of big data clusters, which makes it highly flexible. Once it has been adapted to a big data cluster, the lineage agent plug-in can be installed and deployed on the cluster side. The big data cluster carries on its ETL work normally, and the lineage agent plug-in does not affect the cluster's existing work. When the big data cluster runs an ETL task, the way it processes the data reflects the lineage flow of the data, so the lineage agent plug-in deployed on the cluster side can dynamically obtain the original operation information while the cluster performs ETL processing.
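The real-time, asynchronous hand-off from the cluster side to the analysis server can be sketched with a queue and a background thread. This is only an illustration of the decoupling idea: the class `LineageAgent`, its methods and the `send` callable (standing in for whatever transport the plug-in really uses) are hypothetical and not taken from the patent.

```python
import queue
import threading


class LineageAgent:
    """Hypothetical agent: captures statements and forwards them asynchronously."""

    def __init__(self, send):
        # `send` is an assumed callable that delivers one statement to the
        # original operation information input module.
        self._send = send
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def on_statement(self, sql):
        """Called on the cluster's job path; only enqueues, so it returns at once."""
        self._queue.put(sql)

    def _drain(self):
        while True:
            sql = self._queue.get()
            try:
                if sql is None:  # sentinel placed by close()
                    return
                self._send(sql)
            except Exception:
                # A delivery failure must never disturb the cluster's ETL work.
                pass
            finally:
                self._queue.task_done()

    def close(self):
        """Flush pending statements and stop the background thread."""
        self._queue.put(None)
        self._worker.join()
```

Because `on_statement` never waits for delivery, the ETL job is not slowed down by the lineage server, which is the property the text attributes to the plug-in.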
Embodiment 2
This embodiment provides a method using the system for acquiring data lineage based on syntactic analysis described in Embodiment 1 which, as shown in Fig. 2, comprises the following steps:
S1. The original operation information input module obtains the original operation information produced when the big data cluster performs ETL processing on data, the original operation information being in the form of a data interaction language;
S2. The framework analysis module performs framework analysis on the original operation information, including cutting the original operation information into units, classifying the units and grouping them, to generate framework analysis information;
S3. The lexical analysis module performs lexical analysis on the framework analysis information and generates the lexical unit information;
S4. The syntax analysis module performs syntactic analysis on the lexical unit information and the framework analysis information and outputs abstract syntax tree information;
S5. The intermediate result information generation module traverses each information node of the abstract syntax tree, identifies and analyzes the type of each information node, and extracts key information including databases, data tables and data fields, thereby generating intermediate result information;
S6. The data lineage logic analysis module combines the lexical unit information, the abstract syntax tree information and the intermediate result information, and performs the corresponding lineage logic analysis for each type of information node to obtain data lineage information, including data lineage information between data tables and between data fields.
Further, the detailed process of step S6 is:
S6.1. Taking the abstract syntax tree information as input, traverse level by level from the root node of the syntax tree until the bottom-level leaf nodes are found, then go to step S6.2;
The overall analysis of the abstract syntax tree first locates the leaf nodes from the top down and then backtracks from the bottom up.
S6.2. Combining the lexical unit information, classify the information type of each leaf node; each type corresponds to its own algorithmic processing logic. The types include the create-table type, the condition type, the input table type, the output table type and the input/output field type. For example, a create-table leaf node is handled by its own processing logic, and a condition leaf node by its own.
S6.3. According to the leaf node classification from step S6.2, the leaf nodes of create-table type or output table type are analyzed together with the intermediate result information to obtain the downstream information of the data lineage;
S6.4. According to the leaf node classification from step S6.2, the leaf nodes of input table type are analyzed together with the intermediate result information to obtain the upstream information of the data lineage;
S6.5. According to the leaf node classification from step S6.2, the leaf nodes of input/output field type are analyzed together with the intermediate result information to obtain the field information of the data lineage;
S6.6. The downstream information, upstream information and field information of the data lineage generated in steps S6.3, S6.4 and S6.5 are processed logically and integrated into the final data lineage information.
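Steps S6.1 to S6.6 can be sketched as below. This is a simplified illustration under stated assumptions: the `Node` class and the leaf kinds (`create_table`, `output_table`, `input_table`, `io_field`, `condition`) are hypothetical names for the five leaf types, field leaves are assumed to already carry a (source field, target field) pair, and the use of the intermediate result information (for example to resolve aliases) is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical abstract syntax tree node."""
    kind: str
    value: object = None
    children: list = field(default_factory=list)


def leaves(node):
    """S6.1: descend from the root until the bottom-level leaf nodes are reached."""
    if not node.children:
        yield node
    for child in node.children:
        yield from leaves(child)


def analyse_lineage(root):
    downstream, upstream, field_edges = [], [], []
    for leaf in leaves(root):
        # S6.2: dispatch on the leaf's information type.
        if leaf.kind in ("create_table", "output_table"):
            downstream.append(leaf.value)      # S6.3: downstream tables
        elif leaf.kind == "input_table":
            upstream.append(leaf.value)        # S6.4: upstream tables
        elif leaf.kind == "io_field":
            field_edges.append(leaf.value)     # S6.5: (source field, target field)
        # "condition" leaves move no data and are skipped in this sketch.
    # S6.6: combine the parts into table-level and field-level lineage.
    table_edges = [(src, dst) for dst in downstream for src in upstream]
    return {"tables": table_edges, "fields": field_edges}


# Tree for: INSERT INTO dw.user_total SELECT user_id FROM ods.orders WHERE amount > 0
tree = Node("insert", children=[
    Node("output_table", "dw.user_total"),
    Node("select", children=[
        Node("io_field", ("ods.orders.user_id", "dw.user_total.user_id")),
        Node("input_table", "ods.orders"),
        Node("where", children=[Node("condition", "amount > 0")]),
    ]),
])
lineage = analyse_lineage(tree)
```

For this single statement the result contains one table edge from `ods.orders` to `dw.user_total` and one field edge between the two `user_id` fields. Pairing every upstream table with every downstream table is only valid per statement, so a real implementation would run this once per parsed statement.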
Further, the data lineage information obtained in step S6 is stored inside the data lineage analysis server or in an external storage system, and is accessed and queried through the query interface.
Further, in step S1, the lineage agent plug-in deployed on the big data cluster side dynamically obtains the original operation information produced when the big data cluster performs ETL processing on data, and sends it, in the form of a data interaction language, to the original operation information input module of the data lineage analysis server.
Further, the lineage agent plug-in sends the original operation information to the original operation information input module in a real-time, asynchronous manner.
For those skilled in the art, various corresponding changes and modifications can be made based on the above technical solution and concept, and all such changes and modifications shall fall within the protection scope of the claims of the present invention.

Claims (10)

1. A system for acquiring data lineage based on syntactic analysis, characterized by comprising a data lineage analysis server;
The data lineage analysis server mainly consists of an original operation information input module, a framework analysis module, a lexical analysis module, a syntax analysis module, an intermediate result information generation module, a data lineage logic analysis module and a query interface;
The original operation information input module is used to obtain the original operation information produced when a big data cluster performs ETL processing on data, the original operation information being in the form of a data interaction language;
The framework analysis module is used to perform framework analysis on the original operation information, including cutting the original operation information into units, classifying the units and grouping them, to generate framework analysis information;
The lexical analysis module is used to perform lexical analysis on the framework analysis information and generate the lexical unit information;
The syntax analysis module is used to perform syntactic analysis on the lexical unit information and the framework analysis information and output abstract syntax tree information;
The intermediate result information generation module is used to traverse each information node of the abstract syntax tree, identify and analyze the type of each information node, and extract key information including databases, data tables and data fields, thereby generating intermediate result information;
The data lineage logic analysis module is used to combine the lexical unit information, the abstract syntax tree information and the intermediate result information, and to perform the corresponding lineage logic analysis for each type of information node to obtain data lineage information, including data lineage information between data tables and between data fields;
The query interface is used by external terminals to access and query the data lineage information.
2. The system for acquiring data lineage based on syntactic analysis according to claim 1, characterized in that the data interaction language includes standard SQL, the Oracle SQL dialect, the Spark SQL dialect and the Phoenix SQL dialect.
3. The system for acquiring data lineage based on syntactic analysis according to claim 1, characterized in that the lexical unit information is stored in a lexical dictionary structure of internal data maintained inside the data lineage analysis server.
4. The system for acquiring data lineage based on syntactic analysis according to claim 1, characterized in that the intermediate result information is stored inside the data lineage analysis server in the form of a dictionary.
5. The system for acquiring data lineage based on syntactic analysis according to claim 1, characterized in that the data lineage information obtained is stored inside the data lineage analysis server or in an external storage system.
6. The system for acquiring data lineage based on syntactic analysis according to claim 1, characterized in that the system further comprises a lineage agent plug-in, the lineage agent plug-in being deployed on the big data cluster side and used to dynamically obtain the original operation information produced when the big data cluster performs ETL processing on data, and to send it, in the form of a data interaction language, to the original operation information input module of the data lineage analysis server.
7. The system for acquiring data lineage based on syntactic analysis according to claim 6, characterized in that the lineage agent plug-in sends the original operation information to the original operation information input module in a real-time, asynchronous manner.
8. A method using the system for acquiring data lineage based on syntactic analysis according to any one of the preceding claims, characterized by comprising the following steps:
S1. The original operation information input module obtains the original operation information produced when the big data cluster performs ETL processing on data, the original operation information being in the form of a data interaction language;
S2. The framework analysis module performs framework analysis on the original operation information, including cutting the original operation information into units, classifying the units and grouping them, to generate framework analysis information;
S3. The lexical analysis module performs lexical analysis on the framework analysis information and generates the lexical unit information;
S4. The syntax analysis module performs syntactic analysis on the lexical unit information and the framework analysis information and outputs abstract syntax tree information;
S5. The intermediate result information generation module traverses each information node of the abstract syntax tree, identifies and analyzes the type of each information node, and extracts key information including databases, data tables and data fields, thereby generating intermediate result information;
S6. The data lineage logic analysis module combines the lexical unit information, the abstract syntax tree information and the intermediate result information, and performs the corresponding lineage logic analysis for each type of information node to obtain data lineage information, including data lineage information between data tables and between data fields.
9. The method according to claim 8, characterized in that, in step S1, the lineage agent plug-in deployed on the big data cluster side dynamically obtains the original operation information produced when the big data cluster performs ETL processing on data, and sends it, in the form of a data interaction language, to the original operation information input module of the data lineage analysis server.
10. The method according to claim 9, characterized in that the lineage agent plug-in sends the original operation information to the original operation information input module in a real-time, asynchronous manner.
CN201811483550.8A 2018-12-05 2018-12-05 System and method for acquiring data blood relationship based on syntactic analysis Active CN109614432B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811483550.8A CN109614432B (en) 2018-12-05 2018-12-05 System and method for acquiring data blood relationship based on syntactic analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811483550.8A CN109614432B (en) 2018-12-05 2018-12-05 System and method for acquiring data blood relationship based on syntactic analysis

Publications (2)

Publication Number Publication Date
CN109614432A true CN109614432A (en) 2019-04-12
CN109614432B CN109614432B (en) 2021-01-05

Family

ID=66007192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811483550.8A Active CN109614432B (en) 2018-12-05 2018-12-05 System and method for acquiring data blood relationship based on syntactic analysis

Country Status (1)

Country Link
CN (1) CN109614432B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083639A (en) * 2019-04-25 2019-08-02 中电科嘉兴新型智慧城市科技发展有限公司 A kind of method and device that the data blood relationship based on clustering is intelligently traced to the source
CN110232056A (en) * 2019-05-21 2019-09-13 苏宁云计算有限公司 A kind of the blood relationship analytic method and its tool of structured query language
CN110555032A (en) * 2019-09-09 2019-12-10 北京搜狐新媒体信息技术有限公司 Data blood relationship analysis method and system based on metadata
CN111143403A (en) * 2019-12-10 2020-05-12 跬云(上海)信息科技有限公司 SQL conversion method and device and storage medium
CN111143390A (en) * 2019-12-30 2020-05-12 北京每日优鲜电子商务有限公司 Method and device for updating metadata
CN111427997A (en) * 2020-03-09 2020-07-17 北京明略软件系统有限公司 Method and device for displaying blood relationship, computer storage medium and terminal
CN111538744A (en) * 2020-07-08 2020-08-14 浙江大华技术股份有限公司 Method and device for processing data blood margin
CN112860585A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Test script assertion generation method and device
CN112860812A (en) * 2021-02-09 2021-05-28 北京百度网讯科技有限公司 Information processing method, apparatus, device, storage medium, and program product
CN113326286A (en) * 2021-08-03 2021-08-31 杭州量之智能科技有限公司 Semantic analysis method supporting dialect SQL blood margin analysis
CN116932831A (en) * 2023-09-14 2023-10-24 北京滴普科技有限公司 Method and device for constructing data blood-lineage diagram

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140108433A1 (en) * 2012-10-12 2014-04-17 Watson Manwaring Conner Ordered Access Of Interrelated Data Files
CN103902653A (en) * 2014-02-28 2014-07-02 珠海多玩信息技术有限公司 Method and device for creating data warehouse table blood relationship graph
CN104216888A (en) * 2013-05-30 2014-12-17 中国电信股份有限公司 Data processing task relation setting method and system
CN104899314A (en) * 2015-06-17 2015-09-09 北京京东尚科信息技术有限公司 Pedigree analysis method and device of data warehouse
CN107180053A (en) * 2016-03-11 2017-09-19 中国移动通信集团河北有限公司 A kind of data warehouse optimization method and device
CN107545030A (en) * 2017-07-17 2018-01-05 阿里巴巴集团控股有限公司 Processing method, device and the equipment of data genetic connection
CN107644073A (en) * 2017-09-18 2018-01-30 广东中标数据科技股份有限公司 A kind of field consanguinity analysis method, system and device based on depth-first traversal
CN108694195A (en) * 2017-04-10 2018-10-23 腾讯科技(深圳)有限公司 A kind of management method and system of Distributed Data Warehouse

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083639A (en) * 2019-04-25 2019-08-02 中电科嘉兴新型智慧城市科技发展有限公司 A kind of method and device that the data blood relationship based on clustering is intelligently traced to the source
CN110083639B (en) * 2019-04-25 2023-03-10 中电科嘉兴新型智慧城市科技发展有限公司 Intelligent data blood source tracing method and device based on cluster analysis
CN110232056A (en) * 2019-05-21 2019-09-13 苏宁云计算有限公司 A kind of the blood relationship analytic method and its tool of structured query language
CN110232056B (en) * 2019-05-21 2022-02-25 苏宁云计算有限公司 Blood margin analysis method and tool of structured query language
CN110555032A (en) * 2019-09-09 2019-12-10 北京搜狐新媒体信息技术有限公司 Data blood relationship analysis method and system based on metadata
CN111143403B (en) * 2019-12-10 2021-05-14 跬云(上海)信息科技有限公司 SQL conversion method and device and storage medium
CN111143403A (en) * 2019-12-10 2020-05-12 跬云(上海)信息科技有限公司 SQL conversion method and device and storage medium
CN111143390A (en) * 2019-12-30 2020-05-12 北京每日优鲜电子商务有限公司 Method and device for updating metadata
CN111427997A (en) * 2020-03-09 2020-07-17 北京明略软件系统有限公司 Method and device for displaying blood relationship, computer storage medium and terminal
CN111538744A (en) * 2020-07-08 2020-08-14 浙江大华技术股份有限公司 Method and device for processing data blood margin
CN112860812A (en) * 2021-02-09 2021-05-28 北京百度网讯科技有限公司 Information processing method, apparatus, device, storage medium, and program product
CN112860812B (en) * 2021-02-09 2023-07-11 北京百度网讯科技有限公司 Method and device for non-invasively determining data field level association relation in big data
CN112860585A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Test script assertion generation method and device
CN112860585B (en) * 2021-03-31 2024-01-26 中国工商银行股份有限公司 Test script assertion generation method and device
CN113326286A (en) * 2021-08-03 2021-08-31 杭州量之智能科技有限公司 Semantic analysis method supporting dialect SQL blood margin analysis
CN116932831A (en) * 2023-09-14 2023-10-24 北京滴普科技有限公司 Method and device for constructing data blood-lineage diagram
CN116932831B (en) * 2023-09-14 2023-12-26 北京滴普科技有限公司 Method and device for constructing data blood-lineage diagram

Also Published As

Publication number Publication date
CN109614432B (en) 2021-01-05

Similar Documents

Publication Publication Date Title
CN109614432A (en) A kind of system and method for the acquisition data genetic connection based on syntactic analysis
CN105868204B (en) A kind of method and device for converting Oracle scripting language SQL
US7856438B2 (en) Device and method for semantic analysis of documents by construction of n-ary semantic trees
JP5746286B2 (en) High-performance data metatagging and data indexing method and system using a coprocessor
ES2300046T3 (en) VOICE AND TEXT ANALYSIS DEVICE, AND CORRESPONDING PROCEDURE.
CN108874878A (en) A kind of building system and method for knowledge mapping
CN105138864B (en) Protein interactive relation data base construction method based on Biomedical literature
CN112860872A (en) Self-learning-based method and system for verifying semantic compliance of power distribution network operation tickets
DE102013003055A1 (en) Method and apparatus for performing natural language searches
JP6088091B1 (en) Update apparatus, update method, and update program
CN105608232A (en) Bug knowledge modeling method based on graphic database
CN105912594A (en) SQL sentence processing method and system
CN110188359B (en) Text entity extraction method
CN106104524A (en) Complex predicate template collection device and be used for its computer program
CN102402627A (en) System and method for real-time intelligent capturing of article
CN108665141A (en) A method of extracting emergency response procedural model automatically from accident prediction scheme
EP4191484A1 (en) Automatic machine learning data modelling in a low-latency data access and analysis system
CN114218218A (en) Data processing method, device and equipment based on data warehouse and storage medium
CN114528312A (en) Method and device for generating structured query language statement
Bavota et al. The role of artefact corpus in lsi-based traceability recovery
CN108228787A (en) According to the method and apparatus of multistage classification processing information
CN110321556A (en) A kind of method and its system of doctor's diagnosis and treatment medical insurance control expense intelligent recommendation scheme
CN110569372B (en) Construction method of heart disease big data knowledge graph system
Faralli et al. Multiple knowledge graphdb (mkgdb)
CN107273525A (en) Functional expression querying method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100081 No.101, 1st floor, building 14, 27 Jiancai Chengzhong Road, Haidian District, Beijing

Patentee after: Beijing PERCENT Technology Group Co.,Ltd.

Address before: 100081 16 / F, block a, Beichen Century Center, building 2, courtyard 8, Beichen West Road, Chaoyang District, Beijing

Patentee before: BEIJING BAIFENDIAN INFORMATION SCIENCE & TECHNOLOGY Co.,Ltd.
