CN110083639A - Method and device for intelligent data lineage tracing based on cluster analysis - Google Patents

Publication number
CN110083639A
Authority
CN
China
Prior art keywords
data
field
genetic connection
source
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910337129.4A
Other languages
Chinese (zh)
Other versions
CN110083639B (en)
Inventor
王鹏
陈昊
于会游
姜玉峰
滕姿
李栋
杜浩
饶定远
唐丽娜
靳翼
闵圣捷
陈丽婷
童昊
许亚洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CLP SECTION HUAYUN INFORMATION TECHNOLOGY Co Ltd
Clp Jiaxing New Intelligent City Science And Technology Development Co Ltd
Original Assignee
CLP SECTION HUAYUN INFORMATION TECHNOLOGY Co Ltd
Clp Jiaxing New Intelligent City Science And Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CLP SECTION HUAYUN INFORMATION TECHNOLOGY Co Ltd and Clp Jiaxing New Intelligent City Science And Technology Development Co Ltd
Priority to CN201910337129.4A (patent CN110083639B)
Publication of CN110083639A
Application granted
Publication of CN110083639B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method for intelligent data lineage tracing based on cluster analysis, comprising the following steps: step 1, reading table structures and data, and deriving the data features of each field by data engineering means; step 2, taking the field as the unit and the field's data feature set as its features, learning from the data samples with a cluster analysis algorithm from machine learning; step 3, repeating the cluster analysis of step 2 until the optimal number of clusters and the optimal clustering are found; step 4, under the optimal clustering, automatically identifying the data fields in the same cluster as fields that may have a lineage relationship; step 5, for each lineage relationship, inferring its direction, that is, which field is the source and which is the target, from the order of the creation times of the tables the relationship points to, and marking the relationship as invalid if its fields come from the same table; step 6, computing the table-level lineage relationships from the valid field-level lineage relationships.

Description

A method and device for intelligent data lineage tracing based on cluster analysis
Technical field
The invention belongs to the technical field of big data, and in particular relates to a method and device for intelligent data lineage tracing based on cluster analysis.
Background
With the development and spread of big data and machine learning technologies, the volume of data that data analysis software uses, manages and generates keeps growing, and the software depends more and more on the format, content and quantity of that data. Almost all data must go through extraction, cleaning, conversion, desensitization and similar operations before an analysis system runs. The complexity of these operations means that data processing involves many steps, long pipelines and complicated methods. Only by tracing data lineage backwards can one judge the credibility of data, analyze its influence, and analyze and handle the sources of erroneous data. Establishing data lineage chains has therefore become an important issue for big data technology in supporting business applications and system maintenance.
Existing data lineage management techniques have various defects. Traditionally, data lineage is recorded by manual entry, which is inefficient, costly and error-prone. Moreover, because data is processed frequently, the lineage must be constantly updated by hand, so neither the accuracy nor the timeliness of maintenance can be guaranteed. As data volume grows, manual maintenance becomes unsustainable.
To address the defects of manual maintenance, automatic lineage analysis techniques based on a data dictionary were developed, which analyze the matching relationships among database fields according to the data dictionary. These techniques, however, require a complete data dictionary to be built in advance. They suit traditional information management systems well, but as new kinds of data analysis business develop, building a data dictionary for every intermediate result along the analysis chain becomes very difficult and too costly, so the approach fits only specific business scenarios and is poorly reusable. Furthermore, even with a very complete dictionary, the technique can analyze lineage only for data defined in the dictionary, and is hard to adapt to a wide range of application scenarios. Overall, the technique is difficult, costly, and poor in scalability and adaptability, and falls short of the demands of new data analysis tools.
There are also data management tools, such as Atlas, that record data lineage through database plug-ins. By design they can record lineage only within a single data store; lineage across databases, or even across database types (for example, a data conversion from MySQL to Oracle), cannot be tracked. In addition, plug-in-based methods must record the lineage at the moment the data is generated; if recording fails at that time (for example, because the network is temporarily unreachable), the lineage cannot be recovered afterwards. Plug-ins also affect the performance of the software's data access, so some software with strict real-time requirements cannot use them. Finally, these methods cannot handle field-level lineage, so the granularity of lineage tracing is insufficient.
Summary of the invention
The present invention provides a method and device for intelligent data lineage tracing based on cluster analysis, to overcome the drawbacks of the data tracing methods of the prior art.
In one embodiment of the present invention, a method for intelligent data lineage tracing based on cluster analysis comprises the following steps:
Step 1, reading table structures and data, and deriving the data features of each field by data engineering means;
Step 2, taking the field as the unit and the field's data feature set as its features, learning from the data samples with a cluster analysis algorithm from machine learning;
Step 3, repeating the cluster analysis of step 2 until the optimal number of clusters and the optimal clustering are found;
Step 4, under the optimal clustering, automatically identifying the data fields in the same cluster as fields that may have a lineage relationship;
Step 5, for each lineage relationship, inferring its direction, that is, which field is the source and which is the target, from the order of the creation times of the tables the relationship points to; if the fields of the relationship come from the same table, marking the relationship as invalid;
Step 6, computing the table-level lineage relationships from the valid field-level lineage relationships;
Step 7, checking the data of each table in the relationship chain, correcting the lineage relationships, and adjusting the criterion for finding the optimal number of clusters accordingly.
The beneficial effects of the present invention include:
1. Fully automated analysis, establishment and maintenance of data lineage across the complex processing pipelines of massive data.
2. The lineage analysis is based on machine learning algorithms and does not depend on definitions of the data prepared in advance, so it can adapt to various data types and businesses. Nor does it depend on the data processing flow: lineage can be created while data is being processed, or by analyzing historical data.
3. The invention supports tracing at both field level and table level. Moreover, because the machine learning algorithm used is unsupervised, lineage analysis can be carried out without labeled sample data, and after the system completes the clustering, manual intervention can further improve the accuracy and efficiency of the algorithm.
Description of the drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become easier to understand by reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the invention are shown by way of example and not limitation, in which:
Fig. 1 is a flow diagram of the method for intelligent data lineage tracing based on cluster analysis according to one embodiment of the present invention.
Detailed description of the embodiments
According to one or more embodiments, as shown in Fig. 1, a method for intelligent data lineage tracing based on cluster analysis comprises the following steps:
Step 1: reading table structures and data, and deriving the data features of each field by data engineering means, specifically as follows:
Step 1.1: parsing the data characteristics of the raw data into structured sample data, including field type, field length, field content pattern, and so on;
Step 1.2: combining the features already present in the sample data to form high-dimensional features;
Step 1.3: analyzing the high-dimensional features, forming new dimensions, and ranking the new dimensions by influence;
Step 1.4: reducing the dimensionality of the sample data according to the new dimensions, using the smallest number of dimensions that keeps the distortion rate of the sample data below a set value;
Step 1.5: normalizing the sample data in the new dimensions.
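Steps 1.1 and 1.5 can be illustrated with a minimal Python sketch. The specific features (`numeric_ratio`, `mean_length`, `max_length`, `distinct_ratio`, `pattern_count`), the pattern-collapsing rule and the min-max scaling are assumptions chosen for illustration; the text names only field type, field length and field content pattern, and does not say which normalization is used. The feature combination and dimensionality reduction of steps 1.2 to 1.4 are left out of this sketch.

```python
import re
from typing import Dict, List


def content_pattern(value: str) -> str:
    """Collapse a value into a coarse pattern: digits -> 9, letters -> A."""
    pattern = re.sub(r"[0-9]", "9", value)
    pattern = re.sub(r"[A-Za-z]", "A", pattern)
    # Squeeze repeated characters so "2019-04-25" and "1999-1-2" share "9-9-9".
    return re.sub(r"(.)\1+", r"\1", pattern)


def field_features(values: List[str]) -> Dict[str, float]:
    """Raw (un-normalized) features of one field; the feature set is hypothetical."""
    non_empty = [v for v in values if v != ""]
    n = max(len(non_empty), 1)
    lengths = [len(v) for v in non_empty]
    return {
        "numeric_ratio": sum(v.isdigit() for v in non_empty) / n,
        "mean_length": sum(lengths) / n,
        "max_length": float(max(lengths, default=0)),
        "distinct_ratio": len(set(non_empty)) / n,
        "pattern_count": float(len({content_pattern(v) for v in non_empty})),
    }


def min_max_normalize(rows: List[Dict[str, float]]) -> List[List[float]]:
    """Scale every feature to [0, 1] across all fields (one choice for step 1.5)."""
    keys = sorted(rows[0])  # fixed feature order for the output vectors
    out: List[List[float]] = [[] for _ in rows]
    for key in keys:
        column = [row[key] for row in rows]
        low, high = min(column), max(column)
        span = high - low
        for i, value in enumerate(column):
            # A constant feature carries no information; map it to 0.0.
            out[i].append((value - low) / span if span else 0.0)
    return out
```

Each field yields one normalized feature vector, which is the kind of sample a clustering algorithm in step 2 would take as input.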
Step 2: taking the field as the unit and the field's data feature set as its features, learning from the data samples with a cluster analysis algorithm from machine learning;
Step 3: repeating the cluster analysis of step 2 several times to find the optimal number of clusters and the optimal clustering, specifically as follows:
Step 3.1: setting the number of clusters to M (M is initially 1, i.e. all data belong to one cluster) and executing step 2 to obtain the corresponding loss value, which is the maximum loss value of the system;
Step 3.2: setting the number of clusters to N (N is initially the number of data tables minus one, i.e. apart from the two most similar tables, every other table belongs to a cluster of its own) and executing step 2 to obtain the corresponding minimum loss value;
Step 3.3: setting the number of clusters to the arithmetic mean (M+N)/2 of the numbers used in step 3.1 and step 3.2, and executing step 2 to obtain the loss value T;
Step 3.4: if the loss value T is greater than the target loss value, setting M=(M+N)/2 and repeating steps 3.1 to 3.3;
Step 3.5: if the loss value T is less than the target loss value, setting N=(M+N)/2 and repeating steps 3.1 to 3.3;
Step 3.6: if the loss value T is approximately equal to the target loss value, recording the current number of clusters as the optimal number of clusters and the current clustering as the optimal clustering.
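The search in step 3 is a bisection over the number of clusters. The sketch below is one possible reading in Python, not the patented implementation: the text does not name the clustering algorithm or the loss, so a small k-means with a sum-of-squared-distances loss stands in for step 2. The names `kmeans_loss`, `find_cluster_count`, `tolerance` and `max_clusters`, the evenly spaced initial centers, and the fallback used when no midpoint lands within the tolerance are all assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_loss(points: List[Point], k: int, iters: int = 20) -> Tuple[float, List[int]]:
    """Tiny k-means; returns (sum of squared distances, cluster label per point)."""
    n = len(points)
    # Deterministic start: k evenly spaced points serve as initial centers (assumption).
    centers = [list(points[i * n // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: sq_dist(p, centers[c]))
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    loss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return loss, labels


def find_cluster_count(
    points: List[Point],
    target_loss: float,
    tolerance: float,
    max_clusters: int,
    cluster: Callable[[List[Point], int], Tuple[float, List[int]]] = kmeans_loss,
) -> Tuple[int, List[int]]:
    """Bisect the cluster count between M=1 and N=max_clusters until the loss
    is within `tolerance` of `target_loss`. Assumes loss shrinks as k grows."""
    low, high = 1, max(max_clusters, 1)  # M and N in the text
    while high - low > 1:
        mid = (low + high) // 2
        loss, labels = cluster(points, mid)
        if abs(loss - target_loss) <= tolerance:
            return mid, labels  # loss approximately equals the target
        if loss > target_loss:
            low = mid   # too coarse: raise the lower bound M
        else:
            high = mid  # too fine: lower the upper bound N
    # Assumed fallback: no midpoint hit the tolerance, so use the finer bound.
    _, labels = cluster(points, high)
    return high, labels
```

For example, `find_cluster_count(points, target_loss=0.0, tolerance=0.1, max_clusters=5)` on six one-dimensional points forming three tight pairs stops at three clusters. Because k-means can settle in local optima, the monotonic-loss assumption holds only approximately in practice.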
Step 4: under the optimal clustering, automatically identifying the data fields in the same cluster as fields that may have a lineage relationship;
Step 5: for each lineage relationship, inferring its direction, that is, which field is the source and which is the target, from the order of the creation times of the tables the relationship points to. If the fields of the relationship come from the same table, the relationship is marked as invalid.
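Steps 4 and 5 can be sketched as follows in Python. The types `Field` and `Lineage`, the function `infer_lineage`, and the choice to pair every two fields within a cluster are illustrative assumptions; the text does not say how candidate pairs are enumerated, nor how ties in creation time are resolved (here the first field of the pair is treated as the source).

```python
from datetime import datetime
from itertools import combinations
from typing import Dict, List, NamedTuple


class Field(NamedTuple):
    table: str
    name: str


class Lineage(NamedTuple):
    source: Field
    target: Field
    valid: bool


def infer_lineage(labels: Dict[Field, int], created: Dict[str, datetime]) -> List[Lineage]:
    """Pair up fields that share a cluster and orient each pair by table creation time."""
    clusters: Dict[int, List[Field]] = {}
    for field, label in labels.items():
        clusters.setdefault(label, []).append(field)

    result: List[Lineage] = []
    for members in clusters.values():
        for a, b in combinations(members, 2):
            if a.table == b.table:
                # Fields of the same table: mark the relationship invalid.
                result.append(Lineage(a, b, valid=False))
                continue
            # The field in the earlier-created table is taken as the source.
            src, dst = (a, b) if created[a.table] <= created[b.table] else (b, a)
            result.append(Lineage(src, dst, valid=True))
    return result
```

Only the relationships with `valid=True` would feed the table-level computation of step 6.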
Step 6: computing the table-level lineage relationships from the valid field-level lineage relationships, specifically as follows:
Step 6.1: for a valid field-level lineage relationship, if there is not yet any direct or indirect lineage relationship between the two tables concerned, recording a direct lineage relationship between the two tables;
Step 6.2: recording an indirect lineage relationship between all tables that have a direct or indirect lineage relationship with these two tables;
Step 6.3: repeating step 6.1 and step 6.2 until all field-level lineage relationships have been processed;
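One way to read step 6 is as an incremental transitive closure over table-to-table edges. The Python sketch below follows that reading under stated assumptions: relationships are treated as directed (source table to target table), and `table_lineage` and `Edge` are hypothetical names. The text itself does not say whether indirect relationships are directed.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # (source_table, target_table)


def table_lineage(field_edges: Iterable[Edge]) -> Tuple[Set[Edge], Set[Edge]]:
    """Derive (direct, indirect) table-level lineage from valid field-level edges."""
    direct: Set[Edge] = set()
    reach: Set[Edge] = set()  # every direct or indirect relationship found so far
    for src, dst in field_edges:
        # Step 6.1: skip the pair if the two tables are already related.
        if (src, dst) in reach or (dst, src) in reach:
            continue
        direct.add((src, dst))
        # Step 6.2: tables upstream of src now also reach tables downstream of dst.
        ups = {a for a, b in reach if b == src} | {src}
        downs = {b for a, b in reach if a == dst} | {dst}
        reach |= {(u, d) for u in ups for d in downs if u != d}
    return direct, reach - direct
```

With field-level edges a→b and b→c, the sketch records both as direct and derives a→c as indirect; a later field-level edge a→c is then skipped because the two tables are already related.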
Step 7: for the table-level lineage relationships inferred by the algorithm, allowing a person to check the data of each table in the relationship chain and, where necessary, to correct the final lineage relationships manually;
Step 8: for lineage relationships that have been manually corrected, adjusting the criterion for finding the optimal number of clusters according to the kind of correction, specifically as follows:
Step 8.1: where an inferred lineage relationship was deleted manually, i.e. a person confirmed that there is no lineage relationship between the two tables, appropriately increasing the target loss value used to find the optimal number of clusters;
Step 8.2: where a new lineage relationship was created manually, i.e. a person confirmed that there is a lineage relationship between the two tables, appropriately reducing the target loss value.
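The feedback rule of step 8 can be sketched as a small update function in Python. The text says only that the target loss value is increased or reduced "appropriately"; the multiplicative update and the 5% default `step` below are assumptions, as is the name `adjust_target_loss`.

```python
def adjust_target_loss(target_loss: float, deleted: int, created: int,
                       step: float = 0.05) -> float:
    """Return the target loss after manual review.

    deleted: inferred relationships removed by a reviewer (step 8.1, raises the target)
    created: relationships added by a reviewer (step 8.2, lowers the target)
    step:    relative change per correction (hypothetical value)
    """
    factor = (1 + step) ** deleted * (1 - step) ** created
    return target_loss * factor
```

The adjusted value would then be used as the target loss the next time the search of step 3 is run.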
According to one or more embodiments, a device for intelligent data lineage tracing based on cluster analysis comprises a memory and a processor coupled to the memory. The processor is configured to execute instructions stored in the memory, and performs the following operations:
Step 1, reading table structures and data, and deriving the data features of each field by data engineering means;
Step 2, taking the field as the unit and the field's data feature set as its features, learning from the data samples with a cluster analysis algorithm from machine learning;
Step 3, repeating the cluster analysis of step 2 until the optimal number of clusters and the optimal clustering are found;
Step 4, under the optimal clustering, automatically identifying the data fields in the same cluster as fields that may have a lineage relationship;
Step 5, for each lineage relationship, inferring its direction, that is, which field is the source and which is the target, from the order of the creation times of the tables the relationship points to; if the fields of the relationship come from the same table, marking the relationship as invalid;
Step 6, computing the table-level lineage relationships from the valid field-level lineage relationships.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiments of the present invention.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied as a software product. The software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
The above is merely a specific embodiment of the present invention, but the scope of protection is not limited to it. Any equivalent modification or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within its scope of protection. The scope of protection of the present invention shall therefore be determined by the claims.

Claims (6)

1. A method for intelligent data lineage tracing based on cluster analysis, characterized by comprising the following steps:
Step 1, reading table structures and data, and deriving the data features of each field by data engineering means;
Step 2, taking the field as the unit and the field's data feature set as its features, learning from the data samples with a cluster analysis algorithm from machine learning;
Step 3, repeating the cluster analysis of step 2 until the optimal number of clusters and the optimal clustering are found;
Step 4, under the optimal clustering, automatically identifying the data fields in the same cluster as fields that may have a lineage relationship;
Step 5, for each lineage relationship, inferring its direction, that is, which field is the source and which is the target, from the order of the creation times of the tables the relationship points to; if the fields of the relationship come from the same table, marking the relationship as invalid;
Step 6, computing the table-level lineage relationships from the valid field-level lineage relationships.
2. The method for intelligent data lineage tracing based on cluster analysis according to claim 1, characterized by further comprising:
checking the data of each table in the relationship chain, correcting the lineage relationships, and adjusting the criterion for finding the optimal number of clusters.
3. The method for intelligent data lineage tracing based on cluster analysis according to claim 1, characterized in that step 1 further comprises:
Step 1.1, parsing the data characteristics of the raw data into structured sample data, including field type, field length and field content pattern;
Step 1.2, combining the features already present in the sample data to form high-dimensional features;
Step 1.3, analyzing the high-dimensional features, forming new dimensions, and ranking the new dimensions by influence;
Step 1.4, reducing the dimensionality of the sample data according to the new dimensions, using the smallest number of dimensions that keeps the distortion rate of the sample data below a set value;
Step 1.5, normalizing the sample data in the new dimensions.
4. The method for intelligent data lineage tracing based on cluster analysis according to claim 3, characterized in that step 3 further comprises:
Step 3.1, setting the number of clusters to M and executing step 2 to obtain the corresponding loss value, which is the maximum loss value of the system;
Step 3.2, setting the number of clusters to N and executing step 2 to obtain the corresponding minimum loss value;
Step 3.3, setting the number of clusters to the arithmetic mean (M+N)/2 of the numbers used in step 3.1 and step 3.2, and executing step 2 to obtain the loss value T;
Step 3.4, if the loss value T is greater than the target loss value, setting M=(M+N)/2 and repeating steps 3.1 to 3.3;
Step 3.5, if the loss value T is less than the target loss value, setting N=(M+N)/2 and repeating steps 3.1 to 3.3;
Step 3.6, if the loss value T is approximately equal to the target loss value, recording the current number of clusters as the optimal number of clusters and the current clustering as the optimal clustering.
5. The method for intelligent data lineage tracing based on cluster analysis according to claim 4, characterized in that step 6 further comprises:
Step 6.1, for a valid field-level lineage relationship, if there is not yet any direct or indirect lineage relationship between the two tables concerned, recording a direct lineage relationship between the two tables;
Step 6.2, recording an indirect lineage relationship between all tables that have a direct or indirect lineage relationship with these two tables;
Step 6.3, repeating step 6.1 and step 6.2 until all field-level lineage relationships have been processed.
6. A device for intelligent data lineage tracing based on cluster analysis, characterized in that the device comprises a memory; and
a processor coupled to the memory, the processor being configured to execute instructions stored in the memory and performing the following operations:
Step 1, reading table structures and data, and deriving the data features of each field by data engineering means;
Step 2, taking the field as the unit and the field's data feature set as its features, learning from the data samples with a cluster analysis algorithm from machine learning;
Step 3, repeating the cluster analysis of step 2 until the optimal number of clusters and the optimal clustering are found;
Step 4, under the optimal clustering, automatically identifying the data fields in the same cluster as fields that may have a lineage relationship;
Step 5, for each lineage relationship, inferring its direction, that is, which field is the source and which is the target, from the order of the creation times of the tables the relationship points to; if the fields of the relationship come from the same table, marking the relationship as invalid;
Step 6, computing the table-level lineage relationships from the valid field-level lineage relationships.
CN201910337129.4A 2019-04-25 2019-04-25 Intelligent data blood source tracing method and device based on cluster analysis Active CN110083639B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910337129.4A CN110083639B (en) 2019-04-25 2019-04-25 Intelligent data blood source tracing method and device based on cluster analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910337129.4A CN110083639B (en) 2019-04-25 2019-04-25 Intelligent data blood source tracing method and device based on cluster analysis

Publications (2)

Publication Number Publication Date
CN110083639A true CN110083639A (en) 2019-08-02
CN110083639B CN110083639B (en) 2023-03-10

Family

ID=67416633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910337129.4A Active CN110083639B (en) 2019-04-25 2019-04-25 Intelligent data blood source tracing method and device based on cluster analysis

Country Status (1)

Country Link
CN (1) CN110083639B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140059017A1 (en) * 2012-08-22 2014-02-27 Bitvore Corp. Data relationships storage platform
CN107239335A (en) * 2017-06-09 2017-10-10 中国工商银行股份有限公司 The job scheduling system and method for distributed system
CN108228747A (en) * 2017-12-20 2018-06-29 江苏数加数据科技有限责任公司 Data genetic connection visualized graphs system in data improvement
CN109299073A (en) * 2018-10-19 2019-02-01 杭州数梦工场科技有限公司 A kind of generation method, system, electronic equipment and the storage medium of data blood relationship
CN109325078A (en) * 2018-09-18 2019-02-12 拉扎斯网络科技(上海)有限公司 Data blood margin determination method and device based on structural data
CN109614432A (en) * 2018-12-05 2019-04-12 北京百分点信息科技有限公司 A kind of system and method for the acquisition data genetic connection based on syntactic analysis

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
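李旭风 et al.: "面向数据字段的血缘关系分析" [Data-field-oriented lineage analysis], 《中国金融电脑》 [Financial Computer of China] *
衡星辰 et al.: "元数据管理系统在电力企业的研究与实践" [Research and practice of metadata management systems in electric power enterprises], 《自动化与仪器仪表》 [Automation & Instrumentation] *
许明陆 et al.: "几种遗传聚类方法对玉米自交系遗传差异性的比较分析" [Comparative analysis of several genetic clustering methods on the genetic diversity of maize inbred lines], 《南方农业》 [South China Agriculture] *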

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110457405B (en) * 2019-08-20 2021-09-21 上海观安信息技术股份有限公司 Database auditing method based on blood relationship
CN110457405A (en) * 2019-08-20 2019-11-15 上海观安信息技术股份有限公司 A kind of database audit method based on genetic connection
CN111008192A (en) * 2019-11-14 2020-04-14 泰康保险集团股份有限公司 Data management method, device, equipment and medium
CN111008192B (en) * 2019-11-14 2023-06-02 泰康保险集团股份有限公司 Data management method, device, equipment and medium
CN111400305A (en) * 2020-02-20 2020-07-10 深圳市魔数智擎人工智能有限公司 Characteristic engineering blood relationship based backtracking and visualization method
CN111400305B (en) * 2020-02-20 2022-03-08 深圳市魔数智擎人工智能有限公司 Characteristic engineering blood relationship based backtracking and visualization method
CN111861830B (en) * 2020-04-03 2024-04-26 深圳市天彦通信股份有限公司 Information cloud platform
CN111861830A (en) * 2020-04-03 2020-10-30 深圳市天彦通信股份有限公司 Information cloud platform
CN111627552A (en) * 2020-04-08 2020-09-04 湖南长城医疗科技有限公司 Medical streaming data blood relationship analysis and storage method and device
CN111563103B (en) * 2020-04-28 2022-05-20 厦门市美亚柏科信息股份有限公司 Method and system for detecting data blood relationship
CN111563103A (en) * 2020-04-28 2020-08-21 厦门市美亚柏科信息股份有限公司 Method and system for detecting data blood margin
CN111639143A (en) * 2020-06-05 2020-09-08 广州市玄武无线科技股份有限公司 Data blood relationship display method and device of data warehouse and electronic equipment
CN112463978B (en) * 2020-11-13 2021-07-16 上海逸迅信息科技有限公司 Method and device for generating data blood relationship
CN112463978A (en) * 2020-11-13 2021-03-09 上海逸迅信息科技有限公司 Method and device for generating data blood relationship
CN113010503A (en) * 2021-03-01 2021-06-22 广州智筑信息技术有限公司 Engineering cost data intelligent analysis method and system based on deep learning
CN112883014A (en) * 2021-03-25 2021-06-01 上海众源网络有限公司 Data backtracking method and device, computer equipment and storage medium
CN115374223A (en) * 2022-06-30 2022-11-22 北京三维天地科技股份有限公司 Intelligent blood relationship identification recommendation method and system based on rules and machine learning

Also Published As

Publication number Publication date
CN110083639B (en) 2023-03-10

Similar Documents

Publication Publication Date Title
CN110083639A (en) A kind of method and device that the data blood relationship based on clustering is intelligently traced to the source
CN106649503A (en) Query method and system based on sql
CN107563153A (en) A kind of PacBio microarray dataset IT architectures based on Hadoop structures
CN111552509B (en) Method and device for determining dependency relationship between interfaces
CN104391879A (en) Method and device for hierarchical clustering
CN106126279A (en) Method and system for automatically adding BIM model family file information
CN114020593B (en) Heterogeneous process log sampling method and system based on track clustering
CN116522403A (en) Interactive information desensitization method and server for focusing big data privacy security
CN116881430A (en) Industrial chain identification method and device, electronic equipment and readable storage medium
CN111522705A (en) Intelligent operation and maintenance solution method for industrial big data
CN108763260A (en) Test question searching method and system and terminal equipment
CN108108444B (en) Enterprise business unit self-adaptive system and implementation method thereof
CN111221967A (en) Language data classification storage system based on block chain architecture
CN114359649B (en) Image processing method, apparatus, device, storage medium, and program product
CN112948251B (en) Automatic software testing method and device
CN110874465B (en) Mobile equipment entity identification method and device based on semi-supervised learning algorithm
CN114282598A (en) Multi-source heterogeneous power grid data fusion method, device, equipment and computer medium
CN114648014A (en) Engineering data correlation method based on improved Gaussian mixture model
CN110534158B (en) Gene sequence comparison method, device, server and medium
KR20140006491A (en) Effective graph clustering apparatus and method for probabilistic graph
CN110504004B (en) Complex network structure controllability gene identification method
CN113342518A (en) Task processing method and device
CN112445939A (en) Social network group discovery system, method and storage medium
CN112017790B (en) Electronic medical record screening method, device, equipment and medium based on countermeasure network
CN116451771B (en) Image classification convolutional neural network compression method and core particle device data distribution method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant