CN109656917A - Multi-data-source data detection method, apparatus, device, and readable storage medium - Google Patents


Info

Publication number
CN109656917A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811551768.2A
Other languages
Chinese (zh)
Inventor
陈华佳
叶家豪
邸帅
卢道和
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN201811551768.2A
Publication of CN109656917A


Abstract

The invention discloses a multi-data-source data detection method, apparatus, device, and readable storage medium. The method comprises the steps of: upon detecting a detection instruction for detecting data corresponding to at least two data sources, loading, according to the detection instruction, the data source drivers corresponding to the data sources; reading the data to be detected corresponding to the detection instruction through the data source drivers, storing it in a Spark cluster, and obtaining a target detection rule corresponding to the data to be detected; and detecting the data to be detected in the Spark cluster according to the target detection rule to obtain a detection result for the data to be detected. The invention supports data detection across data sources through the Spark cluster and supports the configuration of multiple kinds of data sources during data detection.

Description

Multi-data-source data detection method, apparatus, device, and readable storage medium
Technical Field
The present invention relates to the field of big data technology, and in particular to a multi-data-source data detection method, apparatus, device, and readable storage medium.
Background
In many data processing applications, data detection is one of the most important steps in big data processing. Existing data detection solutions mainly include Apache Griffin and Alibaba's DataWorks Data Quality (DQC). Griffin is an open-source data detection solution for distributed data systems such as Hadoop, Spark, and Storm. It provides a unified process for defining and checking the quality of data sets and for reporting problems promptly. Griffin has been deployed in eBay's core data systems, where it provides a set of general functions that address pain points in data quality checking. Detecting data quality problems with Griffin involves the following main steps: 1. the user registers a data asset; 2. a data detection model is created for the data asset; 3. the model engine computes the data metrics automatically; 4. the detection results are reported by e-mail or through a portal. DataWorks Data Quality is a one-stop platform that supports quality checking, notification, and management services for multiple heterogeneous data sources. It takes the data set (DataSet) as the monitored object and currently supports monitoring of MaxCompute tables and DataHub real-time data streams. When offline MaxCompute data changes, DataWorks Data Quality verifies the data and blocks the production pipeline so that problem data does not spread. It also manages historical check results to support data quality analysis and grading.
However, Apache Griffin and DataWorks Data Quality each support only a narrow range of data sources: Griffin currently supports only Hive and Kafka, and DataWorks Data Quality supports only Alibaba's own MaxCompute and DataHub. Neither supports the configuration of multiple kinds of data sources.
Summary of the Invention
The main purpose of the present invention is to provide a multi-data-source data detection method, apparatus, device, and readable storage medium, aiming to solve the technical problem that existing data detection methods cannot support the configuration of multiple kinds of data sources.
To achieve the above object, the present invention provides a multi-data-source data detection method, comprising the steps of:
upon detecting a detection instruction for detecting data corresponding to at least two data sources, loading, according to the detection instruction, the data source driver corresponding to each data source;
reading the data to be detected corresponding to the detection instruction through the data source drivers, storing it in a Spark cluster, and obtaining a target detection rule corresponding to the data to be detected;
detecting the data to be detected in the Spark cluster according to the target detection rule to obtain a detection result for the data to be detected.
Preferably, the step of obtaining the target detection rule corresponding to the data to be detected comprises:
obtaining preset data detection rule templates, and determining whether an association instruction for associating a data detection rule template with the data to be detected has been detected;
if the association instruction is detected, associating the data to be detected with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be detected.
Preferably, the step of, if the association instruction is detected, associating the data to be detected with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be detected comprises:
if the association instruction is detected, determining whether a setting instruction for setting the detection threshold of the data detection rule template has been detected;
if the setting instruction is detected, setting, according to the setting instruction, the detection threshold of the data detection rule template corresponding to the association instruction, and associating the data to be detected with that data detection rule template to obtain the target detection rule corresponding to the data to be detected.
Preferably, the step of obtaining the target detection rule corresponding to the data to be detected comprises:
obtaining a rule script written by the user for detecting the data to be detected, and detecting whether a selection instruction selecting the scripting language type of the rule script has been received;
if the selection instruction is received, determining the code compiler corresponding to the selection instruction;
compiling the rule script with the code compiler to obtain the target detection rule corresponding to the data to be detected.
Preferably, after the step of obtaining the rule script written by the user for detecting the data to be detected and detecting whether a selection instruction selecting the scripting language type of the rule script has been received, the method further comprises:
if the selection instruction is not received, obtaining a default code compiler and compiling the rule script with the default code compiler to obtain the target detection rule corresponding to the data to be detected.
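The selection-with-fallback behaviour described for rule scripts can be sketched as follows. This is only an illustration under stated assumptions: the names `COMPILERS`, `DEFAULT_LANGUAGE`, and `compile_rule` are hypothetical, and Python's built-in `compile` stands in for whatever compilers an actual implementation would register for each scripting language.

```python
from typing import Callable, Dict, Optional

RowRule = Callable[[dict], bool]

def _compile_python_expr(script: str) -> RowRule:
    # Compile the script once; evaluate it per row with builtins disabled.
    code = compile(script, "<rule>", "eval")
    return lambda row: bool(eval(code, {"__builtins__": {}}, {"row": row}))

# Hypothetical registry: scripting language type -> compiler for that language.
COMPILERS: Dict[str, Callable[[str], RowRule]] = {
    "python": _compile_python_expr,
}
DEFAULT_LANGUAGE = "python"  # assumption: which compiler serves as the default

def compile_rule(script: str, selected_language: Optional[str] = None) -> RowRule:
    """Use the selected compiler if a selection was received, else the default."""
    language = selected_language if selected_language is not None else DEFAULT_LANGUAGE
    if language not in COMPILERS:
        raise ValueError(f"no compiler registered for {language!r}")
    return COMPILERS[language](script)

# No selection instruction received, so the default compiler is used.
rule = compile_rule("row['age'] is not None and row['age'] >= 0")
```

The compiled rule is an ordinary callable, so the same detection code can run rules regardless of which language they were written in.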
Preferably, after the step of detecting the data to be detected in the Spark cluster according to the target detection rule to obtain the detection result for the data to be detected, the method further comprises:
if the data to be detected is determined to be abnormal data according to the detection result, marking the task that carries the data to be detected as a non-executable task, and, upon receiving an execution instruction for the non-executable task, prohibiting execution of the non-executable task.
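A minimal sketch of this blocking behaviour is shown below. The `Task` class, `TaskBlockedError`, and the two functions are hypothetical names introduced for illustration; the source does not specify how tasks are represented or how a refused execution is reported.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    executable: bool = True

class TaskBlockedError(RuntimeError):
    """Raised when an execution instruction targets a non-executable task."""

def apply_detection_result(task: Task, data_is_abnormal: bool) -> None:
    # A task that carries abnormal data is marked as non-executable.
    if data_is_abnormal:
        task.executable = False

def execute(task: Task) -> str:
    # An execution instruction for a non-executable task is refused.
    if not task.executable:
        raise TaskBlockedError(f"task {task.name!r} carries abnormal data")
    return f"ran {task.name}"

good, bad = Task("daily_report"), Task("load_orders")
apply_detection_result(good, data_is_abnormal=False)
apply_detection_result(bad, data_is_abnormal=True)
```

Raising an exception is one design choice among several; an implementation could equally return a status value or log and skip the task.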
Preferably, the step of detecting the data to be detected in the Spark cluster according to the target detection rule to obtain the detection result for the data to be detected comprises:
calculating, in the Spark cluster, the null rate of the data to be detected, and judging whether the null rate is greater than a preset null rate;
if the null rate is greater than the preset null rate, obtaining a detection result indicating that the data to be detected is abnormal data;
if the null rate is less than or equal to the preset null rate, obtaining a detection result indicating that the data to be detected is normal data.
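The null-rate comparison above can be illustrated with a small sketch in plain Python. It runs on an in-memory list rather than in a Spark cluster, the function names are hypothetical, and treating an empty column as having a null rate of 0 is an assumption that the source does not address.

```python
from typing import Iterable, Optional

def null_rate(values: Iterable[Optional[object]]) -> float:
    """Fraction of values that are null (None)."""
    values = list(values)
    if not values:
        return 0.0  # assumption: an empty column counts as having no nulls
    return sum(v is None for v in values) / len(values)

def detect_by_null_rate(values, preset_null_rate: float) -> str:
    # Strictly greater than the preset rate is abnormal; equal is still normal.
    return "abnormal" if null_rate(values) > preset_null_rate else "normal"

column = ["a", None, "c", None]  # 2 of 4 values are null
```

With this column, a preset null rate of 0.25 yields an abnormal result, while a preset null rate of 0.5 yields a normal result because the comparison is strict.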
In addition, to achieve the above object, the present invention also provides a multi-data-source data detection apparatus, comprising:
a loading module, configured to, upon detecting a detection instruction for detecting data corresponding to at least two data sources, load, according to the detection instruction, the data source driver corresponding to each data source;
a reading module, configured to read the data to be detected corresponding to the detection instruction through the data source drivers and store it in a Spark cluster;
an obtaining module, configured to obtain the target detection rule corresponding to the data to be detected;
a detection module, configured to detect the data to be detected in the Spark cluster according to the target detection rule to obtain the detection result for the data to be detected.
In addition, to achieve the above object, the present invention also provides a multi-data-source data detection device, comprising a memory, a processor, and a multi-data-source data detection program stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the multi-data-source data detection method described above.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium storing a multi-data-source data detection program which, when executed by a processor, implements the steps of the multi-data-source data detection method described above.
In the present invention, upon detecting a detection instruction for detecting data corresponding to at least two data sources, the data to be detected corresponding to the detection instruction is read through the data source drivers corresponding to the data sources and stored in a Spark cluster, the target detection rule corresponding to the data to be detected is obtained, and the data to be detected is detected in the Spark cluster according to the target detection rule. The Spark cluster thereby supports data detection across data sources, and the configuration of multiple kinds of data sources is supported during data detection.
Brief Description of the Drawings
Fig. 1 is a schematic structural diagram of the hardware operating environment involved in embodiments of the present invention;
Fig. 2 is a schematic flowchart of a preferred embodiment of the multi-data-source data detection method of the present invention.
The realization of the objects, functional features, and advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described here merely illustrate the present invention and are not intended to limit it.
As shown in Fig. 1, Fig. 1 is a schematic structural diagram of the hardware operating environment involved in embodiments of the present invention.
It should be noted that Fig. 1 may be a schematic structural diagram of the hardware operating environment of the multi-data-source data detection device. The device in embodiments of the present invention may be a terminal device such as a PC or a portable computer.
As shown in Fig. 1, the multi-data-source data detection device may include a processor 1001 (for example, a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 implements connection and communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a Wi-Fi interface). The memory 1005 may be high-speed RAM or non-volatile memory such as disk storage. The memory 1005 may optionally also be a storage device independent of the processor 1001.
Those skilled in the art will understand that the structure shown in Fig. 1 does not limit the multi-data-source data detection device, which may include more or fewer components than shown, combine certain components, or arrange components differently.
As shown in Fig. 1, the memory 1005, as a computer storage medium, may contain an operating system, a network communication module, a user interface module, and the multi-data-source data detection program. The operating system is a program that manages and controls the hardware and software resources of the multi-data-source data detection device and supports the running of the multi-data-source data detection program and other software or programs.
In the multi-data-source data detection device shown in Fig. 1, the user interface 1003 may be used to receive detection instructions and/or association instructions, among others; the network interface 1004 is mainly used to connect to a background server and exchange data with it; and the processor 1001 may be used to call the multi-data-source data detection program stored in the memory 1005 and perform the following operations:
upon detecting a detection instruction for detecting data corresponding to at least two data sources, loading, according to the detection instruction, the data source driver corresponding to each data source;
reading the data to be detected corresponding to the detection instruction through the data source drivers, storing it in a Spark cluster, and obtaining a target detection rule corresponding to the data to be detected;
detecting the data to be detected in the Spark cluster according to the target detection rule to obtain a detection result for the data to be detected.
Further, the step of obtaining the target detection rule corresponding to the data to be detected comprises:
obtaining preset data detection rule templates, and determining whether an association instruction for associating a data detection rule template with the data to be detected has been detected;
if the association instruction is detected, associating the data to be detected with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be detected.
Further, the step of, if the association instruction is detected, associating the data to be detected with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be detected comprises:
if the association instruction is detected, determining whether a setting instruction for setting the detection threshold of the data detection rule template has been detected;
if the setting instruction is detected, setting, according to the setting instruction, the detection threshold of the data detection rule template corresponding to the association instruction, and associating the data to be detected with that data detection rule template to obtain the target detection rule corresponding to the data to be detected.
Further, the step of obtaining the target detection rule corresponding to the data to be detected comprises:
obtaining a rule script written by the user for detecting the data to be detected, and detecting whether a selection instruction selecting the scripting language type of the rule script has been received;
if the selection instruction is received, determining the code compiler corresponding to the selection instruction;
compiling the rule script with the code compiler to obtain the target detection rule corresponding to the data to be detected.
Further, after the step of obtaining the rule script written by the user for detecting the data to be detected and detecting whether a selection instruction selecting the scripting language type of the rule script has been received, the processor 1001 may also be used to call the blockchain-based multi-data-source data detection program stored in the memory 1005 and perform the following step:
if the selection instruction is not received, obtaining a default code compiler and compiling the rule script with the default code compiler to obtain the target detection rule corresponding to the data to be detected.
Further, after the step of detecting the data to be detected in the Spark cluster according to the target detection rule to obtain the detection result for the data to be detected, the processor 1001 may also be used to call the blockchain-based multi-data-source data detection program stored in the memory 1005 and perform the following step:
if the data to be detected is determined to be abnormal data according to the detection result, marking the task that carries the data to be detected as a non-executable task, and, upon receiving an execution instruction for the non-executable task, prohibiting execution of the non-executable task.
Further, the step of detecting the data to be detected in the Spark cluster according to the target detection rule to obtain the detection result for the data to be detected comprises:
calculating, in the Spark cluster, the null rate of the data to be detected, and judging whether the null rate is greater than a preset null rate;
if the null rate is greater than the preset null rate, obtaining a detection result indicating that the data to be detected is abnormal data;
if the null rate is less than or equal to the preset null rate, obtaining a detection result indicating that the data to be detected is normal data.
Based on the above structure, embodiments of the multi-data-source data detection method are proposed.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of the first embodiment of the multi-data-source data detection method of the present invention.
An embodiment of the present invention provides an embodiment of the multi-data-source data detection method. It should be noted that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
First, the technical terms used in the embodiments of the present invention are explained.
1. Hadoop: a software framework for distributed processing of massive data. Hadoop includes four modules: Common, HDFS (Hadoop Distributed File System), YARN (Yet Another Resource Negotiator), and MapReduce. Common is the set of shared utilities that support the other modules; HDFS is a distributed file system that provides high-throughput access; YARN is a framework for job scheduling and cluster resource management; MapReduce (MR) is a parallel data processing framework.
2. Spark: a distributed computing framework that performs computation in memory as far as possible. Spark includes four modules: Spark SQL, Spark Streaming, GraphX, and MLlib. Spark SQL is a structured data query and analysis module that is compatible with Hive and provides a unified data access interface; the supported data access interfaces include but are not limited to Avro, Parquet, JSON (JavaScript Object Notation), and JDBC (Java Database Connectivity). Spark Streaming is a near-real-time stream computing framework, GraphX is Spark's graph computing framework, and MLlib is a scalable Spark machine learning library consisting of common learning algorithms and utilities.
3. Hive: a data warehouse tool based on Hadoop. It maps structured data files to database tables and provides SQL query capability by converting SQL statements into MapReduce jobs. Hive's management of the data warehouse covers two aspects: metadata management and data management.
Metadata: Hive stores metadata in a relational database, for example in the relational database management system MySQL. Hive metadata includes table names, table columns and partitions and their attributes, table attributes (such as whether a table is external), and the HDFS directory where each table's data is stored.
Data: Hive data is stored in HDFS, and most queries are executed as MapReduce jobs.
The multi-data-source data detection method is applied in a server or a terminal. The terminal may be a mobile terminal such as a mobile phone, tablet computer, laptop, palmtop computer, or personal digital assistant (PDA), or a fixed terminal such as a digital TV or desktop computer. For ease of description, the executing entity is omitted in the embodiments below. The multi-data-source data detection method includes:
Step S10: upon detecting a detection instruction for detecting data corresponding to at least two data sources, load, according to the detection instruction, the data source driver corresponding to each data source.
Upon detecting a detection instruction for detecting data corresponding to at least two data sources, the data source driver corresponding to each data source is loaded according to the detection instruction. The detection instruction may be triggered by the user as needed, or by a preconfigured scheduled task. In embodiments of the present invention, data sources include but are not limited to SQL (Structured Query Language), Hive, Java, Python crawlers, Scala, R-language crawlers, and JDBC. Each data source has a corresponding data source driver through which its data can be obtained. For example, to obtain SQL data, the data source driver corresponding to SQL must be loaded first. It should be noted that the data source driver for each data source is configured in advance and stored at a specific location in the data detection device; when needed, the driver is loaded from that location. A detection instruction may request detection of data from a single data source or from multiple data sources.
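The idea of preconfigured drivers that are looked up per data source can be sketched as follows. This is a toy illustration, not the patented implementation: the registry, the instruction layout, and the function names are hypothetical, an in-memory SQLite database stands in for a relational source, a JSON string stands in for a file source, and rows are gathered into a plain list rather than a Spark cluster.

```python
import json
import sqlite3
from typing import Callable, Dict, List

Row = Dict[str, object]
Driver = Callable[[dict], List[Row]]

def sqlite_driver(source: dict) -> List[Row]:
    # source["connection"] is an open sqlite3 connection, source["query"] a SELECT.
    cursor = source["connection"].execute(source["query"])
    columns = [d[0] for d in cursor.description]
    return [dict(zip(columns, r)) for r in cursor.fetchall()]

def json_driver(source: dict) -> List[Row]:
    # source["text"] holds a JSON array of objects.
    return json.loads(source["text"])

# Hypothetical "specific location" where preconfigured drivers are kept.
DRIVER_REGISTRY: Dict[str, Driver] = {"sqlite": sqlite_driver, "json": json_driver}

def load_drivers(instruction: dict) -> Dict[str, Driver]:
    """Load one driver per data source type named in the detection instruction."""
    drivers = {}
    for source in instruction["sources"]:
        kind = source["type"]
        if kind not in DRIVER_REGISTRY:
            raise KeyError(f"no driver configured for data source type {kind!r}")
        drivers[kind] = DRIVER_REGISTRY[kind]
    return drivers

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ann'), (2, NULL)")

instruction = {"sources": [
    {"type": "sqlite", "connection": conn,
     "query": "SELECT id, name FROM users ORDER BY id"},
    {"type": "json", "text": '[{"id": 3, "name": "bob"}]'},
]}
drivers = load_drivers(instruction)
# Read each source through its driver and gather the rows in one place.
rows = [row for s in instruction["sources"] for row in drivers[s["type"]](s)]
```

Because every driver returns rows in the same shape, downstream detection code does not need to know which source a row came from.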
Step S20: read the data to be detected corresponding to the detection instruction through the data source drivers, store it in the Spark cluster, and obtain the target detection rule corresponding to the data to be detected.
After the data source driver corresponding to each data source is loaded, the driver is used to connect to the corresponding data source, the data to be detected corresponding to the detection instruction is read from that data source and stored in the Spark cluster, and the data detection rule template corresponding to the data to be detected, i.e., the template used to detect the data to be detected, is obtained. In embodiments of the present invention, the data detection rule template used to detect the data to be detected is referred to as the target detection rule. The data to be detected for each data source may be read through the data source driver from a Hadoop cluster, which stores data from different data sources; it may also be read through the data source driver from the database corresponding to each data source. The target detection rule may be set in advance, or set by the user according to specific needs when required. It should be noted that experimental testing has confirmed that Spark SQL supports operating on many different data sources through the DataFrame interface, i.e., Spark SQL can connect to each data source through the DataFrame interface. DataFrame provides a unified interface for loading and saving data in data sources, including but not limited to structured data, Parquet files, JSON files, and Hive tables, and for connecting to external data sources via JDBC. Specifically, the data to be detected that has been read is stored in a Spark RDD (Resilient Distributed Dataset). A DataFrame is a data set organized into named columns; it is conceptually equivalent to a table in a relational database or a data frame in R/Python, but it has been optimized. DataFrames can be constructed from a variety of sources, including structured data files, Hive tables, external databases, and existing RDDs. The RDD is the most basic data abstraction in Spark; it represents an immutable, partitionable collection whose elements can be computed in parallel.
Further, the step of obtaining the target detection rule corresponding to the data to be detected comprises:
Step a: obtain preset data detection rule templates, and determine whether an association instruction for associating a data detection rule template with the data to be detected has been detected.
Specifically, the target detection rule corresponding to the data to be detected may be obtained as follows: obtain the preset data detection rule templates, and determine whether an association instruction associating a data detection rule template with the data to be detected has been detected. The data detection rule templates are set in advance, i.e., written beforehand. For example, one template may check whether a field consists only of digits: if the field contains any non-digit character, the field is determined to be abnormal data; if every character is a digit, i.e., the field is purely numeric, the field is determined to be normal data. Another template may check whether the number of characters in a field equals a preset number: if it does, the field is determined to be normal data; if the number of characters is greater or smaller than the preset number, the field is determined to be abnormal data. The preset number can be set according to the characteristics of the data to be detected.
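The two example templates (digits only, and fixed character count) can be written as small rule factories. This sketch uses hypothetical names, restricts "digit" to ASCII 0-9, and treats an empty field as abnormal for the digits-only rule; the source does not state either choice.

```python
from typing import Callable

# A rule returns True for normal data and False for abnormal data.
Rule = Callable[[str], bool]

def digits_only_template() -> Rule:
    # Every character must be an ASCII digit; an empty field is treated as abnormal.
    return lambda value: value != "" and all(c in "0123456789" for c in value)

def length_equals_template(preset_length: int) -> Rule:
    # The character count must equal the preset number exactly.
    return lambda value: len(value) == preset_length

is_numeric = digits_only_template()
has_six_chars = length_equals_template(6)
```

Here `is_numeric("12a456")` reports abnormal data because of the letter, and `has_six_chars("12345")` reports abnormal data because the field has five characters instead of six.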
Step b: if the association instruction is detected, associate the data to be detected with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be detected.
If the association instruction is detected, the data to be detected is associated with the data detection rule template corresponding to the association instruction, yielding the target detection rule corresponding to the data to be detected. It can be understood that the data detection rule template associated with the data to be detected is the target detection rule. The association instruction may be triggered manually by the user; in this case, the user can associate a preset data detection rule template with the data to be detected as needed, so as to obtain the target detection rule for the data to be detected. The association instruction may also be triggered automatically after the detection instruction is detected; in this case, the data to be detected is automatically associated with the corresponding data detection rule template according to its type. For example, when the data to be detected is determined to be ID card numbers, the preset data detection rule template for ID card numbers is associated with the data to be detected. In this embodiment, the preset data detection rule templates serve as rule templates that the user configures according to specific needs.
It should be noted that the association instruction may correspond to one or more data detection rule templates, i.e., there may be one or more target detection rules. When there are multiple target detection rules, a piece of data to be detected must pass all of them before it can be determined to be normal data.
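The requirement that data pass every associated rule can be sketched in a few lines. The function name `is_normal` and the two sample rules are illustrative assumptions; note that with an empty rule list this sketch reports the data as normal, a case the source does not cover.

```python
from typing import Callable, Iterable

Rule = Callable[[str], bool]  # True means the value passes the rule

def is_normal(value: str, target_rules: Iterable[Rule]) -> bool:
    """A value is normal only if it passes all target detection rules."""
    return all(rule(value) for rule in target_rules)

# Two sample rules: digits only, and exactly six characters.
rules = [str.isdigit, lambda v: len(v) == 6]
```

A value such as "12345" passes the first rule but fails the second, so it is reported as abnormal.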
In addition, to improve the efficiency and flexibility of data quality detection, the step of obtaining the target detection rule corresponding to the data to be detected may optionally be: obtaining the target detection rule corresponding to the data to be detected according to a mapping between the data to be detected and preset target detection rules. That is, the data to be detected is associated with a preset data detection rule template in advance to produce the target detection rule; later, whenever the target detection rule is needed, it is obtained directly from the mapping between the data to be detected and the target detection rule.
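As a rough illustration of such a mapping, the association can be stored once and looked up later. The key format (`"orders.order_no"`), the dictionary, and both function names are hypothetical; the source does not say how the mapping is keyed or persisted.

```python
from typing import Callable, Dict, List

Rule = Callable[[str], bool]

# Hypothetical mapping from a data set identifier to its target detection rules.
RULE_MAPPING: Dict[str, List[Rule]] = {}

def associate_in_advance(data_key: str, rule: Rule) -> None:
    # Performed once, before any detection instruction arrives.
    RULE_MAPPING.setdefault(data_key, []).append(rule)

def target_rules_for(data_key: str) -> List[Rule]:
    # Performed at detection time: a direct lookup, no user interaction needed.
    if data_key not in RULE_MAPPING:
        raise LookupError(f"no target detection rule mapped to {data_key!r}")
    return RULE_MAPPING[data_key]

associate_in_advance("orders.order_no", str.isdigit)
```

The lookup replaces the interactive association step, which is where the efficiency gain mentioned above would come from.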
Further, for the ease of the subsequent alarm for realizing data quality checking result, step b includes:
Step b1, if detecting the associated instructions, it is determined whether detect the setting Data Detection rule template The setting instruction of corresponding detection threshold value.
Step b2 is arranged the associated instructions according to setting instruction and corresponds to number if detecting the setting instruction According to the detection threshold value of detected rule template, and by data to be tested Data Detection rule mould corresponding with the associated instructions Plate association, to obtain the corresponding target detection rule of the data to be tested.
If the association instruction is detected, it is determined whether a setting instruction for setting the detection threshold of the data detection rule template is detected. The setting instruction is triggered by the user as needed. If the setting instruction is detected, the detection threshold of the data detection rule template corresponding to the association instruction is set according to the setting instruction, and the data to be tested corresponding to the association instruction is associated with that data detection rule template to obtain the target detection rule corresponding to the data to be tested. It should be noted that the detection threshold is the threshold of the corresponding target detection rule, such as the preset null value rate described below. When there are several target detection rules, their detection thresholds may be equal or different. Further, if no setting instruction is detected, a default threshold may optionally be obtained; the data to be tested is then associated with the data detection rule template corresponding to the association instruction, and the default threshold is used as the detection threshold of the resulting target detection rule. Alternatively, if no setting instruction is detected, no default threshold need be set, and the data to be tested is simply associated with the data detection rule template corresponding to the association instruction. Setting a threshold makes it possible to raise alarms on detection results later.
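The three threshold cases described above (user setting, default threshold, no threshold) can be sketched as follows. The function name `build_target_rule` and the dictionary representation of a target detection rule are assumptions for illustration.

```python
from typing import Dict, Optional


def build_target_rule(
    template: str,
    user_threshold: Optional[float] = None,
    default_threshold: Optional[float] = None,
) -> Dict[str, object]:
    """Associate a template with data and resolve its detection threshold.

    Assumption: None stands for "no setting instruction was detected"
    (user_threshold) or "no default threshold is configured" (default_threshold).
    """
    if user_threshold is not None:
        threshold = user_threshold      # setting instruction detected
    else:
        threshold = default_threshold   # may itself be None: no threshold at all
    return {"template": template, "threshold": threshold}
```

For example, `build_target_rule("null_value_rate", 0.05, 0.1)` keeps the user's 0.05, while omitting the user value falls back to the default 0.1, and omitting both yields a rule without a threshold.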
Step S30: detect the data to be tested in the Spark cluster by the target detection rule, and obtain the detection result of the data to be tested.
After the target detection rule is obtained, the data to be tested is detected in the Spark cluster by the target detection rule, and the detection result of the data to be tested is obtained. Specifically, the data to be tested is detected in an RDD of the Spark cluster. The detection result is either that the data to be tested is abnormal data or that it is normal data. Specifically, if the target detection rule is a null value rate detection rule, step S30 includes:
Step c: calculate the null value rate of the data to be tested in the Spark cluster, and judge whether the null value rate is greater than a preset null value rate.
Step d: if the null value rate is greater than the preset null value rate, obtain the detection result that the data to be tested is abnormal data.
Step e: if the null value rate is less than or equal to the preset null value rate, obtain the detection result that the data to be tested is normal data.
That is, the null value rate of the data to be tested is calculated in the Spark cluster and compared with the preset null value rate. If the null value rate is greater than the preset null value rate, the data to be tested is determined to be abnormal data and the corresponding detection result is obtained; if the null value rate is less than or equal to the preset null value rate, the data to be tested is determined to be normal data and the corresponding detection result is obtained. For example, if the data to be tested is a field of some table, the field holds 100 records in total, the preset null value rate is 0.1, and 20 of the records are found to be empty, then the null value rate is 20 ÷ 100 = 0.2, which exceeds 0.1, so the data to be tested is determined to be abnormal data.
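The null value rate check in steps c to e can be reproduced with a few lines of plain Python. This is a local sketch only: in the described method the computation runs on the Spark cluster, and treating `None` as the "empty" value is an assumption.

```python
from typing import Optional, Sequence


def null_value_rate(values: Sequence[Optional[object]]) -> float:
    """Fraction of records that are empty (assumption: empty means None)."""
    if not values:
        raise ValueError("cannot compute a null value rate for zero records")
    return sum(1 for v in values if v is None) / len(values)


def detect_null_value_rate(values: Sequence[Optional[object]], preset_rate: float) -> str:
    """Return 'abnormal' if the rate exceeds the preset rate, else 'normal'."""
    return "abnormal" if null_value_rate(values) > preset_rate else "normal"


# The worked example from the text: 100 records, 20 of them empty, preset rate 0.1.
field_values = [None] * 20 + ["value"] * 80
```

Here `null_value_rate(field_values)` evaluates to 0.2, so `detect_null_value_rate(field_values, 0.1)` returns `"abnormal"`. A field with exactly 10 empty records out of 100 would still be normal, since the comparison is strictly "greater than".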
In addition, step S30 further includes: in the process of writing the data to be tested from a first data source into the database corresponding to a second data source, the database corresponding to the first data source is denoted the source database and the database corresponding to the second data source is denoted the target database. The data volume of the data to be tested in the source database is calculated and denoted the first data volume; after the data to be tested has been written into the target database, the data volume of the data to be tested in the target database is calculated and denoted the second data volume. The first and second data volumes are compared: if they differ, the data to be tested is determined to be abnormal data; if they are equal, it is determined to be normal data.
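The data volume comparison between source and target databases can be illustrated with two in-memory SQLite databases standing in for the two data sources. The table name `orders`, the helper names, and the use of row counts as the "data volume" are all assumptions of this sketch; the described method performs the check in the Spark cluster.

```python
import sqlite3
from typing import Iterable


def make_db(ids: Iterable[int]) -> sqlite3.Connection:
    """Create a throwaway in-memory database with a hypothetical 'orders' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?)", [(i,) for i in ids])
    return conn


def count_rows(conn: sqlite3.Connection) -> int:
    """Data volume, taken here to be the number of rows in 'orders'."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


def check_transfer(source: sqlite3.Connection, target: sqlite3.Connection) -> str:
    first_volume = count_rows(source)    # volume in the source database
    second_volume = count_rows(target)   # volume after writing to the target
    return "normal" if first_volume == second_volume else "abnormal"


source_db = make_db(range(5))
complete_target = make_db(range(5))
lossy_target = make_db(range(4))   # one record was lost during the write
```

`check_transfer(source_db, complete_target)` reports `"normal"`, whereas `check_transfer(source_db, lossy_target)` reports `"abnormal"`.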
In this embodiment, after a detection instruction for detecting data corresponding to at least two data sources is detected, the data to be tested corresponding to the detection instruction is read through the data source drivers corresponding to the data sources and stored in the Spark cluster, the target detection rule corresponding to the data to be tested is obtained, and the data to be tested is detected in the Spark cluster by the target detection rule. The Spark cluster thus supports data detection across data sources, and multiple data sources can be configured in the data detection process.
Further, the data detection method for multiple data sources also includes:
Step f: if the data to be tested is determined to be abnormal data according to the detection result, mark the task that carries the data to be tested as a non-executable task, and when an execution instruction for the non-executable task is received, prohibit its execution.
Further, if the data to be tested is determined to be abnormal data according to the detection result, the task that carries the data to be tested is marked as a non-executable task, and when an execution instruction for that task is received, its execution is prohibited. This prevents the task from producing results that do not reflect the actual situation, or from failing and wasting resources; data detection thus safeguards the integrity and correctness of the data. Further, when an execution instruction for a non-executable task is received, prompt information is output to inform the user that the task corresponding to the execution instruction carries abnormal data and is a non-executable task.
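The blocking behaviour of step f can be sketched as a small gate in front of task execution. The class and method names (`TaskGate`, `record_result`, `execute`) and the exception type are hypothetical; the patent does not describe how tasks are tracked.

```python
from typing import Callable, Set, TypeVar

T = TypeVar("T")


class TaskBlockedError(Exception):
    """Raised when a non-executable task is asked to run."""


class TaskGate:
    def __init__(self) -> None:
        self._non_executable: Set[str] = set()

    def record_result(self, task_id: str, detection_result: str) -> None:
        """Mark the task carrying the data according to the detection result."""
        if detection_result == "abnormal":
            self._non_executable.add(task_id)
        else:
            self._non_executable.discard(task_id)

    def execute(self, task_id: str, action: Callable[[], T]) -> T:
        """Run the task unless it carries abnormal data."""
        if task_id in self._non_executable:
            # The message plays the role of the prompt information shown to the user.
            raise TaskBlockedError(
                f"task {task_id!r} carries abnormal data and cannot be executed"
            )
        return action()


gate = TaskGate()
gate.record_result("daily_report", "abnormal")
gate.record_result("hourly_sync", "normal")
```

Calling `gate.execute("hourly_sync", ...)` runs the action, while `gate.execute("daily_report", ...)` raises `TaskBlockedError` with a message explaining why the task was refused.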
Further, a second embodiment of the data detection method for multiple data sources of the present invention is proposed.
The second embodiment of the data detection method for multiple data sources differs from the first embodiment in that the step of obtaining the target detection rule corresponding to the data to be tested includes:
Step g: obtain the rule script written by the user for detecting the data to be tested, and detect whether a selection instruction selecting the script language type of the rule script is received.
The rule script that the user has written for detecting the data to be tested is obtained, and it is detected whether a selection instruction selecting the script language type of the rule script has been received. The user may write the rule script on a detection rule customization page provided by the server or the terminal. The selection instruction is triggered by the user as needed. On the detection rule customization page the user may write the rule script in a language they are familiar with, for example SQL, Python or Java; correspondingly, the script language types include, but are not limited to, SQL script, Python script and Java script. In this embodiment of the present invention, after writing the rule script for detecting the data to be tested, the user needs to submit it to the corresponding server or terminal and select the script language type during submission: if the submitted rule script is written in SQL, the user should trigger the selection instruction that selects the SQL script type; if it is written in Python, the user should trigger the selection instruction that selects the Python script type. In this embodiment, after the user submits the rule script, the rule engine in the server or terminal parses the rule script line by line to obtain it.
Step h: if the selection instruction is received, determine the code compiler corresponding to the selection instruction.
Step i: compile the rule script with the code compiler to obtain the target detection rule corresponding to the data to be tested.
If a selection instruction selecting the script language type of the rule script is received, the code compiler corresponding to the selection instruction is determined, and the obtained rule script is compiled with that code compiler to obtain the target detection rule corresponding to the data to be tested. It should be noted that the rule engine packages code compilers for the various languages and deploys them in the server or terminal; the rule script is compiled and packaged by the selected code compiler into an executable task script, which is the target detection rule corresponding to the data to be tested. For example, if the user submits a rule script written in Java, the rule engine selects the corresponding JDK (Java Development Kit) to compile it, generates the corresponding class files and packages them into a jar package; this jar package is the jar package of the target detection rule.
Further, the data detection method for multiple data sources also includes:
Step j: if the selection instruction is not received, obtain a default code compiler and compile the rule script with the default code compiler to obtain the target detection rule corresponding to the data to be tested.
Further, if no selection instruction selecting the script language type is received, a default code compiler is obtained and the rule script is compiled with it to obtain the target detection rule corresponding to the data to be tested. The default code compiler is set according to specific needs; for example, the code compiler for SQL scripts may be designated the default code compiler, or the code compiler ranked first may be designated the default code compiler.
Further, if the result of compiling the rule script with the default code compiler cannot be executed, that is, the target detection rule cannot be executed, prompt information is generated to notify the user of the compile error so that the user can change the code compiler used for compilation.
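Steps h to j, together with the compile-error prompt, can be sketched as a compiler registry with a default fallback. Everything here is illustrative: the registry, the choice of SQL as the default, and the stand-in `compile_sql` check are assumptions, and only Python's built-in `compile()` is a real compiler call.

```python
from typing import Callable, Dict, Optional, Tuple


def compile_python(script: str) -> object:
    """Compile a Python rule script with the built-in compile()."""
    return compile(script, "<rule_script>", "exec")


def compile_sql(script: str) -> object:
    """Stand-in for a real SQL compiler: only accepts SELECT statements."""
    if not script.lstrip().lower().startswith("select"):
        raise ValueError("not a SELECT statement")
    return script.strip()


# Hypothetical registry of packaged code compilers, keyed by script language type.
COMPILERS: Dict[str, Callable[[str], object]] = {
    "sql": compile_sql,
    "python": compile_python,
}
DEFAULT_LANGUAGE = "sql"  # assumption: the SQL compiler is the default


def compile_rule(
    script: str, selected_language: Optional[str] = None
) -> Tuple[Optional[object], Optional[str]]:
    """Return (target_rule, None) on success or (None, prompt) on a compile error."""
    language = selected_language or DEFAULT_LANGUAGE
    try:
        return COMPILERS[language](script), None
    except (SyntaxError, ValueError) as err:
        return None, f"compile error with the {language} compiler ({err}); try another compiler"
```

Submitting a Python script without selecting a language falls back to the SQL stand-in, fails, and produces a prompt asking the user to change the compiler; selecting `"python"` for the same script compiles it successfully.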
In this embodiment of the present invention, the rule script that the user has written for detecting the data to be tested is obtained; if a selection instruction selecting the script language type of the rule script is received, the code compiler corresponding to the selection instruction is determined, and the rule script is compiled with that code compiler to obtain the target detection rule corresponding to the data to be tested. By comparison, the existing Apache Griffin can only use its own custom DSL (Domain Specific Language), and the existing DataWorks Data Quality only supports configuration based on existing detection rule templates, so users cannot write the data detection rule templates they want. The present invention lets users write the rule script for a data detection rule template in different languages: users can write the rule scripts they need in a programming language they are familiar with, which enables multi-language configuration of data detection rule templates.
Further, after the target detection rule is determined, that is, after a rule script executable in the Spark cluster is generated, the target detection rule can be submitted to the Spark cluster for execution through the unified job submission service preset in the big data platform. The unified job submission service is the preset service of the big data platform for submitting tasks for execution.
In addition, an embodiment of the present invention also proposes a data detection device for multiple data sources, which includes:
A loading module, configured to load, after a detection instruction for detecting data corresponding to at least two data sources is detected, the data source drivers corresponding to the data sources according to the detection instruction;
A reading module, configured to read the data to be tested corresponding to the detection instruction through the data source drivers and store it in a Spark cluster;
An obtaining module, configured to obtain the target detection rule corresponding to the data to be tested;
A detection module, configured to detect the data to be tested in the Spark cluster by the target detection rule and obtain the detection result of the data to be tested.
Further, the obtaining module includes:
A first obtaining unit, configured to obtain a preset data detection rule template;
A first detection unit, configured to detect whether an association instruction for associating the data detection rule template with the data to be tested is detected;
An association unit, configured to, if the association instruction is detected, associate the data to be tested with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be tested.
Further, the association unit includes:
A detection subunit, configured to, if the association instruction is detected, determine whether a setting instruction for setting the detection threshold of the data detection rule template is detected;
A setting subunit, configured to, if the setting instruction is detected, set the detection threshold of the data detection rule template corresponding to the association instruction according to the setting instruction;
An association subunit, configured to associate the data to be tested with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be tested.
Further, the obtaining module also includes:
A second obtaining unit, configured to obtain the rule script written by the user for detecting the data to be tested;
A second detection unit, configured to detect whether a selection instruction selecting the script language type of the rule script is received;
A first determination unit, configured to, if the selection instruction is received, determine the code compiler corresponding to the selection instruction;
A compilation unit, configured to compile the rule script with the code compiler to obtain the target detection rule corresponding to the data to be tested.
Further, the second obtaining unit is also configured to obtain a default code compiler if the selection instruction is not received;
The compilation unit is also configured to compile the rule script with the default code compiler to obtain the target detection rule corresponding to the data to be tested.
Further, the data detection device for multiple data sources also includes:
A determination module, configured to, if the data to be tested is determined to be abnormal data according to the detection result, mark the task that carries the data to be tested as a non-executable task;
An execution prohibition module, configured to prohibit execution of the non-executable task when an execution instruction for the non-executable task is received.
Further, the detection module includes:
A calculation unit, configured to calculate the null value rate of the data to be tested in the Spark cluster;
A judgment unit, configured to judge whether the null value rate is greater than a preset null value rate;
A second determination unit, configured to obtain the detection result that the data to be tested is abnormal data if the null value rate is greater than the preset null value rate, and to obtain the detection result that the data to be tested is normal data if the null value rate is less than or equal to the preset null value rate.
The specific embodiments of the data detection device for multiple data sources of the present invention are essentially the same as the embodiments of the data detection method for multiple data sources described above, and are not repeated here.
In addition, an embodiment of the present invention also proposes a computer-readable storage medium on which a data detection program for multiple data sources is stored; when the data detection program is executed by a processor, the steps of the data detection method for multiple data sources described above are implemented.
The specific embodiments of the computer-readable storage medium of the present invention are essentially the same as the embodiments of the data detection method for multiple data sources described above, and are not repeated here.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of protection of the present invention.

Claims (16)

1. A data detection method for multiple data sources, characterized in that the data detection method for multiple data sources comprises the following steps:
after a detection instruction for detecting data corresponding to at least two data sources is detected, loading the data source drivers corresponding to the data sources according to the detection instruction;
reading the data to be tested corresponding to the detection instruction through the data source drivers, storing it in a Spark cluster, and obtaining the target detection rule corresponding to the data to be tested;
detecting the data to be tested in the Spark cluster by the target detection rule to obtain the detection result of the data to be tested.
2. The data detection method for multiple data sources according to claim 1, characterized in that the step of obtaining the target detection rule corresponding to the data to be tested comprises:
obtaining a preset data detection rule template, and determining whether an association instruction for associating the data detection rule template with the data to be tested is detected;
if the association instruction is detected, associating the data to be tested with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be tested.
3. The data detection method for multiple data sources according to claim 2, characterized in that the step of, if the association instruction is detected, associating the data to be tested with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be tested comprises:
if the association instruction is detected, determining whether a setting instruction for setting the detection threshold of the data detection rule template is detected;
if the setting instruction is detected, setting the detection threshold of the data detection rule template corresponding to the association instruction according to the setting instruction, and associating the data to be tested with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be tested.
4. The data detection method for multiple data sources according to claim 1, characterized in that the step of obtaining the target detection rule corresponding to the data to be tested comprises:
obtaining the rule script written by the user for detecting the data to be tested, and detecting whether a selection instruction selecting the script language type of the rule script is received;
if the selection instruction is received, determining the code compiler corresponding to the selection instruction;
compiling the rule script with the code compiler to obtain the target detection rule corresponding to the data to be tested.
5. The data detection method for multiple data sources according to claim 4, characterized in that, after the step of obtaining the rule script written by the user for detecting the data to be tested and detecting whether a selection instruction selecting the script language type of the rule script is received, the method further comprises:
if the selection instruction is not received, obtaining a default code compiler and compiling the rule script with the default code compiler to obtain the target detection rule corresponding to the data to be tested.
6. The data detection method for multiple data sources according to claim 1, characterized in that, after the step of detecting the data to be tested in the Spark cluster by the target detection rule to obtain the detection result of the data to be tested, the method further comprises:
if the data to be tested is determined to be abnormal data according to the detection result, marking the task that carries the data to be tested as a non-executable task, and prohibiting execution of the non-executable task when an execution instruction for the non-executable task is received.
7. The data detection method for multiple data sources according to any one of claims 1 to 6, characterized in that the step of detecting the data to be tested in the Spark cluster by the target detection rule to obtain the detection result of the data to be tested comprises:
calculating the null value rate of the data to be tested in the Spark cluster, and judging whether the null value rate is greater than a preset null value rate;
if the null value rate is greater than the preset null value rate, obtaining the detection result that the data to be tested is abnormal data;
if the null value rate is less than or equal to the preset null value rate, obtaining the detection result that the data to be tested is normal data.
8. A data detection device for multiple data sources, characterized in that the data detection device for multiple data sources comprises:
a loading module, configured to load, after a detection instruction for detecting data corresponding to at least two data sources is detected, the data source drivers corresponding to the data sources according to the detection instruction;
a reading module, configured to read the data to be tested corresponding to the detection instruction through the data source drivers and store it in a Spark cluster;
an obtaining module, configured to obtain the target detection rule corresponding to the data to be tested;
a detection module, configured to detect the data to be tested in the Spark cluster by the target detection rule to obtain the detection result of the data to be tested.
9. The data detection device for multiple data sources according to claim 8, characterized in that the obtaining module comprises:
a first obtaining unit, configured to obtain a preset data detection rule template;
a first detection unit, configured to detect whether an association instruction for associating the data detection rule template with the data to be tested is detected;
an association unit, configured to, if the association instruction is detected, associate the data to be tested with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be tested.
10. The data detection device for multiple data sources according to claim 9, characterized in that the association unit comprises:
a detection subunit, configured to, if the association instruction is detected, determine whether a setting instruction for setting the detection threshold of the data detection rule template is detected;
a setting subunit, configured to, if the setting instruction is detected, set the detection threshold of the data detection rule template corresponding to the association instruction according to the setting instruction;
an association subunit, configured to associate the data to be tested with the data detection rule template corresponding to the association instruction to obtain the target detection rule corresponding to the data to be tested.
11. The data detection device for multiple data sources according to claim 8, characterized in that the obtaining module further comprises:
a second obtaining unit, configured to obtain the rule script written by the user for detecting the data to be tested;
a second detection unit, configured to detect whether a selection instruction selecting the script language type of the rule script is received;
a first determination unit, configured to, if the selection instruction is received, determine the code compiler corresponding to the selection instruction;
a compilation unit, configured to compile the rule script with the code compiler to obtain the target detection rule corresponding to the data to be tested.
12. The data detection device for multiple data sources according to claim 11, characterized in that the second obtaining unit is further configured to obtain a default code compiler if the selection instruction is not received;
the compilation unit is further configured to compile the rule script with the default code compiler to obtain the target detection rule corresponding to the data to be tested.
13. The data detection device for multiple data sources according to claim 8, characterized in that the data detection device for multiple data sources further comprises:
a determination module, configured to, if the data to be tested is determined to be abnormal data according to the detection result, mark the task that carries the data to be tested as a non-executable task;
an execution prohibition module, configured to prohibit execution of the non-executable task when an execution instruction for the non-executable task is received.
14. The data detection device for multiple data sources according to any one of claims 8 to 13, characterized in that the detection module comprises:
a calculation unit, configured to calculate the null value rate of the data to be tested in the Spark cluster;
a judgment unit, configured to judge whether the null value rate is greater than a preset null value rate;
a second determination unit, configured to obtain the detection result that the data to be tested is abnormal data if the null value rate is greater than the preset null value rate, and to obtain the detection result that the data to be tested is normal data if the null value rate is less than or equal to the preset null value rate.
15. Data detection equipment for multiple data sources, characterized in that the data detection equipment for multiple data sources comprises a memory, a processor, and a data detection program for multiple data sources that is stored in the memory and can run on the processor, wherein when the data detection program for multiple data sources is executed by the processor, the steps of the data detection method for multiple data sources according to any one of claims 1 to 7 are implemented.
16. A computer-readable storage medium, characterized in that a data detection program for multiple data sources is stored on the computer-readable storage medium, and when the data detection program for multiple data sources is executed by a processor, the steps of the data detection method for multiple data sources according to any one of claims 1 to 7 are implemented.
CN201811551768.2A 2018-12-18 2018-12-18 Data detection method, device, equipment and the readable storage medium storing program for executing of multi-data source Pending CN109656917A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811551768.2A CN109656917A (en) 2018-12-18 2018-12-18 Data detection method, device, equipment and the readable storage medium storing program for executing of multi-data source

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811551768.2A CN109656917A (en) 2018-12-18 2018-12-18 Data detection method, device, equipment and the readable storage medium storing program for executing of multi-data source

Publications (1)

Publication Number Publication Date
CN109656917A true CN109656917A (en) 2019-04-19

Family

ID=66114496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811551768.2A Pending CN109656917A (en) 2018-12-18 2018-12-18 Data detection method, device, equipment and the readable storage medium storing program for executing of multi-data source

Country Status (1)

Country Link
CN (1) CN109656917A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130117219A1 (en) * 2011-11-03 2013-05-09 Microsoft Corporation Architecture for knowledge-based data quality solution
WO2017024772A1 (en) * 2015-08-10 2017-02-16 刘挺 Personalized and distributed data mining system
CN106372185A (en) * 2016-08-31 2017-02-01 广东京奥信息科技有限公司 Data preprocessing method for heterogeneous data sources
CN106777101A (en) * 2016-12-14 2017-05-31 深圳天源迪科信息技术股份有限公司 Data processing engine
CN106844546A (en) * 2016-12-30 2017-06-13 江苏号百信息服务有限公司 Multi-data source positional information fusion method and system based on Spark clusters
CN106874483A (en) * 2017-02-20 2017-06-20 山东鲁能软件技术有限公司 Device and method for graphical data quality evaluation based on big data technology
CN108985531A (en) * 2017-06-01 2018-12-11 中国科学院深圳先进技术研究院 Multi-mode heterogeneous electric power big data fusion analysis management system and method

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110569234A (en) * 2019-07-30 2019-12-13 深圳市华傲数据技术有限公司 Data checking method and device, electronic equipment and computer readable storage medium
CN110597798A (en) * 2019-09-17 2019-12-20 山东爱城市网信息技术有限公司 Data detection method based on Thrift
CN110597798B (en) * 2019-09-17 2023-08-25 浪潮卓数大数据产业发展有限公司 Data detection method based on Thrift
CN111177176B (en) * 2019-11-18 2023-05-16 腾讯科技(深圳)有限公司 Data detection method, device and storage medium
CN111177176A (en) * 2019-11-18 2020-05-19 腾讯科技(深圳)有限公司 Data detection method, device and storage medium
CN111104121A (en) * 2019-12-20 2020-05-05 北京字节跳动网络技术有限公司 Detection method, device, equipment and storage medium
CN111104121B (en) * 2019-12-20 2023-05-16 抖音视界有限公司 Detection method, detection device, detection equipment and storage medium
CN113127482B (en) * 2019-12-31 2024-03-26 奇安信科技集团股份有限公司 Data quality analysis method, device, computer equipment and storage medium
CN113127482A (en) * 2019-12-31 2021-07-16 奇安信科技集团股份有限公司 Data quality analysis method and device, computer equipment and storage medium
CN111291990A (en) * 2020-02-04 2020-06-16 浙江大华技术股份有限公司 Quality monitoring processing method and device
CN111291990B (en) * 2020-02-04 2023-11-07 浙江大华技术股份有限公司 Quality monitoring processing method and device
CN111752936A (en) * 2020-06-30 2020-10-09 中国科学院西北生态环境资源研究院 Data detection management method, device, server and readable storage medium
CN111752936B (en) * 2020-06-30 2024-04-26 中国科学院西北生态环境资源研究院 Data detection management method, device, server and readable storage medium
CN112613892A (en) * 2020-12-25 2021-04-06 北京知因智慧科技有限公司 Data processing method and device based on business system and electronic equipment
CN112613892B (en) * 2020-12-25 2024-03-15 北京知因智慧科技有限公司 Data processing method and device based on service system and electronic equipment
CN112749164A (en) * 2020-12-30 2021-05-04 北京知因智慧科技有限公司 Data quality analysis method and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN109656917A (en) Multi-data-source data detection method, device, equipment and readable storage medium
US11379348B2 (en) System and method for performing automated API tests
US11163731B1 (en) Autobuild log anomaly detection methods and systems
US10417119B2 (en) Dynamic testing based on automated impact analysis
US9558230B2 (en) Data quality assessment
US9612943B2 (en) Prioritization of tests of computer program code
US20180260316A1 (en) Regression testing of sql execution plans for sql statements
US10095602B2 (en) Automated code analyzer
US11762717B2 (en) Automatically generating testing code for a software application
US11341116B2 (en) Techniques for automated data analysis
CN107665171A (en) Automatic regression test method and device
Kirbas et al. The relationship between evolutionary coupling and defects in large industrial software
US20110320876A1 (en) Systems and methods for processing source code during debugging operations
CN108388515A (en) Test data generating method, device, equipment and computer readable storage medium
US9195730B2 (en) Verifying correctness of a database system via extended access paths
CN106815100A (en) Interface test method and device
US10365995B2 (en) Composing future application tests including test action data
CN114185791A (en) Method, device and equipment for testing data mapping file and storage medium
CN111176980B (en) Data analysis method, device and system for separating debugging environment and running environment
US20240086165A1 (en) Systems and methods for building and deploying machine learning applications
WO2014174362A1 (en) Feature model based testing
CN109308256A (en) Java program dynamic analysis method, equipment and storage medium
US11726902B1 (en) System and method for automated bot testing scenario simulations
CN113590686B (en) Processing method, device and equipment for ecological environment data index
CN111061632B (en) Automated test method and test system for report data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190419