CN118093368A - Data quality detection method and device, electronic equipment and storage medium - Google Patents

Data quality detection method and device, electronic equipment and storage medium

Info

Publication number
CN118093368A
Authority
CN
China
Prior art keywords
data quality
data
rule
access control
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311863909.5A
Other languages
Chinese (zh)
Inventor
司忠平
胡斐
王鑫毅
魏鹏
王长生
李海博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Bank of China
Priority to CN202311863909.5A
Publication of CN118093368A
Legal status: Pending

Landscapes

  • Testing And Monitoring For Control Systems (AREA)

Abstract

The embodiments of the invention disclose a data quality detection method, a data quality detection device, electronic equipment and a storage medium. The method comprises the following steps: acquiring a target data quality access control rule corresponding to a delivered model and current model task data to be executed, wherein the target data quality access control rule is set for at least one of data type, data volume, data value and data field null rate; performing data quality detection on the current model task data based on the target data quality access control rule, and determining a target data quality detection result corresponding to the current model task data; and controlling execution of the current model task corresponding to the current model task data based on the target data quality detection result. With the technical scheme provided by the embodiments of the invention, quality detection of the data to be input into the model can be performed automatically, and the running efficiency and accuracy of the model are improved.

Description

Data quality detection method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to a data quality detection technology, in particular to a data quality detection method, a data quality detection device, electronic equipment and a storage medium.
Background
In daily production activities, an online model can encounter running errors, such as a failed run, a timed-out run, or a wrong result given by the model. Anomalies in the model's input data can cause such errors. For example, when the data volume fluctuates sharply compared with a past period, the model may fail to run or may give wrong results.
At present, after a model goes online, the quality of its input data is generally checked manually. However, manual checking can miss problems, so the model runs on abnormal input data and ends up in an erroneous state, which reduces the running efficiency and effect of the model. Moreover, the abnormal data can be investigated and located only after the model reports an error, the run times out, or downstream feedback shows a wrong data result. There is therefore an urgent need for a data quality detection method that saves resources and time costs.
Disclosure of Invention
The embodiment of the invention provides a data quality detection method, a device, electronic equipment and a storage medium, which are used for realizing automatic quality detection of data to be input into a model and improving the running efficiency and accuracy of the model.
In a first aspect, an embodiment of the present invention provides a data quality detection method, including:
Acquiring a target data quality access control rule corresponding to a delivered model and current model task data to be executed, wherein the target data quality access control rule is set for at least one of a data type, a data volume, a data value and a data field null rate;
Performing data quality detection on the current model task data based on the target data quality access control rule, and determining a target data quality detection result corresponding to the current model task data;
And controlling execution of the current model task corresponding to the current model task data based on the target data quality detection result.
In a second aspect, an embodiment of the present invention further provides a data quality detection apparatus, where the apparatus includes:
The data acquisition module is used for acquiring target data quality access control rules corresponding to the delivered model and current model task data to be executed, wherein the target data quality access control rules are set for at least one of data types, data amounts, data values and data field null rates;
The target data quality detection result determining module is used for detecting the data quality of the current model task data based on the target data quality access control rule and determining a target data quality detection result corresponding to the current model task data;
and the current model task execution module is used for controlling and executing the current model task corresponding to the current model task data based on the target data quality detection result.
In a third aspect, an embodiment of the present invention further provides an electronic device, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement a data quality detection method according to any of the embodiments of the present invention.
In a fourth aspect, embodiments of the present invention further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a data quality detection method according to any of the embodiments of the present invention.
According to the technical scheme of the embodiments of the invention, a target data quality access control rule corresponding to the delivered model and current model task data to be executed are obtained, wherein the target data quality access control rule is set for at least one of data type, data volume, data value and data field null rate; data quality detection is performed on the current model task data based on the target data quality access control rule, and a target data quality detection result corresponding to the current model task data is determined; and execution of the current model task corresponding to the current model task data is controlled based on the target data quality detection result. In this way, quality detection of the data to be input into the model can be performed automatically, and the running efficiency and accuracy of the model are improved.
Drawings
Fig. 1 is a flowchart of a data quality detection method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a data quality detection method according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a data quality detecting device according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device implementing a data quality detection method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a data quality detection method according to an embodiment of the present invention, where the method may be implemented by a data quality detection device, and the data quality detection device may be implemented in hardware and/or software, and the data quality detection device may be configured in an electronic device. As shown in fig. 1, the method includes:
S110, acquiring a target data quality access control rule corresponding to the delivered model and current model task data to be executed, wherein the target data quality access control rule is set for at least one of a data type, a data volume, a data value and a data field null rate.
Here, a delivered model may refer to a model that has been delivered for use by a user. For example, the delivered model may be, but is not limited to, a functional model that has been delivered online. A data quality access control rule may refer to a rule generated by linking various data quality detection conditions in the form of access control. The target data quality access control rule may refer to a data quality access control rule whose configuration has been completed. The target data quality access control rule can be used to detect, according to preset rules, the data quality of the current model task data to be executed. The current model task data may refer to data that is currently to be input into the model to complete a task.
Specifically, the target data quality access control rule may comprise one rule or a plurality of rules. For example, an access control rule is set for at least one of a data type, a data volume, a data value, and a data field null rate. The order in which the individual access control rules in the target data quality access control rule are checked is not limited.
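As an illustration only, the four kinds of access control rule named above could be represented as small predicate objects. The sketch below is a minimal example under stated assumptions: the names `AccessControlRule`, `type_rule`, `volume_rule`, `value_rule`, and `null_rate_rule`, and the list-of-dicts data layout, are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]  # assumed layout: one dict per row of model task data


@dataclass(frozen=True)
class AccessControlRule:
    """One rule in a (hypothetical) target data quality access control rule set."""
    name: str
    check: Callable[[List[Record]], bool]  # returns True when the batch satisfies the rule


def type_rule(field: str, expected: type) -> AccessControlRule:
    # Data type: every non-null value of `field` must be an instance of `expected`.
    def check(rows: List[Record]) -> bool:
        return all(isinstance(r[field], expected) for r in rows if r.get(field) is not None)
    return AccessControlRule(f"type:{field}", check)


def volume_rule(min_rows: int, max_rows: int) -> AccessControlRule:
    # Data volume: the number of rows must fall inside [min_rows, max_rows].
    return AccessControlRule("volume", lambda rows: min_rows <= len(rows) <= max_rows)


def value_rule(field: str, low: float, high: float) -> AccessControlRule:
    # Data value: every non-null value of `field` must fall inside [low, high].
    def check(rows: List[Record]) -> bool:
        return all(low <= r[field] <= high for r in rows if r.get(field) is not None)
    return AccessControlRule(f"value:{field}", check)


def null_rate_rule(field: str, max_rate: float) -> AccessControlRule:
    # Data field null rate: the share of rows where `field` is missing or None.
    def check(rows: List[Record]) -> bool:
        if not rows:
            return True
        nulls = sum(1 for r in rows if r.get(field) is None)
        return nulls / len(rows) <= max_rate
    return AccessControlRule(f"null_rate:{field}", check)


rows = [{"age": 30}, {"age": None}, {"age": 41}]
rules = [
    type_rule("age", int),
    volume_rule(1, 1000),
    value_rule("age", 0, 120),
    null_rate_rule("age", 0.5),
]
# The source does not fix a checking order, so the rules are simply evaluated in list order.
results = {rule.name: rule.check(rows) for rule in rules}
```

Because each rule is an independent predicate, the rule set can hold one rule or many, in line with the description above.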
And S120, performing data quality detection on the current model task data based on the target data quality access control rule, and determining a target data quality detection result corresponding to the current model task data.
The target data quality detection result can be used to indicate whether the current model task data satisfies the target data quality access control rule. If the current model task data does not satisfy the rule, inputting it into the delivered model may cause a model running error, such as a failed run, a timed-out run, or a wrong result given by the model.
Specifically, data quality detection is performed on the current model task data based on the target data quality access control rule, and the access control rules that the current model task data does not satisfy are determined. The risk probability of the delivered model executing the current model task data is determined based on the rule types of the unsatisfied access control rules, and the target data quality detection result corresponding to the current model task data is determined based on the risk probability. Further, if the risk probability is smaller than a preset probability threshold, the target data quality detection result can be determined to be low risk.
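The source does not say how the risk probability is derived from the rule types of the unsatisfied rules. Purely as an assumption for illustration, the sketch below gives each rule type a hypothetical risk weight and combines the weights as if the failures were independent; the weights, the threshold value, and the `"high_risk"` label are all invented for the example.

```python
from typing import Dict, Iterable

# Hypothetical per-rule-type risk weights; the source gives no concrete values.
RISK_BY_RULE_TYPE: Dict[str, float] = {
    "data_type": 0.9,
    "data_volume": 0.5,
    "data_value": 0.3,
    "null_rate": 0.2,
}


def risk_probability(failed_rule_types: Iterable[str]) -> float:
    """Assumed combination: treat failures as independent, so risk = 1 - prod(1 - p)."""
    no_error = 1.0
    for rule_type in failed_rule_types:
        no_error *= 1.0 - RISK_BY_RULE_TYPE[rule_type]
    return 1.0 - no_error


def detection_result(failed_rule_types: Iterable[str], threshold: float = 0.25) -> str:
    # Below the preset probability threshold, the result is treated as low risk.
    if risk_probability(failed_rule_types) < threshold:
        return "low_risk"
    return "high_risk"  # hypothetical label for the complementary case
```

Any other monotone mapping from failed rule types to a probability would fit the description equally well; the independence assumption is just the simplest one to state.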
Illustratively, S120 may include: if the current model task data meets the target data quality access control rule, determining that the target data quality detection result is a valid data result; and if the task data of the current model does not meet the target data quality access control rule, determining that the target data quality detection result is an invalid data result.
Specifically, if the current model task data fails at least one access control rule in the target data quality access control rule, the target data quality detection result is determined to be an invalid data result. If the current model task data satisfies all access control rules in the target data quality access control rule, the target data quality detection result is determined to be a valid data result. The advantage is that the data to be input into the model is strictly controlled, which effectively reduces the failure rate and error rate of the delivered model in operation, reduces the maintenance cost and maintenance time when the model fails or reports an error, and further improves the effective utilization of the model.
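The all-or-nothing decision described above can be sketched in a few lines. This is a minimal illustration; the function name `detect`, the rule names, and the thresholds are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[dict]  # assumed layout: one dict per row of model task data


def detect(rows: Rows, rules: Dict[str, Callable[[Rows], bool]]) -> Tuple[str, List[str]]:
    """Return ("valid", []) if every rule passes, else ("invalid", names of failed rules)."""
    failed = [name for name, check in rules.items() if not check(rows)]
    return ("valid" if not failed else "invalid", failed)


# Hypothetical rule set: at least 2 rows, and at most half of the "score" values null.
example_rules = {
    "volume": lambda rows: len(rows) >= 2,
    "null_rate:score": lambda rows: sum(r.get("score") is None for r in rows) <= len(rows) / 2,
}

good = [{"score": 0.7}, {"score": 0.9}]
bad = [{"score": None}]

status_good, failed_good = detect(good, example_rules)
status_bad, failed_bad = detect(bad, example_rules)
```

Returning the names of the failed rules alongside the status keeps enough information to build the later reports and modification suggestions.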
S130, based on the target data quality detection result, controlling to execute the current model task corresponding to the current model task data.
Here, the current model task data is input into the delivered model to execute the current model task. Specifically, the current model task corresponding to the current model task data is selectively executed based on the target data quality detection result. If the target data quality detection result indicates that the current model task data does not meet the conditions for normal operation of the delivered model, data modification suggestion information is generated based on the target data quality detection result, and the target data quality detection result and the data modification suggestion information are sent to the user or department that provided the current model task data, so that the data can be modified and data quality detection can be performed again on the modified data. Furthermore, the tracing process can be standardized and the original data provider can be identified, so that the data can be modified precisely; this reduces the model's run failure rate while improving the efficiency and accuracy of data modification.
According to the technical scheme of this embodiment, a target data quality access control rule corresponding to the delivered model and current model task data to be executed are obtained, wherein the target data quality access control rule is set for at least one of data type, data volume, data value and data field null rate; data quality detection is performed on the current model task data based on the target data quality access control rule, and a target data quality detection result corresponding to the current model task data is determined; and execution of the current model task corresponding to the current model task data is controlled based on the target data quality detection result. In this way, quality detection of the data to be input into the model can be performed automatically, and the running efficiency and accuracy of the model are improved.
Based on the technical scheme, the method further comprises the following steps: before a target data quality access control rule corresponding to a delivered model is obtained, configuring a preset data quality access control rule corresponding to a model to be delivered; performing rule constraint verification on a preset data quality access control rule, and determining a target rule constraint verification result corresponding to the preset data quality access control rule; and on-line delivery control is carried out on the model to be delivered based on the target rule constraint verification result.
Preset data quality access control rules corresponding to the model to be delivered can be configured in the model development stage, and can also be configured before the testing stage (such as the grayscale test). A preset data quality access control rule may refer to an access control rule that is set according to service requirements and has not yet undergone rule verification. It can be set based on manual experience, or obtained by modifying historically configured rules. So that the configured access control rules better fit the model and the service needs of the users of the model, the preset data quality access control rules need to undergo rule verification. The advantage is that access control rules can be configured specifically for each model and each user, which strengthens the detection of data input into the model by the target data quality access control rule, further avoids model running errors, and improves model running accuracy. The target rule constraint verification result can indicate whether the preset data quality access control rule needs to be modified.
Specifically, before the target data quality access control rule corresponding to the delivered model is obtained, a preset data quality access control rule corresponding to the model to be delivered can be configured based on service requirements. Rule constraint verification is then performed on the preset data quality access control rule, and a target rule constraint verification result corresponding to it is determined. For example, rule constraint verification needs to be performed on the preset data quality access control rule configured (or modified) by a user, to verify whether the thresholds set in the rule satisfy the rule constraint conditions, such as the data type, the numerical upper limit threshold, and the data lower limit threshold. The rule constraint condition may be, but is not limited to, a data lower limit of the delivered model. Online delivery control is then performed on the model to be delivered based on the target rule constraint verification result.
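A rule constraint check of this kind can be sketched as a validation of the configured thresholds against allowed bounds. The classes `PresetRule` and `RuleConstraint`, their fields, and the example bounds are assumptions made for illustration; the source names only the kinds of constraint (data type, upper limit, lower limit).

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class PresetRule:
    """A user-configured rule whose thresholds have not yet been verified (hypothetical shape)."""
    name: str
    lower: Any
    upper: Any


@dataclass
class RuleConstraint:
    """Bounds that configured thresholds must respect (hypothetical shape)."""
    floor: float    # smallest lower threshold allowed, e.g. a data lower limit of the model
    ceiling: float  # largest upper threshold allowed


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def verify_rule_constraints(rule: PresetRule, constraint: RuleConstraint) -> List[str]:
    """Return a list of problems; an empty list means the preset rule passes verification."""
    if not (_is_number(rule.lower) and _is_number(rule.upper)):
        return [f"{rule.name}: thresholds must be numeric"]
    problems = []
    if rule.lower > rule.upper:
        problems.append(f"{rule.name}: lower threshold exceeds upper threshold")
    if rule.lower < constraint.floor:
        problems.append(f"{rule.name}: lower threshold is below the allowed floor")
    if rule.upper > constraint.ceiling:
        problems.append(f"{rule.name}: upper threshold is above the allowed ceiling")
    return problems


constraint = RuleConstraint(floor=0, ceiling=1_000_000)
ok = verify_rule_constraints(PresetRule("row_count", 100, 5000), constraint)
wrong_type = verify_rule_constraints(PresetRule("row_count", "100", 5000), constraint)
out_of_bounds = verify_rule_constraints(PresetRule("row_count", -5, 2_000_000), constraint)
```

A non-empty problem list would correspond to an invalid verification result, after which the preset rule is reconfigured and verified again.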
On the basis of the above technical scheme, "performing rule constraint verification on the preset data quality access control rule and determining a target rule constraint verification result corresponding to the preset data quality access control rule" may include: if the preset data quality access control rule satisfies the preset rule constraint condition, performing a grayscale test based on the model to be delivered and the preset data quality access control rule, and determining the target rule constraint verification result corresponding to the preset data quality access control rule; if the preset data quality access control rule does not satisfy the preset rule constraint condition, determining that the target rule constraint verification result corresponding to the preset data quality access control rule is an invalid verification result, and reconfiguring the preset data quality access control rule so that rule constraint verification is performed again.
The grayscale test may refer to a model test performed before the model is delivered online. For example, the grayscale test may be, but is not limited to, an internal test of the model. The grayscale test involves performing multiple types of tests on the model as a whole, such as tests of model functions or of access control rules. Correspondingly, after online delivery the model can be understood as being in public testing. The advantage is that the various parameters corresponding to the model, including its data quality access control rule, are completed first and the grayscale test is then performed on the model as a whole; this avoids the situation in which the data quality access control rule is set after testing and the model test and the access control rule test have to be repeated, and thus further improves model testing efficiency. At the same time, problems can be traced in a targeted manner at a stage before model delivery.
Specifically, if the preset data quality access control rule satisfies the preset rule constraint condition, a grayscale test of the model functions or the access control rules can be performed based on the model to be delivered and the preset data quality access control rule, and the target rule constraint verification result corresponding to the model to be delivered is determined; if the preset data quality access control rule does not satisfy the preset rule constraint condition, the target rule constraint verification result corresponding to the preset data quality access control rule is determined to be an invalid verification result, and the preset data quality access control rule is reconfigured so that rule constraint verification is performed again. The target rule constraint verification result can also include the test result of a model function or an access control rule.
On the basis of the above technical scheme, "performing online delivery control on the model to be delivered based on the target rule constraint verification result" may include: if the target rule constraint verification result is a valid verification result, binding the target data quality access control rule to the model to be delivered, applying to bring the bound model online, and determining the delivered model after it goes online; and if the target rule constraint verification result is an invalid verification result, reconfiguring the preset data quality access control rule and performing rule constraint verification again.
A valid verification result may mean that all test items of the grayscale test satisfy the corresponding test conditions. An invalid verification result may mean that at least one test item of the grayscale test, such as a model function or an access control rule, does not satisfy its test condition.
Specifically, if the target rule constraint verification result is a valid verification result, the target data quality access control rule is bound to the model to be delivered, an application is made to bring the bound model online, and the delivered model is determined after it goes online. If the target rule constraint verification result is an invalid verification result, the modification approach is determined based on the invalid verification result. If the invalid verification result shows that one of the model functions is invalid or wrong, the parameters of that model function are adjusted, and the preset data quality access control rule is reconfigured so that rule constraint verification is performed again. If the invalid verification result shows that one of the access control rules is invalid or wrong, the parameters in the model do not need to be modified; only the preset data quality access control rule is reconfigured and tested again.
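The delivery control branches described above can be summarized as a small decision function. This is a sketch under assumptions: the `Action` enum, the function `delivery_actions`, and the test-item labels `"model_function"` and `"access_control_rule"` are hypothetical names, not terms defined by the source.

```python
from enum import Enum
from typing import Iterable, List


class Action(Enum):
    RECONFIGURE_RULES = "reconfigure the preset rules and verify them again"
    ADJUST_MODEL = "adjust the parameters of the failing model function"
    BIND_AND_GO_ONLINE = "bind the rules to the model and apply to go online"


def delivery_actions(constraints_ok: bool, failed_test_items: Iterable[str]) -> List[Action]:
    """Map a verification outcome to follow-up actions.

    constraints_ok: whether the preset rules satisfied the rule constraint conditions.
    failed_test_items: grayscale test items that failed, using the assumed labels
    "model_function" and "access_control_rule".
    """
    if not constraints_ok:
        # The grayscale test is never reached; only the rules need rework.
        return [Action.RECONFIGURE_RULES]
    failed = set(failed_test_items)
    if not failed:
        return [Action.BIND_AND_GO_ONLINE]
    actions: List[Action] = []
    if "model_function" in failed:
        actions.append(Action.ADJUST_MODEL)
    # Both failure kinds end with reconfiguring the rules and verifying them again.
    actions.append(Action.RECONFIGURE_RULES)
    return actions
```

When only an access control rule fails, the function returns the reconfiguration action alone, reflecting that the model parameters are left untouched in that case.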
Example 2
Fig. 2 is a flowchart of a data quality detection method according to a second embodiment of the present invention, where a process of controlling execution of a current model task is described in detail on the basis of the foregoing embodiment. Wherein the explanation of the same or corresponding terms as those of the above embodiments is not repeated herein. As shown in fig. 2, the method includes:
S210, acquiring a target data quality access control rule corresponding to the delivered model and current model task data to be executed, wherein the target data quality access control rule is set for at least one of a data type, a data volume, a data value and a data field null rate.
S220, performing data quality detection on the current model task data based on the target data quality access control rule, and determining a target data quality detection result corresponding to the current model task data.
S230, if the target data quality detection result is a valid data result, the current model task data is input into the delivered model to execute the current model task.
The valid data result can represent that the model error risk does not exist or is within an allowable range when the data is input into the model. Specifically, if the target data quality detection result is a valid data result, the current model task data is input into the delivered model to execute the current model task, so that the model is ensured to run in a controllable range through the detection of the current model task data, the situation of model errors is further reduced, and the model running efficiency and accuracy are improved.
And S240, if the target data quality detection result is an invalid data result, controlling to execute the current model task corresponding to the current model task data based on the rule type of the target data quality access control rule.
Here, rule types may be used to distinguish rules of different natures. For example, rule types may be divided into mandatory rules and non-mandatory rules. Specifically, if the target data quality detection result is an invalid data result, execution of the current model task corresponding to the current model task data is controlled based on the rule type of the target data quality access control rule. If the invalid data in the invalid data result violates only non-mandatory rules, the current model task data can still be input into the delivered model to execute the current model task. If at least one item of invalid data in the invalid data result violates a mandatory rule, that invalid data needs to be modified, and the correctly modified data together with the rest of the current model task data is then input into the delivered model to execute the current model task.
Illustratively, S240 may include: if the rule type of the target data quality access control rule is a preset strong rule, blocking execution of the current model task, and generating and displaying a blocking report; if the rule type of the target data quality access control rule is a preset weak rule, inputting the current model task data into the delivered model to execute the current model task, and generating and displaying a pseudo-execution report.
The preset strong rule may be understood as a mandatory rule. The preset weak rule may be understood as a non-mandatory rule. The blocking report may include the data corresponding to the preset strong rule, data source information, and data modification instructions. The pseudo-execution report may include the data corresponding to the preset weak rule, the preset weak rule itself, and the data source information. Data tracing can be performed based on the data source information, and the data source is notified to modify the data based on the rules in the report.
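The strong/weak branching can be sketched as follows. This is a minimal illustration, assuming the failed rules have already been identified; the classes `FailedRule` and `Outcome`, the function `control_execution`, and the report wording are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class FailedRule:
    name: str
    strong: bool      # True for a preset strong (mandatory) rule
    data_source: str  # where the offending data came from, used for tracing


@dataclass
class Outcome:
    executed: bool
    report_kind: Optional[str]  # "blocking", "pseudo-execution", or None
    report_lines: List[str] = field(default_factory=list)
    model_output: Any = None


def control_execution(rows: list, failed: List[FailedRule],
                      model: Callable[[list], Any]) -> Outcome:
    if not failed:
        # Valid data result: run the model task with no report.
        return Outcome(True, None, [], model(rows))
    lines = [f"{r.name} violated (source: {r.data_source})" for r in failed]
    if any(r.strong for r in failed):
        # A strong rule failed: block the task and report instead of running the model.
        return Outcome(False, "blocking", lines)
    # Only weak rules failed: run the model anyway and attach a pseudo-execution report.
    return Outcome(True, "pseudo-execution", lines, model(rows))


rows = [1, 2, 3]
weak_only = control_execution(rows, [FailedRule("null_rate", False, "upstream_a")], sum)
blocked = control_execution(rows, [FailedRule("volume", True, "upstream_b")], sum)
clean = control_execution(rows, [], sum)
```

In this sketch a single strong-rule failure is enough to block the task even if weak rules also failed, which is one reading of the description above.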
According to the technical scheme, if the target data quality detection result is a valid data result, the current model task data is input into the delivered model to execute the current model task; if the target data quality detection result is an invalid data result, the execution of the current model task corresponding to the current model task data is controlled based on the rule type of the target data quality access control rule, so that the model is ensured to run in a controllable range through the detection of the current model task data, the error condition of the model is further reduced, and the running efficiency and accuracy of the model are improved.
It should be noted that, in some practical scenarios, a data analysis and mining platform serves data analysts by providing one-stop model services such as data access, data preprocessing, model development, model training, model deployment, and model inference. The model itself can be regarded approximately as an algorithm, and the input to the model is model task data, which runs through the entire life cycle of the model. Upstream of this platform is typically a big data platform or another system that serves as a data source. The platform may pull (or synchronize) data from upstream periodically or on demand. The platform desensitizes the acquired data, processes the data or trains the model, publishes and deploys the model after development is completed, schedules model tasks at set times, and sends the generated results downstream.
Monitoring and operation of the whole platform and its model services are therefore particularly important. The scheme in the embodiments of the invention can be applied to monitoring and operation of the platform and the model services: configuring target data quality access control rules for the model makes it easier to locate and investigate problems quickly, clarifies where operation and maintenance responsibilities lie, and saves resources and time costs, while incorporating technical ideas such as proactive operation and maintenance and integrated artificial intelligence research, development and operation (Model/MLOps).
Integrated artificial intelligence research, development and operation (Model/MLOps) may refer to the unified management of stages such as requirements, development, testing, integration, deployment, and operation in the research, development and operation process of an artificial intelligence (AI) software project, so as to achieve rapid iteration and effective linkage of continuous training, continuous integration, continuous delivery, and continuous monitoring of the model, and to deliver high-quality AI model inference services efficiently, thereby helping users improve AI research, development and operation efficiency and advancing their intelligent transformation.
Specifically, for example, the current model task data is pulled (or synchronized) from an upstream data source to the platform. When the delivered model is scheduled (called), the preset available threshold corresponding to each access control rule is determined from the preconfigured target data quality access control rule. The current model task data is then checked by comparing it with the preset available thresholds, and a comparison result is determined. Whether to block or run the model is decided according to the comparison result (whether the target data quality access control rule is satisfied) and the strong/weak rule option. In this way, the target data quality access control rule, configured in the stages before data access and model operation, is used to detect the data quality of the current model task data to be executed, which is equivalent to setting up a checkpoint that separates the location and investigation of data problems from that of model problems. If the data does not pass the target data quality access control rule, the data problem can be investigated. If the data passes the target data quality access control rule and the model still reports an error, the investigation can focus on the model itself and the platform service. The invention therefore makes it easier to locate and investigate problems, clarifies operation and maintenance responsibilities, and saves resources and time costs.
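The scheduled flow described above (pull data, look up thresholds, compare, then block or run) can be sketched end to end. Everything named here is an assumption for illustration: the `Gate` layout, the `null_rate` helper, `run_scheduled_task`, and the example thresholds are hypothetical and not taken from the source.

```python
from typing import Any, Callable, Dict, List, Tuple

Rows = List[dict]

# Assumed gate layout: rule name -> (metric function, lower bound, upper bound, is_strong)
Gate = Dict[str, Tuple[Callable[[Rows], float], float, float, bool]]


def null_rate(field: str) -> Callable[[Rows], float]:
    # Build a metric that returns the share of rows where `field` is missing or None.
    def metric(rows: Rows) -> float:
        return sum(r.get(field) is None for r in rows) / len(rows) if rows else 0.0
    return metric


def run_scheduled_task(pull: Callable[[], Rows], model: Callable[[Rows], Any],
                       gate: Gate) -> Tuple[Any, List[str]]:
    """Pull data, compare each metric with its thresholds, then block or run the model."""
    rows = pull()  # stands in for pulling or synchronizing from the upstream data source
    violations: List[str] = []
    blocked = False
    for name, (metric, low, high, strong) in gate.items():
        value = metric(rows)
        if not low <= value <= high:
            violations.append(f"{name}={value:.3g} outside [{low}, {high}]")
            blocked = blocked or strong  # only a strong rule blocks the run
    output = None if blocked else model(rows)
    return output, violations


def pull_example() -> Rows:
    return [{"amount": 10.0}, {"amount": None}, {"amount": 7.5}]


def count_model(rows: Rows) -> int:
    return len(rows)  # placeholder for the delivered model


strong_gate: Gate = {"row_count": (len, 5, 100, True)}
weak_gate: Gate = {"null_rate:amount": (null_rate("amount"), 0.0, 0.1, False)}

blocked_output, blocked_violations = run_scheduled_task(pull_example, count_model, strong_gate)
weak_output, weak_violations = run_scheduled_task(pull_example, count_model, weak_gate)
```

If the gate reports no violations and the model still fails, the data has already been ruled out, so the investigation can start from the model and the platform service.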
The following is an embodiment of a data quality detection apparatus provided in the embodiment of the present invention, which belongs to the same inventive concept as the data quality detection method of the above embodiments, and reference may be made to the embodiment of the data quality detection method for details that are not described in detail in the embodiment of the data quality detection apparatus.
Example III
Fig. 3 is a schematic structural diagram of a data quality detecting device according to a third embodiment of the present invention. As shown in fig. 3, the apparatus includes: a data acquisition module 310, a target data quality detection result determination module 320, and a current model task execution module 330.
The data acquisition module 310 is configured to acquire a target data quality access control rule corresponding to the delivered model and current model task data to be executed, where the target data quality access control rule is set for at least one of a data type, a data amount, a data value, and a data field null rate; the target data quality detection result determining module 320 is configured to perform data quality detection on the current model task data based on a target data quality access control rule, and determine a target data quality detection result corresponding to the current model task data; the current model task execution module 330 is configured to control execution of a current model task corresponding to the current model task data based on the target data quality detection result.
According to the technical scheme, the target data quality access control rule corresponding to the delivered model and the current model task data to be executed are obtained, wherein the target data quality access control rule is set for at least one of data types, data amounts, data values and data field null rate; performing data quality detection on the current model task data based on the target data quality access control rule, and determining a target data quality detection result corresponding to the current model task data; based on the quality detection result of the target data, the current model task corresponding to the current model task data is controlled to be executed, so that the quality detection of the data in the model to be input can be automatically performed, and the running efficiency and accuracy of the model are improved.
Illustratively, the target data quality detection result determination module 320 is specifically configured to: if the current model task data meets the target data quality access control rule, determining that the target data quality detection result is a valid data result; and if the task data of the current model does not meet the target data quality access control rule, determining that the target data quality detection result is an invalid data result.
Illustratively, the current model task execution module 330 may include:
the current model task execution sub-module is used for inputting the current model task data into the delivered model to execute the current model task if the target data quality detection result is a valid data result;
And the current model task control sub-module is used for controlling and executing the current model task corresponding to the current model task data based on the rule type of the target data quality access control rule if the target data quality detection result is an invalid data result.
Illustratively, the current model task control submodule is specifically configured to: if the rule type of the target data quality access control rule is a preset strong rule, blocking execution of a current model task, and generating and displaying a blocking report; if the rule type of the target data quality access control rule is a preset weak rule, the current model task data is input into the delivered model to execute the current model task, and a pseudo-execution report is generated and displayed.
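The branching performed by the current model task execution module can be sketched as follows. This is a hedged illustration only: the `TaskOutcome` record, the `"strong"`/`"weak"` string labels, the report texts and the `run_model` callable are hypothetical stand-ins for the delivered model and the reports mentioned above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class TaskOutcome:
    """Hypothetical result record for one scheduled model task."""
    executed: bool
    report: Optional[str]
    output: Any = None


def control_model_task(data: Any, is_valid: bool, rule_type: str,
                       run_model: Callable[[Any], Any]) -> TaskOutcome:
    """Decide whether to run or block the task from the detection result."""
    if is_valid:
        # Valid data result: feed the data to the delivered model.
        return TaskOutcome(True, None, run_model(data))
    if rule_type == "strong":
        # Preset strong rule: block execution and produce a blocking report.
        return TaskOutcome(False, "blocking report: strong rule not met")
    if rule_type == "weak":
        # Preset weak rule: still run the model, but flag the run.
        return TaskOutcome(True, "pseudo-execution report: weak rule not met",
                           run_model(data))
    raise ValueError(f"unknown rule type: {rule_type}")


def demo_model(batch):
    """Stand-in for the delivered model: it simply counts the records."""
    return len(batch)


blocked = control_model_task([1, 2, 3], False, "strong", demo_model)
flagged = control_model_task([1, 2, 3], False, "weak", demo_model)
clean = control_model_task([1, 2, 3], True, "strong", demo_model)
print(blocked.executed, flagged.executed, clean.executed)
```

Keeping the model behind a callable means the blocking branch never touches the model at all, which matches the idea of a checkpoint placed before model operation.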
Illustratively, the apparatus further comprises:
The preset data quality access control rule configuration module is used for configuring preset data quality access control rules corresponding to the model to be delivered before acquiring target data quality access control rules corresponding to the delivered model;
The target rule constraint verification result determining module is used for performing rule constraint verification on the preset data quality access control rule and determining a target rule constraint verification result corresponding to the preset data quality access control rule;
and the online delivery control module is used for carrying out online delivery control on the model to be delivered based on the target rule constraint verification result.
Illustratively, the target rule constraint verification result determining module is specifically configured to: if the preset data quality access control rule meets the preset rule constraint condition, perform a grayscale (gray release) test based on the model to be delivered and the preset data quality access control rule, and determine a target rule constraint verification result corresponding to the preset data quality access control rule; if the preset data quality access control rule does not meet the preset rule constraint condition, determine that the target rule constraint verification result corresponding to the preset data quality access control rule is an invalid verification result, and reconfigure the preset data quality access control rule so that the rule constraint verification is performed again.
Illustratively, the online delivery control module is specifically configured to: if the target rule constraint verification result is an effective verification result, binding the target data quality access control rule with the model to be delivered, applying for the bound model to be online, and determining the delivered model after online; and if the target rule constraint checking result is an invalid checking result, reconfiguring the preset data quality access control rule and performing rule constraint checking again.
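The configure, verify and deliver loop handled by these three modules can be sketched in Python as below. The sketch rests on several assumptions that the source does not specify: each reconfiguration attempt is modelled as the next item in a list, and the constraint condition and the gray level (grayscale) test are passed in as hypothetical callables with made-up pass criteria.

```python
def deliver_model(model_id, candidate_rules, meets_constraints, grayscale_test):
    """Return the binding for the first rule configuration that passes both
    the rule constraint check and the grayscale test, or None if none does."""
    for rules in candidate_rules:  # each item is one (re)configuration attempt
        if not meets_constraints(rules):
            continue  # invalid verification result: reconfigure and check again
        if grayscale_test(model_id, rules):
            # Valid verification result: bind the rules to the model and go online.
            return {"model": model_id, "rules": rules, "status": "online"}
    return None


def threshold_in_range(rules):
    """Assumed constraint: a null-rate threshold must lie between 0 and 1."""
    return 0.0 <= rules["null_rate_max"] <= 1.0


def strict_enough(model_id, rules):
    """Assumed grayscale criterion: the trial run accepts thresholds up to 0.5."""
    return rules["null_rate_max"] <= 0.5


attempts = [
    {"null_rate_max": 1.5},  # fails the constraint check
    {"null_rate_max": 0.9},  # passes the constraint check, fails the grayscale test
    {"null_rate_max": 0.2},  # passes both
]
binding = deliver_model("model-a", attempts, threshold_in_range, strict_enough)
print(binding)
```

Returning `None` when every attempt fails leaves the decision about further reconfiguration to the caller, since the source does not state how many attempts are allowed.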
The data quality detection device provided by the embodiment of the invention can execute the data quality detection method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the data quality detection method.
It should be noted that, in the above embodiment of data quality detection, each unit and module included are only divided according to the functional logic, but not limited to the above division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Example IV
Fig. 4 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the data quality detection method.
In some embodiments, the data quality detection method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the data quality detection method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the data quality detection method in any other suitable way (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for detecting data quality, comprising:
Acquiring a target data quality access control rule corresponding to a delivered model and current model task data to be executed, wherein the target data quality access control rule is set for at least one of a data type, a data volume, a data value and a data field null rate;
Performing data quality detection on the current model task data based on the target data quality access control rule, and determining a target data quality detection result corresponding to the current model task data;
And controlling and executing the current model task corresponding to the current model task data based on the target data quality detection result.
2. The method according to claim 1, wherein the performing data quality detection on the current model task data based on the target data quality access control rule, and determining a target data quality detection result corresponding to the current model task data, includes:
if the current model task data meets the target data quality access control rule, determining that a target data quality detection result is a valid data result;
And if the current model task data does not meet the target data quality access control rule, determining that the target data quality detection result is an invalid data result.
3. The method according to claim 1, wherein controlling execution of the current model task corresponding to the current model task data based on the target data quality detection result includes:
if the target data quality detection result is a valid data result, inputting the current model task data into the delivered model to execute a current model task;
And if the target data quality detection result is an invalid data result, controlling to execute the current model task corresponding to the current model task data based on the rule type of the target data quality access control rule.
4. The method according to claim 3, wherein the controlling the execution of the current model task corresponding to the current model task data based on the rule type to which the target data quality access control rule belongs includes:
If the rule type of the target data quality access control rule is a preset strong rule, blocking execution of a current model task, and generating and displaying a blocking report;
and if the rule type of the target data quality access control rule is a preset weak rule, inputting the current model task data into the delivered model to execute the current model task, and generating and displaying a pseudo-execution report.
5. The method of claim 1, wherein prior to obtaining the target data quality access control rule corresponding to the delivered model, the method further comprises:
Configuring a preset data quality access control rule corresponding to a model to be delivered;
Performing rule constraint verification on the preset data quality access control rule, and determining a target rule constraint verification result corresponding to the preset data quality access control rule;
And on the basis of the target rule constraint verification result, performing online delivery control on the model to be delivered.
6. The method of claim 5, wherein the performing rule constraint verification on the preset data quality access control rule to determine a target rule constraint verification result corresponding to the preset data quality access control rule comprises:
If the preset data quality access control rule meets the preset rule constraint condition, performing a grayscale (gray release) test based on the model to be delivered and the preset data quality access control rule, and determining a target rule constraint verification result corresponding to the preset data quality access control rule;
If the preset data quality access control rule does not meet the preset rule constraint condition, determining that a target rule constraint check result corresponding to the preset data quality access control rule is an invalid check result, and re-configuring the preset data quality access control rule to perform rule constraint check again.
7. The method of claim 5, wherein the performing online delivery control of the model to be delivered based on the target rule constraint verification result comprises:
if the target rule constraint verification result is an effective verification result, binding a target data quality access control rule with a model to be delivered, applying for the bound model to be online, and determining the delivered model after online;
And if the target rule constraint checking result is an invalid checking result, reconfiguring a preset data quality access control rule and performing rule constraint checking again.
8. A data quality detection apparatus, the apparatus comprising:
The data acquisition module is used for acquiring target data quality access control rules corresponding to the delivered model and current model task data to be executed, wherein the target data quality access control rules are set for at least one of data types, data amounts, data values and data field null rates;
The target data quality detection result determining module is used for detecting the data quality of the current model task data based on the target data quality access control rule and determining a target data quality detection result corresponding to the current model task data;
and the current model task execution module is used for controlling and executing the current model task corresponding to the current model task data based on the target data quality detection result.
9. An electronic device, the electronic device comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data quality detection method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements a data quality detection method as claimed in any one of claims 1-7.
CN202311863909.5A 2023-12-29 2023-12-29 Data quality detection method and device, electronic equipment and storage medium Pending CN118093368A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311863909.5A CN118093368A (en) 2023-12-29 2023-12-29 Data quality detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN118093368A true CN118093368A (en) 2024-05-28

Family

ID=91160788

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311863909.5A Pending CN118093368A (en) 2023-12-29 2023-12-29 Data quality detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN118093368A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination