WO2024020898A1 - Data error detection method, apparatus, electronic device, and storage medium - Google Patents
Data error detection method, apparatus, electronic device, and storage medium
- Publication number
- WO2024020898A1 (PCT/CN2022/108403)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- data type
- structure file
- test
- type
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L27/00—Modulated-carrier systems
Definitions
- the present invention relates to the field of data processing technology, in particular to data error detection methods, devices, electronic equipment and storage media.
- Data quality issues may run through all service levels of the entire digital construction. For example, data quality problems may be encountered in many scenarios such as system integration, application transplantation, on-site data collection equipment failure, format conversion caused by dependency library updates, expansion and development of new functions, etc.
- the embodiments of the present invention provide data error detection methods, devices, electronic equipment and storage media.
- a data error detection method includes:
- the embodiment of the present invention determines the structure file used as the data comparison target based on the data type in the template data and the statistical characteristics of the data type. By comparing the test data with the structure file, it can quickly determine whether the test data is in error and thereby improve error detection efficiency.
- determining whether the test data is in error based on a result of the comparison includes at least one of the following:
- when the data type contained in the test data does not match the data type in the structure file, it is determined that the test data is in error;
- when the value of the data type in the test data conforms to the statistical characteristics in the structure file and the data type contained in the test data matches the data type in the structure file, it is determined that the test data is normal.
- the embodiment of the present invention can detect errors in a variety of ways, thereby improving applicability.
- it also includes:
- when the test data is erroneous, performing a correction operation on the test data based on the structure file;
- the embodiment of the present invention can correct erroneous test data and improve data quality. Moreover, by recording data error events and correction operations in log files, data quality logging is achieved, making it easier for subsequent processors to understand the data status in a timely manner when taking over, integrating or migrating data.
- performing a correction operation on the test data based on the structure file includes at least one of the following:
- the data type in the test data is corrected.
- the embodiment of the present invention can perform various types of accurate corrections on erroneous data, thereby improving data quality.
- determining the data types contained in the template data includes:
- the first data type and the second data type are determined as data types included in the template data.
- the embodiment of the present invention can extract the data type of the template data and the data type of the object in the template data, thereby improving the richness of the data type.
- it also includes:
- the data type in the structure file and/or the statistical characteristics in the structure file are adjusted based on the structure file adjustment instruction.
- the embodiment of the present invention can adjust the structure file based on user-triggered operations, enriching the flexibility of data processing.
- the statistical characteristics include at least one of the following:
- the maximum length of the data type; the minimum length of the data type; the maximum value of the data type; the minimum value of the data type; the average value of the data type; the maximum time interval of the data type; the minimum time interval of the data type; the average time interval of the data type.
- the template data is contained in an instance field of an application programming interface design file, or in a real-time file obtained via real-time access to the application programming interface.
- before determining the data type contained in the template data, the method further includes:
- a data processing device includes:
- the first determination module is configured to determine the data type contained in the template data
- the second determination module is configured to determine the statistical characteristics of the data type based on statistics of the values of the data type in the template data;
- the third determination module is configured to determine the structure file based on the data type and the statistical characteristics of the data type
- the receiving module is configured to receive test data
- a comparison module configured to compare the test data with the structure file
- a fourth determination module is configured to determine whether the test data is in error based on the result of the comparison.
- the embodiment of the present invention determines the structure file used as the data comparison target based on the data type in the template data and the statistical characteristics of the data type. By comparing the test data with the structure file, it can quickly determine whether the test data is in error and thereby improve error detection efficiency.
- the fourth determining module is configured to perform at least one of the following:
- when the data type contained in the test data does not match the data type in the structure file, it is determined that the test data is in error;
- when the value of the data type in the test data conforms to the statistical characteristics in the structure file and the data type contained in the test data matches the data type in the structure file, it is determined that the test data is normal.
- the embodiment of the present invention can detect errors in a variety of ways, thereby improving applicability.
- it also includes:
- a correction module configured to perform a correction operation on the test data based on the structure file when the test data has an error; record a data error event and the correction operation in a log file.
- the embodiment of the present invention can correct erroneous test data and improve data quality. Moreover, by recording data error events and correction operations in log files, document logging of data quality is achieved, allowing subsequent processors to understand the data status in a timely manner when taking over, integrating or migrating data.
- the correction module is configured to perform at least one of the following:
- the data type in the test data is corrected.
- the embodiment of the present invention can perform various types of accurate corrections on erroneous data, thereby improving data quality.
- the first determination module is configured to determine a first data type in the template data and an object in the template data; determine a second data type contained in the object; and determine the first data type and the second data type as the data types included in the template data.
- the embodiment of the present invention can extract the data type of the template data and the data type of the object in the template data, thereby improving the richness of the data type.
- it also includes:
- an adjustment module configured to display the structure file on a human-computer interaction interface; receive a structure file adjustment instruction triggered via the human-computer interaction interface; and adjust the data type in the structure file and/or the statistical characteristics in the structure file based on the structure file adjustment instruction.
- the embodiment of the present invention can adjust the structure file based on user-triggered operations, enriching the flexibility of data processing.
- An electronic device including:
- the processor is configured to read the executable instructions from the memory and execute the executable instructions to implement the data error detection method as described in any one of the above.
- a computer-readable storage medium on which computer instructions are stored.
- when the computer instructions are executed by a processor, the data error detection method as described in any one of the above items is implemented.
- a computer program product includes a computer program that implements the data error detection method as described in any of the above items when executed by a processor.
- FIG. 1 is an exemplary flow chart of a data error detection method according to an embodiment of the present invention.
- Figure 2 is a flow chart of a method for determining a structure file according to an embodiment of the present invention.
- FIG. 3 is a schematic diagram of an exemplary process of determining a structured file using a real-time file according to an embodiment of the present invention.
- Figure 4 is an exemplary schematic diagram of a data correction process according to an embodiment of the present invention.
- FIG. 5 is an exemplary schematic diagram of a process of generating a structure file and performing data correction according to an embodiment of the present invention.
- FIG. 6 is an exemplary structural diagram of a data error detection device according to an embodiment of the present invention.
- FIG. 7 is an exemplary structural diagram of a data error detection device with a processor-memory architecture according to an embodiment of the present invention.
- a unified and scalable solution is provided for flexibly implementing data quality management (including error detection and correction).
- the embodiments of the present invention can be integrated into systems, clouds, platform tools or plug-ins, etc., which can make data quality management easier and more efficient, and can especially effectively support industrial data services and applications.
- FIG. 1 is an exemplary flow chart of a data error detection method according to an embodiment of the present invention. As shown in Figure 1, the method 100 includes:
- Step 101 Determine the data type contained in the template data.
- template data is reference data used to perform data error detection.
- template data can be implemented as data contained in instance fields of an application programming interface (API) design file.
- Template data can also be implemented as data contained in a real-time file, where the real-time file is obtained via real-time access to the API interface.
- API design files are common.
- API design files usually contain various instance fields.
- the data in these instance fields can be used as reference data in the data error detection process, that is, template data.
- API design files are usually static files.
- API design files can be obtained from various data sources (for example, various development documents).
- the format of the API design file can include JSON, YML, etc., but is not limited to these formats.
- API design files can be exported from some API design and testing tools such as Postman or OpenAPI, etc.
- the API interface can also be used as reference data in the data error detection process, that is, template data.
- the system protocols for accessing the API interface may include Hypertext Transfer Protocol (HTTP), Hypertext Transfer Protocol over Secure Sockets Layer (HTTPS), the Message Queuing Telemetry Transport (MQTT) protocol, etc., but are not limited to these protocols, and protocol extensions are supported.
- the template data can be converted into a unified format, such as JSON format, to facilitate subsequent unified processing.
- template data is obtained from the instance field of the API design file, and template data in the real-time file is obtained through real-time access to the API interface. The two paths of template data are merged into the overall template data, and the merged template data is then converted into JSON format to facilitate subsequent unified processing.
- step 101 includes: determining the first data type in the template data and the object in the template data; determining the second data type contained in the object; and determining the first data type and the second data type as the data types contained in the template data.
- When the object also contains sub-objects, the data types contained in the sub-objects can be further determined and included among the data types of the template data.
- data types include: numerical type (such as integer type or floating point type); character type; string type; Boolean type; queue (array) type; time and date type, etc.
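The recursive type extraction of step 101 can be sketched as follows. This is a minimal illustration, assuming the template data has already been converted to JSON. The function names `classify` and `extract_types`, the dotted path keys, and the timestamp layout used to recognize the time and date type are hypothetical choices and are not taken from the patent text.

```python
import json
from datetime import datetime


def classify(value):
    """Map a parsed JSON value to one of the data types named in the text."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        try:
            # Assumption: timestamps look like "2020-07-16 01:00:04".
            datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
            return "datetime"
        except ValueError:
            return "string"
    return "null"


def extract_types(node, path="", found=None):
    """Walk objects, sub-objects and arrays, recording the type at each path."""
    found = {} if found is None else found
    if isinstance(node, dict):
        for key, child in node.items():
            child_path = f"{path}.{key}" if path else key
            found[child_path] = classify(child)
            extract_types(child, child_path, found)
    elif isinstance(node, list):
        for item in node:
            extract_types(item, path + "[]", found)
    return found


template = json.loads(
    '{"timeseries": [{"timestamp": "2020-07-16 01:00:04", "vibmax": 72.0}]}'
)
types = extract_types(template)
```

Here `types` maps `timeseries` to `array` and the two nested fields to `datetime` and `number`, which mirrors the idea of collecting the types of objects and their sub-objects into one set of data types.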
- Step 102 Determine the statistical characteristics of the data type based on counting the values of the data type in the template data.
- the values of each data type in the template data are parsed to determine the statistical characteristics of the data type.
- the statistical characteristics may include at least one of the following: the maximum length of the data type; the minimum length of the data type; the maximum value of the data type; the minimum value of the data type; the average value of the data type; the maximum time interval of the data type; the minimum time interval of the data type; the average time interval of the data type, etc.
- the template data contains a time series object, and the time series object contains a queue type.
- the queue type contains multiple objects with their own data types, namely timestamp (timestamp), vibration extreme value (vibmax), vibration mean (vibmean), vibration effective value (vibrms), vibration skewness (vibskewness), vibration kurtosis (vibkurtosis), temperature (temperature) and humidity (humidity).
- the data type of the timestamp is time and date type (data time); the data type of vibration extreme value, vibration mean value, vibration effective value, vibration skewness, vibration kurtosis, temperature and humidity is numerical type (number).
- the template data also contains values for each type in the queue. The statistical characteristics of each data type can be statistically calculated based on the value of each data type.
- the values of the timestamps, representing the time points at which multiple measurement tests are performed, are: 2020-07-16 01:00:04, 2020-07-16 01:00:10, 2020-07-16 01:00:16, 2020-07-16 01:00:22 and 2020-07-16 01:00:28.
- the timestamp interval between consecutive measurement tests is 6 seconds. Therefore, the statistical characteristics of the timestamp are: the maximum time interval of the timestamp data is 6 seconds; the minimum time interval of the timestamp data is 6 seconds; and the average time interval of the timestamp data is 6 seconds.
- the multiple vibration extreme values (data type is number) representing multiple measurement tests are: 72.0, 80.0, 82.0, 78.0 and 79.0 respectively.
- the statistical characteristics can be obtained as follows: the maximum value of the vibration extreme value is 82.0; the minimum value of the vibration extreme value is 72.0; and the average value of the vibration extreme value is 78.2.
- the statistical characteristics of each data type in the template data can be calculated in the same way. In general, the more statistical characteristics are recorded for a data type, the higher the error correction efficiency for that data type.
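The worked example above can be reproduced with a short sketch. It uses the timestamp and vibration extreme values quoted in the text. The helper names `numeric_stats` and `interval_stats` and the key `meanInterval` are hypothetical; `maxInterval` and `minInterval` follow the names used later for the time and date type.

```python
from datetime import datetime
from statistics import mean

timestamps = [
    "2020-07-16 01:00:04",
    "2020-07-16 01:00:10",
    "2020-07-16 01:00:16",
    "2020-07-16 01:00:22",
    "2020-07-16 01:00:28",
]
vibmax = [72.0, 80.0, 82.0, 78.0, 79.0]


def numeric_stats(values):
    """Maximum, minimum and average of a numeric data type."""
    return {"max": max(values), "min": min(values), "mean": mean(values)}


def interval_stats(stamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Largest, smallest and average gap (in seconds) between consecutive stamps."""
    times = [datetime.strptime(s, fmt) for s in stamps]
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
    return {
        "maxInterval": max(gaps),
        "minInterval": min(gaps),
        "meanInterval": mean(gaps),
    }


vib_stats = numeric_stats(vibmax)
time_stats = interval_stats(timestamps)
```

With these inputs every interval is 6 seconds, and the vibration extreme value has a maximum of 82.0, a minimum of 72.0 and an average of 78.2.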
- Step 103 Determine the structure file based on the data type and the statistical characteristics of the data type.
- a file containing all correspondences between data types and their statistical characteristics is a structure file.
- a time series contains a queue, and the queue contains multiple objects with their own data types.
- the data types in the queue include the time and date type of the timestamp, as well as the numeric type of the vibration extreme value, vibration mean value, vibration effective value, vibration skewness, vibration kurtosis, temperature, and humidity.
- each data type in the queue has its own statistical characteristics.
- Each data type in the queue is stored in association with its respective statistical characteristics to obtain a structure file.
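One way to picture the structure file of step 103 is a JSON document that stores each data type next to its statistical characteristics. The sketch below is only an assumed layout: the patent does not specify the file schema, so the keys `type` and `stats`, the file name `structure.json`, and the hand-written inputs are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical, hand-written results of steps 101 and 102 for two fields.
data_types = {"timestamp": "datetime", "vibmax": "number"}
stat_features = {
    "timestamp": {"maxInterval": 6.0, "minInterval": 6.0},
    "vibmax": {"max": 82.0, "min": 72.0, "mean": 78.2},
}


def build_structure(types, stats):
    """Store each data type in association with its statistical characteristics."""
    return {
        name: {"type": types[name], "stats": stats.get(name, {})}
        for name in types
    }


structure = build_structure(data_types, stat_features)

# Persist the structure so it can later serve as the comparison target.
path = os.path.join(tempfile.mkdtemp(), "structure.json")
with open(path, "w", encoding="utf-8") as handle:
    json.dump(structure, handle, indent=2)

with open(path, encoding="utf-8") as handle:
    reloaded = json.load(handle)
```

Keeping the type and its statistics under one key means a later comparison step only needs a single lookup per field.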
- After the structure file is determined in steps 101 to 103, the structure file can be used to perform data error detection and correction processes.
- Step 104 Receive test data.
- the test data is the object on which data error checking is performed.
- the test data can be the data output by the API interface corresponding to the API design file within the specified test time.
- the test data can be data output during the test time via an API interface that provides the real-time file.
- Step 105 Compare the test data with the structure file.
- comparing the test data with the structure file includes at least one of the following:
- Step 106 Based on the comparison result, determine whether the test data is in error.
- step 106 includes the following scenarios:
- when the data type contained in the test data does not match the data type in the structure file, the test data is deemed to be in error. For example, if the test data contains data type A, data type B, and data type C, but the structure file contains only data type A and data type B, the data types of the test data are deemed not to match, and the test data is deemed to be in error.
- when the value of the data type in the test data does not conform to the statistical characteristics in the structure file and the data type contained in the test data does not match the data type in the structure file, the test data is deemed to be in error.
- For example, the test data contains data type A, data type B, and data type C, and the value of data type A in the test data is 100. If the structure file contains only data type A and data type B, and the statistical characteristics of data type A in the structure file indicate a maximum value of 60, the test data is deemed to be in error.
- when the value of the data type in the test data conforms to the statistical characteristics in the structure file and the data type contained in the test data matches the data type in the structure file, the test data is determined to be normal.
- For example, the test data contains data type A, data type B and data type C, and the value of data type A in the test data is 60. If the structure file contains data type A, data type B, and data type C, and the statistical characteristics of data type A in the structure file indicate a maximum value of 100, the test data is considered normal.
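The scenarios of step 106 can be sketched as a small comparison function. This is an illustration under assumptions: the data types A, B and C are modeled as named fields, the structure file is modeled as a dictionary of optional `max` and `min` limits, and the function name `detect_errors` is hypothetical.

```python
def detect_errors(test_record, structure):
    """Return a list of problems; an empty list means the record is normal."""
    problems = []
    for name, value in test_record.items():
        rule = structure.get(name)
        if rule is None:
            # The test data contains a data type the structure file lacks.
            problems.append(f"{name}: data type not present in structure file")
            continue
        if "max" in rule and value > rule["max"]:
            problems.append(f"{name}: value {value} exceeds maximum {rule['max']}")
        if "min" in rule and value < rule["min"]:
            problems.append(f"{name}: value {value} is below minimum {rule['min']}")
    return problems


# Error scenario: C is unknown and A exceeds the recorded maximum of 60.
errors = detect_errors(
    {"A": 100, "B": 1, "C": 2},
    {"A": {"max": 60}, "B": {}},
)

# Normal scenario: all types are known and A is within the maximum of 100.
normal = detect_errors(
    {"A": 60, "B": 1, "C": 2},
    {"A": {"max": 100}, "B": {}, "C": {}},
)
```

Returning the list of problems rather than a single flag keeps enough detail to record the data error event later.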
- the method before step 101, further includes: parsing template data from input data having multiple protocol encapsulation formats; and converting the template data into a predetermined data exchange format.
- the predetermined data exchange format can be implemented as JSON format, and the protocols can include the Modbus communication protocol, the RS-232 protocol, the RS-485 protocol, the HART protocol, etc. Therefore, by converting the template data into a predetermined data exchange format, data workers can process the template data without a deep understanding of the communication interaction process.
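The idea of converting protocol-encapsulated input into JSON can be illustrated with a toy decoder. The frame layout below is entirely hypothetical and is not a real Modbus, RS-485 or HART frame: it assumes a big-endian unsigned 32-bit epoch timestamp followed by a 32-bit float, and the function name `frame_to_json` is likewise an assumption.

```python
import json
import struct
from datetime import datetime, timezone

# Hypothetical frame layout (NOT a real Modbus or HART frame):
# big-endian unsigned 32-bit epoch seconds followed by a 32-bit float.
FRAME_FORMAT = ">If"


def frame_to_json(frame: bytes) -> str:
    """Decode one raw frame and re-express it as a JSON document."""
    epoch_seconds, vibmax = struct.unpack(FRAME_FORMAT, frame)
    stamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    record = {
        "timestamp": stamp.strftime("%Y-%m-%d %H:%M:%S"),
        "vibmax": vibmax,
    }
    return json.dumps(record)


# Build a sample frame and convert it to the unified exchange format.
raw = struct.pack(FRAME_FORMAT, 1594861204, 72.0)
json_text = frame_to_json(raw)
```

Once every source is decoded into the same JSON shape, the later steps do not need to know which protocol delivered the data.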
- the method further includes: when the test data has an error, performing a correction operation on the test data based on the structure file; recording the data error event and the correction operation in the log file. It can be seen that the embodiment of the present invention can correct erroneous test data and improve data quality. Moreover, by recording data error events and correction operations, document logging of data quality is achieved, allowing subsequent processors to understand the data status in a timely manner when taking over, integrating or migrating data.
- performing a correction operation on the test data based on the structure file includes at least one of the following: correcting the value of the corresponding data type in the test data based on the statistical characteristics of the data type contained in the structure file; correcting the data type in the test data based on the data type contained in the structure file. For example, when the time format in the test data does not match the time format in the structure file, the time format in the test data is corrected based on the time format in the structure file. For another example, when a numerical value in the test data is wrong (for example, a parameter value that should be fixed is wrong), the numerical value in the test data is corrected based on the correct value in the structure file. Therefore, the embodiment of the present invention can perform various types of accurate corrections on erroneous data, thereby improving data quality.
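The two kinds of correction described above, together with the recording of correction events, might look like the following sketch. Several parts are assumptions: the source does not say how an out-of-range number is corrected, so clamping into the `[min, max]` range is an assumed policy; the list `CANDIDATE_FORMATS`, the logger name, and the rule keys `type`, `format`, `min` and `max` are hypothetical.

```python
import logging
from datetime import datetime

logger = logging.getLogger("data_quality")  # hypothetical logger name

# Hypothetical list of layouts in which a faulty timestamp might arrive.
CANDIDATE_FORMATS = ["%d/%m/%Y %H:%M:%S", "%Y/%m/%d %H:%M:%S"]


def reformat_time(text, target):
    """Return text rewritten in the target layout, or None if it cannot be parsed."""
    for fmt in [target] + CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime(target)
        except ValueError:
            continue
    return None


def correct_record(record, structure):
    """Return a corrected copy of record plus a list of correction events."""
    corrected, events = dict(record), []
    for name, rule in structure.items():
        if name not in corrected:
            continue
        old = corrected[name]
        if rule["type"] == "datetime":
            new = reformat_time(old, rule["format"])
        else:
            # Assumed policy: clamp numbers into the [min, max] range.
            new = min(max(old, rule["min"]), rule["max"])
        if new is not None and new != old:
            corrected[name] = new
            events.append(f"{name}: {old!r} corrected to {new!r}")
    for event in events:
        logger.warning(event)  # attach a FileHandler to persist to a log file
    return corrected, events


structure = {
    "timestamp": {"type": "datetime", "format": "%Y-%m-%d %H:%M:%S"},
    "vibmax": {"type": "number", "min": 72.0, "max": 82.0},
}
fixed, log_events = correct_record(
    {"timestamp": "16/07/2020 01:00:04", "vibmax": 100.0}, structure
)
```

The function returns a copy rather than mutating its input, so the original erroneous record remains available for the log entry.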
- Figure 2 is a flow chart of a method for determining a structure file according to an embodiment of the present invention.
- the method includes:
- Step 201 Obtain input data, where the input data may have multiple protocol encapsulation formats.
- Step 202 Determine whether the input data comes from the real-time file obtained through real-time access to the API interface. If so (corresponding to the "Y" branch), execute step 203 and subsequent steps; otherwise (corresponding to the "N" branch), execute step 210 and subsequent steps.
- Step 203 Start protocol adaptation processing (Protocol adaptation processing) on the input data from the real-time file.
- Step 204 In the protocol adaptation process, template data is parsed from input data having multiple protocol encapsulation formats.
- Step 205 Convert the template data into JSON format.
- Step 206 Extract the data type contained in the template data in JSON format.
- Step 207 Count the values of each data type in the template data to determine the statistical characteristics of each data type.
- Step 208 Write the data type and the statistical characteristics of the data type into the structure file in an associated manner.
- Step 209 Obtain the structure file used as the data comparison target and used to perform data error detection and data correction, and end this process.
- Step 210 Start executing protocol adaptation processing on the input data from the non-real-time file, parse the template data, and jump to step 205.
- FIG. 3 is a schematic diagram of an exemplary process of determining a structure file using a real-time file according to an embodiment of the present invention.
- the template data 40 obtained based on the real-time file includes: a path object 41 and a server object 42 .
- the template data 40 may also include an API name (api) object or a path (paths) object, and so on.
- the path object 41 contains multiple sub-objects, namely the first endpoint sub-object 411, the second endpoint sub-object 412, the third endpoint sub-object 413, etc., and each endpoint sub-object further points to the path item object 50.
- the path item object 50 includes sub-objects such as a get operation (get) 51, a put operation (put) 52, a submit operation (post) 53, and a delete operation (delete) 54.
- Each sub-object of the path item object 50 also points to the operation object 60 .
- the operation object 60 includes a description object 61 and an operation object identification 62, and so on.
- the structures of the remaining objects in the template data 40 except the path object 41 are analyzed. Based on the object structure topology of the template data, a similar object structure topology is established in the structure file.
- variable table object 30 is a child object of server object 42.
- the variable table object 30 contains a variable object 31 .
- the variable object 31 contains multiple data types, namely string (string) 311, numerical value (number) 312, queue (array) 313 and time and date 314. Based on statistics of the values of these data types, the statistical characteristics of each data type can be obtained. For example, for the time and date type 314, the statistical characteristics 414 may include the maximum time interval (maxInterval), the minimum time interval (minInterval), and so on.
- for the queue type 313, the statistical characteristics 413 may include the maximum queue length (maxNum), the minimum queue length (minNum), the average queue length (aveNum), and so on.
- for the string type 311, the statistical characteristics 411 may include the maximum string length (maxLen), the minimum string length (minLen), whether it overlaps (OverLap), and so on.
- for the numerical type 312, the statistical characteristics 412 may include the length of the value (numLen), whether it is zero (isZero), whether it is empty (isNul), and so on.
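The per-type characteristics named above can be sketched for the two cases whose meaning is clear from the names: string lengths (`maxLen`, `minLen`) and queue lengths (`maxNum`, `minNum`, `aveNum`). The function names are hypothetical. `OverLap`, `numLen`, `isZero` and `isNul` are left out on purpose because the text does not define how they are computed.

```python
def string_features(values):
    """Length-based characteristics for a string data type."""
    lengths = [len(v) for v in values]
    return {"maxLen": max(lengths), "minLen": min(lengths)}


def array_features(arrays):
    """Size-based characteristics for a queue (array) data type."""
    sizes = [len(a) for a in arrays]
    return {
        "maxNum": max(sizes),
        "minNum": min(sizes),
        "aveNum": sum(sizes) / len(sizes),
    }


# Hypothetical sample values observed for one string field and one queue field.
name_stats = string_features(["ab", "abcd"])
queue_stats = array_features([[1, 2], [1, 2, 3, 4]])
```

Each data type gets its own feature function, so supporting a further type only requires adding another small function of the same shape.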
- Figure 4 is an exemplary schematic diagram of a data correction process according to an embodiment of the present invention. As shown in Figure 4, the data correction process includes:
- Step 401 Obtain the input data within the test time through the API interface.
- Step 402 Start protocol adaptation processing on the input data.
- Step 403 In the protocol adaptation process, parse the test data from the input data with multiple protocol encapsulation formats.
- Step 404 Convert the test data to JSON format.
- Step 405 Compare the test data in JSON format with the structure file.
- Step 406 Based on the comparison result, determine whether the test data is in error. If there is an error (corresponding to the "Y" branch), execute step 407 and subsequent steps; otherwise (corresponding to the "N" branch), execute step 409.
- Step 407 Correct the test data based on the structure file.
- Step 408 Record error events and corrective operations.
- Step 409 Output test data.
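Steps 404 to 409 can be strung together in a compact sketch. It assumes the protocol adaptation of steps 401 to 403 has already produced a JSON payload, models the structure file as per-field `min` and `max` bounds, and uses clamping as an assumed correction policy. The function name `run_correction_flow`, the event fields, and the temperature bounds are hypothetical.

```python
import json


def run_correction_flow(payload, structure, log):
    """Steps 404 to 409 for a payload that is assumed to be JSON already."""
    record = json.loads(payload)                      # step 404
    for name, value in list(record.items()):          # steps 405 and 406
        bounds = structure.get(name)
        if bounds is None:
            continue
        clamped = min(max(value, bounds["min"]), bounds["max"])
        if clamped != value:
            record[name] = clamped                    # step 407 (assumed policy)
            log.append({"field": name, "was": value, "now": clamped})  # step 408
    return record                                     # step 409


events = []
output = run_correction_flow(
    '{"vibmax": 100.0, "temperature": 25.0}',
    {
        "vibmax": {"min": 72.0, "max": 82.0},
        "temperature": {"min": 0.0, "max": 50.0},  # hypothetical bounds
    },
    events,
)
```

Data that passes the comparison flows straight to the output, while erroneous data takes the extra correction and recording steps first, matching the two branches of step 406.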
- FIG. 5 is an exemplary schematic diagram of a process of generating a structure file and performing data correction according to an embodiment of the present invention.
- the real-time file 80 received via the API interface and the instance field 81 of the API design file are provided to the data processing device 90 as input data.
- the protocol adapter 82 in the data processing device 90 performs protocol adaptation processing on the input data.
- the data processing device 90 includes a structure file generator 83 .
- the structure parser 831 in the structure file generator 83 determines the data type in the input data.
- the feature parser 832 in the structure file generator 83 determines the statistical features of the data type.
- the writer 84 in the data processing device 90 associates the data type determined by the structure parser 831 with the statistical characteristics determined by the feature parser 832 and writes them into the structure file.
- the structure file processing layer 92 includes display processing 85 and adjustment processing 91.
- the structure file is displayed in the display processing 85. In the adjustment processing 91, the structure file is adjusted based on the user-triggered operation.
- Structure files can be used to perform data error detection and data correction.
- Test data 86 is provided to data processing device 90 .
- the protocol adapter 87 in the data processing device 90 performs protocol adaptation processing on the test data 86 .
- the data processing device 90 includes a data processor 88.
- the data processor 88 includes an error detector 881 and a corrector 882 .
- the error detector 881 compares the test data 86 with the structure file obtained from the structure file processing layer 92 to check whether the test data 86 has errors.
- Corrector 882 performs correction on erroneous test data 86 .
- the alarm 89 performs alarm processing, and stores and records data error events and correction operations.
- FIG. 6 is an exemplary structural diagram of a data error detection device according to an embodiment of the present invention. As shown in Figure 6, the data processing device 600 includes:
- the first determination module 601 is configured to determine the data type contained in the template data
- the second determination module 602 is configured to determine the statistical characteristics of the data type based on statistics of the values of the data type in the template data;
- the third determination module 603 is configured to determine the structure file based on the data type and the statistical characteristics of the data type
- the receiving module 604 is configured to receive test data
- the comparison module 605 is configured to compare the test data with the structure file
- the fourth determination module 606 is configured to determine whether the test data is in error based on the comparison result.
- the fourth determination module 606 is configured to perform at least one of the following: when the data type contained in the test data does not match the data type in the structure file, determine that the test data is in error; when the value of the data type in the test data does not conform to the statistical characteristics in the structure file, determine that the test data is in error; when the value of the data type in the test data does not conform to the statistical characteristics in the structure file and the data type contained in the test data does not match the data type in the structure file, determine that the test data is in error; when the value of the data type in the test data conforms to the statistical characteristics in the structure file and the data type contained in the test data matches the data type in the structure file, determine that the test data is normal; etc.
- a correction module 607 is also included, configured to perform correction operations on the test data based on the structure file when the test data is erroneous; record data error events and correction operations in the log file.
- the correction module 607 is configured to perform at least one of the following: correcting the values of the corresponding data types in the test data based on the statistical characteristics of the data types contained in the structure file; correcting the data types in the test data based on the data types contained in the structure file.
- the first determination module 601 is configured to determine the object in the template data; determine the data type contained in the object.
- an adjustment module 608 is also included, configured to display the structure file on the human-computer interaction interface; receive a structure file adjustment instruction triggered via the human-computer interaction interface; and adjust the data types in the structure file, and/or the statistical characteristics in the structure file, based on the structure file adjustment instruction.
- Embodiments of the present invention save the effort and time needed to resolve data quality issues and make data governance easy to automate. Data features are used to quickly abstract standard patterns, which reduces the burden of data quality inspection; data error conditions and correction operations are effectively recorded, which helps ensure data quality and efficient team collaboration. Moreover, because structure files are generated automatically, data governance can be implemented flexibly, and the modular architecture is easy to extend and to integrate with existing products, systems or services. Embodiments of the present invention can be implemented in and extended to digital systems, cloud services and platform services, improving stability and integration capability.
- FIG. 7 is an exemplary structural diagram of a data error detection device with a processor-memory architecture according to an embodiment of the present invention.
- the data error detection device 700 includes a processor 701, a memory 702, and a computer program stored in the memory 702 and executable on the processor 701.
- the memory 702 can be implemented as various storage media such as electrically erasable programmable read-only memory (EEPROM), flash memory, programmable read-only memory (PROM), etc.
- the processor 701 may be implemented to include one or more central processing units or one or more field programmable gate arrays, where the field programmable gate array integrates one or more central processing unit cores.
- the central processing unit or central processing unit core may be implemented as a CPU, an MCU, a DSP, or the like.
- each step is not fixed and can be adjusted as needed.
- the division into modules is only for convenience in describing the functional division. In an actual implementation, one module can be implemented by multiple modules, and the functions of multiple modules can also be implemented by the same module. These modules can be located on the same device or on different devices.
- the hardware modules in various embodiments may be implemented mechanically or electronically.
- a hardware module may include specially designed permanent circuits or logic devices (such as a dedicated processor such as an FPGA or ASIC) to perform specific operations.
- Hardware modules may also include programmable logic devices or circuits (eg, including general-purpose processors or other programmable processors) temporarily configured by software to perform specific operations.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Debugging And Monitoring (AREA)
Abstract
Disclosed in the embodiments of the present invention are a data error detection method, an apparatus, an electronic device, and a storage medium. The method comprises: determining data types contained in template data; on the basis of statistical analysis of values of the data types in the template data, determining statistical characteristics of the data types; on the basis of the data types and the statistical characteristics of the data types, determining a structure file; receiving test data; comparing the test data with the structure file; and, on the basis of the comparison result, determining whether the test data has had an error. The embodiments of the present invention determine, on the basis of the data types in the template data and the statistical characteristics of the data types, the structure file serving as a data comparison target; and by means of comparing the test data with the structure file, the test data having had an error can be quickly detected, thus improving the data error detection efficiency and the data correction efficiency.
Description
The present invention relates to the field of data processing technology, and in particular to a data error detection method, an apparatus, an electronic device, and a storage medium.
The Internet has had a huge impact on data, and industrial development is increasingly driven by data-based technologies. To make better use of data, a number of data integration solutions and tools, middleware, low-code tools, and visualization software have gradually become established.
However, in the face of information silos and large amounts of scattered data, data quality problems may still arise even when data integration solutions are used to integrate systems or data. Data quality issues may run through every service level of a digitalization project. For example, they may be encountered in system integration, application porting, failure of on-site data acquisition equipment, format conversion caused by dependency library updates, and the extension and development of new functions.
At present, how to quickly detect erroneous data is a technical problem to be solved.
Summary of the invention
Embodiments of the present invention provide a data error detection method, an apparatus, an electronic device, and a storage medium.
A data error detection method, the method comprising:
determining the data type contained in template data;
determining the statistical characteristics of the data type based on statistics of the values of the data type in the template data;
determining a structure file based on the data type and the statistical characteristics of the data type;
receiving test data;
comparing the test data with the structure file;
determining, based on the result of the comparison, whether the test data is in error.
It can be seen that embodiments of the present invention determine, based on the data type in the template data and the statistical characteristics of the data type, a structure file that serves as the data comparison target. By comparing the test data with the structure file, it can be quickly determined whether the test data is in error, which improves error detection efficiency.
In an exemplary embodiment, determining whether the test data is in error based on the result of the comparison includes at least one of the following:
when the data type contained in the test data does not match the data type in the structure file, determining that the test data is in error;
when the value of the data type in the test data does not conform to the statistical characteristics in the structure file, determining that the test data is in error;
when the value of the data type in the test data does not conform to the statistical characteristics in the structure file and the data type contained in the test data does not match the data type in the structure file, determining that the test data is in error;
when the value of the data type in the test data conforms to the statistical characteristics in the structure file and the data type contained in the test data matches the data type in the structure file, determining that the test data is normal.
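The determinations listed above can be illustrated with a short Python sketch. It is only one possible reading, not the implementation of the embodiments: the structure-file layout (a mapping from field name to an expected type name and optional `min`/`max` bounds) and the names `STRUCTURE`, `type_name` and `check_record` are assumptions introduced here for illustration. A missing field is treated as a data type mismatch.

```python
# Hypothetical structure file: field name -> expected type and value bounds.
STRUCTURE = {
    "temperature": {"type": "number", "min": 20.0, "max": 30.0},
    "status": {"type": "string"},
}


def type_name(value):
    """Map a Python value to a coarse type name (assumed naming)."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return type(value).__name__


def check_record(record, structure):
    """Return a list of (field, reason) errors; an empty list means normal."""
    errors = []
    for field, spec in structure.items():
        if field not in record:
            errors.append((field, "missing"))
            continue
        value = record[field]
        if type_name(value) != spec["type"]:
            errors.append((field, "type mismatch"))
            continue  # bounds are meaningless for a wrongly typed value
        low, high = spec.get("min"), spec.get("max")
        if low is not None and value < low:
            errors.append((field, "below minimum"))
        if high is not None and value > high:
            errors.append((field, "above maximum"))
    return errors


print(check_record({"temperature": 25.1, "status": "ok"}, STRUCTURE))
print(check_record({"temperature": "25.1", "status": "ok"}, STRUCTURE))
print(check_record({"temperature": 99.0, "status": "ok"}, STRUCTURE))
```

In this sketch the first record is reported as normal, the second as a type mismatch, and the third as a value outside the recorded statistical range.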
Therefore, embodiments of the present invention can detect errors in a variety of ways, which improves applicability.
In an exemplary embodiment, the method further includes:
when the test data is in error, performing a correction operation on the test data based on the structure file;
recording the data error event and the correction operation in a log file.
It can be seen that embodiments of the present invention can correct erroneous test data, which improves data quality. Moreover, recording data error events and correction operations in a log file provides a log of data quality, so that subsequent processors can promptly understand the state of the data when taking over, integrating, or migrating it.
In an exemplary embodiment, performing a correction operation on the test data based on the structure file includes at least one of the following:
correcting the value of the corresponding data type in the test data based on the statistical characteristics of the data type contained in the structure file;
correcting the data type in the test data based on the data type contained in the structure file.
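The two kinds of correction, together with the logging of error events and correction operations, might look as follows in Python. The correction policy shown (converting a numeric string to a number, then clamping the value into the recorded range) is an assumption made for illustration, since the text does not prescribe a particular policy; the logger name `data_quality` and the function `correct_field` are likewise hypothetical.

```python
import logging

logger = logging.getLogger("data_quality")  # hypothetical logger name


def correct_field(field, value, spec):
    """Coerce a value to the expected type, then clamp it into [min, max]."""
    corrected = value
    if spec["type"] == "number" and isinstance(value, str):
        try:
            corrected = float(value)  # type correction, e.g. "25.1" -> 25.1
        except ValueError:
            logger.warning("error: %s=%r cannot be converted to a number",
                           field, value)
            return value  # left uncorrected; the error event is still logged
    if spec["type"] == "number" and isinstance(corrected, (int, float)):
        low, high = spec.get("min"), spec.get("max")
        if low is not None and corrected < low:
            corrected = low  # value correction from the statistical range
        if high is not None and corrected > high:
            corrected = high
    if corrected != value or type(corrected) is not type(value):
        logger.warning("error: %s=%r; correction: replaced with %r",
                       field, value, corrected)
    return corrected


spec = {"type": "number", "min": 20.0, "max": 30.0}
print(correct_field("temperature", "25.1", spec))
print(correct_field("temperature", 99.0, spec))
```

To write the records to a log file rather than to the console, the standard `logging.basicConfig(filename=..., level=logging.WARNING)` call can be made once at start-up.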
Therefore, embodiments of the present invention can perform multiple types of accurate correction on erroneous data, which improves data quality.
In an exemplary embodiment, determining the data type contained in the template data includes:
determining a first data type in the template data and an object in the template data;
determining a second data type contained in the object;
determining the first data type and the second data type as the data types contained in the template data.
It can be seen that embodiments of the present invention can extract both the data types of the template data and the data types of the objects in the template data, which enriches the set of data types.
In an exemplary embodiment, the method further includes:
displaying the structure file on a human-computer interaction interface;
receiving a structure file adjustment instruction triggered via the human-computer interaction interface;
adjusting the data type in the structure file, and/or the statistical characteristics in the structure file, based on the structure file adjustment instruction.
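As a hedged illustration of the adjustment step, the sketch below applies one adjustment instruction to an in-memory structure file. The instruction format (a field name plus a dictionary of changes) and the function name `apply_adjustment` are assumptions; the text does not define how an instruction is encoded.

```python
import copy


def apply_adjustment(structure, instruction):
    """Return a copy of the structure with one field's entries updated."""
    adjusted = copy.deepcopy(structure)  # leave the original untouched
    field = instruction["field"]
    adjusted.setdefault(field, {}).update(instruction["changes"])
    return adjusted


structure = {"temperature": {"type": "number", "min": 20.0, "max": 30.0}}
# Hypothetical instruction as it might arrive from the user interface.
instruction = {"field": "temperature", "changes": {"max": 35.0}}
adjusted = apply_adjustment(structure, instruction)
print(adjusted["temperature"])
```

Returning a copy rather than modifying the structure in place makes it straightforward to show the user a before-and-after view, or to discard a change.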
Therefore, embodiments of the present invention can adjust the structure file in response to user-triggered operations, which makes data processing more flexible.
In an exemplary embodiment, the statistical characteristics include at least one of the following:
the maximum length of the data type; the minimum length of the data type; the maximum value of the data type; the minimum value of the data type; the average value of the data type; the maximum time interval of the data type; the minimum time interval of the data type; the average time interval of the data type.
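The length and value characteristics listed above can be computed with a few lines of Python. This is a minimal sketch: the sample values and the dictionary keys (`min_length`, `max_length`, `min`, `max`, `mean`) are illustrative assumptions, not names taken from the embodiments.

```python
from statistics import mean


def string_stats(values):
    """Length characteristics for string-typed values."""
    lengths = [len(v) for v in values]
    return {"min_length": min(lengths), "max_length": max(lengths)}


def number_stats(values):
    """Value characteristics for numerically typed values."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


names = ["pump-01", "pump-02", "compressor-7"]  # illustrative values
temperatures = [21.0, 22.0, 23.0]               # illustrative values
print(string_stats(names))
print(number_stats(temperatures))
```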
It can be seen that the data types of the embodiments of the present invention are diverse and suitable for various application environments.
In an exemplary embodiment, the template data is contained in an instance field of an application programming interface (API) design file, or in a real-time file obtained by accessing the API in real time.
Therefore, the data sources of the template data are diverse and suitable for various application environments.
In an exemplary embodiment, before determining the data type contained in the template data, the method further includes:
parsing the template data out of input data having multiple protocol encapsulation formats;
converting the template data into a predetermined data exchange format.
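A possible shape for this adaptation step is sketched below in Python, using JSON as the data exchange format (the description names JSON as one example). The message envelopes are deliberately simplified and hypothetical: a dictionary with a `protocol` key and either a text `body` or a byte `payload`. No real HTTP or MQTT client library is involved.

```python
import json


def from_http(message):
    """HTTP-style input: the JSON document is the response body (assumed)."""
    return json.loads(message["body"])


def from_mqtt(message):
    """MQTT-style input: the JSON document is a UTF-8 payload (assumed)."""
    return json.loads(message["payload"].decode("utf-8"))


# Registry of adapters; supporting a new protocol means adding one entry.
ADAPTERS = {"http": from_http, "mqtt": from_mqtt}


def to_exchange_format(message):
    """Extract the template data and serialize it as canonical JSON text."""
    data = ADAPTERS[message["protocol"]](message)
    return json.dumps(data, sort_keys=True)


http_msg = {"protocol": "http", "body": '{"temperature": 25.1, "id": "s1"}'}
mqtt_msg = {"protocol": "mqtt", "payload": b'{"id": "s1", "temperature": 25.1}'}
print(to_exchange_format(http_msg))
print(to_exchange_format(mqtt_msg))
```

Both inputs yield the same JSON text, so later steps do not need to know which protocol delivered the data.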
Therefore, by converting the template data into a data exchange format, data workers can process the template data without needing a deep understanding of the communication interaction process.
A data processing device, the device comprising:
a first determination module configured to determine the data type contained in template data;
a second determination module configured to determine the statistical characteristics of the data type based on statistics of the values of the data type in the template data;
a third determination module configured to determine a structure file based on the data type and the statistical characteristics of the data type;
a receiving module configured to receive test data;
a comparison module configured to compare the test data with the structure file;
a fourth determination module configured to determine, based on the result of the comparison, whether the test data is in error.
It can be seen that embodiments of the present invention determine, based on the data type in the template data and the statistical characteristics of the data type, a structure file that serves as the data comparison target. By comparing the test data with the structure file, it can be quickly determined whether the test data is in error, which improves error detection efficiency.
In an exemplary embodiment, the fourth determination module is configured to perform at least one of the following:
when the data type contained in the test data does not match the data type in the structure file, determining that the test data is in error;
when the value of the data type in the test data does not conform to the statistical characteristics in the structure file, determining that the test data is in error;
when the value of the data type in the test data does not conform to the statistical characteristics in the structure file and the data type contained in the test data does not match the data type in the structure file, determining that the test data is in error;
when the value of the data type in the test data conforms to the statistical characteristics in the structure file and the data type contained in the test data matches the data type in the structure file, determining that the test data is normal.
Therefore, embodiments of the present invention can detect errors in a variety of ways, which improves applicability.
In an exemplary embodiment, the device further includes:
a correction module configured to, when the test data is in error, perform a correction operation on the test data based on the structure file, and to record the data error event and the correction operation in a log file.
It can be seen that embodiments of the present invention can correct erroneous test data, which improves data quality. Moreover, recording data error events and correction operations in a log file provides documented logging of data quality, so that subsequent processors can promptly understand the state of the data when taking over, integrating, or migrating it.
In an exemplary embodiment, the correction module is configured to perform at least one of the following:
correcting the value of the corresponding data type in the test data based on the statistical characteristics of the data type contained in the structure file;
correcting the data type in the test data based on the data type contained in the structure file.
Therefore, embodiments of the present invention can perform multiple types of accurate correction on erroneous data, which improves data quality.
In an exemplary embodiment, the first determination module is configured to determine a first data type in the template data and an object in the template data; determine a second data type contained in the object; and determine the first data type and the second data type as the data types contained in the template data.
It can be seen that embodiments of the present invention can extract both the data types of the template data and the data types of the objects in the template data, which enriches the set of data types.
In an exemplary embodiment, the device further includes:
an adjustment module configured to display the structure file on a human-computer interaction interface; receive a structure file adjustment instruction triggered via the human-computer interaction interface; and adjust the data type in the structure file, and/or the statistical characteristics in the structure file, based on the structure file adjustment instruction.
Therefore, embodiments of the present invention can adjust the structure file in response to user-triggered operations, which makes data processing more flexible.
An electronic device, comprising:
a processor;
a memory for storing instructions executable by the processor;
the processor being configured to read the executable instructions from the memory and execute them to implement the data error detection method described in any one of the above.
A computer-readable storage medium on which computer instructions are stored, wherein the computer instructions, when executed by a processor, implement the data error detection method described in any one of the above.
A computer program product comprising a computer program that, when executed by a processor, implements the data error detection method described in any one of the above.
Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the above and other features and advantages of the present invention become clearer to those of ordinary skill in the art. In the drawings:
FIG. 1 is an exemplary flow chart of a data error detection method according to an embodiment of the present invention.
FIG. 2 is a flow chart of a method for determining a structure file according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of an exemplary process of determining a structure file using a real-time file according to an embodiment of the present invention.
FIG. 4 is an exemplary schematic diagram of a data correction process according to an embodiment of the present invention.
FIG. 5 is an exemplary schematic diagram of a process of generating a structure file and performing data correction according to an embodiment of the present invention.
FIG. 6 is an exemplary structural diagram of a data error detection device according to an embodiment of the present invention.
FIG. 7 is an exemplary structural diagram of a data error detection device with a processor-memory architecture according to an embodiment of the present invention.
The reference signs are as follows:
Label | Meaning |
100 | Data error detection method |
101~106 | Steps |
201~210 | Steps |
40 | Template data |
41 | Path object |
42 | Server object |
411 | First endpoint sub-object |
412 | Second endpoint sub-object |
413 | Third endpoint sub-object |
50 | Path item object |
51 | Get operation |
52 | Put operation |
53 | Submit operation |
54 | Delete operation |
60 | Operation object |
61 | Description object |
62 | Operation object identifier |
30 | Variable table object |
31 | Variable object |
311 | String |
312 | Numerical value |
313 | Queue |
314 | Time and date |
411 | Statistical characteristics of the time and date |
412 | Statistical characteristics of the numerical value |
413 | Statistical characteristics of the queue |
414 | Statistical characteristics of the time and date |
401~409 | Steps |
80 | Real-time file |
81 | Instance fields of API design files |
90 | Data processing device |
82 | Protocol adapter |
83 | Structure file generator |
831 | Structure parser |
832 | Feature parser |
84 | Writer |
85 | Display processing |
91 | Adjustment processing |
92 | Structure file processing layer |
86 | Test data |
87 | Protocol adapter |
88 | Data processor |
881 | Error detector |
882 | Corrector |
89 | Alarm |
600 | Data processing device |
601 | First determination module |
602 | Second determination module |
603 | Third determination module |
604 | Receiving module |
605 | Comparison module |
606 | Fourth determination module |
607 | Correction module |
608 | Adjustment module |
700 | Data processing device |
701 | Processor |
702 | Memory |
To make the purpose, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below by way of embodiments.
For brevity and clarity, the solution of the present invention is explained below by describing several representative embodiments. The many details given in the embodiments serve only to aid understanding of the solution. It is clear, however, that implementation of the technical solution of the present invention need not be limited to these details. To avoid unnecessarily obscuring the solution of the present invention, some embodiments are not described in detail and only an outline is given. Hereinafter, "including" means "including but not limited to", and "based on..." means "based at least on..., but not limited to being based only on...". In keeping with Chinese language conventions, where the quantity of a component is not specified below, the component may be one or more, or may be understood as at least one.
The applicant analyzed the current state of data governance and found at least the following problems:
(1) Data quality problems are difficult to discover. For example, in Internet of Things projects it is hard to detect abnormal situations such as sensors dropping out, data drift, interruptions of the edge communication network, and data processing formulas that cannot adapt to changing data. Moreover, changes in the deployment environment and upgrades of dependency libraries can lead to erroneous data or incorrect date formats. These data quality problems lie dormant and surface later, and solving them requires data workers who are familiar with the entire system.
(2) Understanding communication interaction data takes a great deal of time and effort. Data workers must spend considerable time and effort understanding the interaction data of each developed application or system, for example which communication interfaces exist and what data and business requirements the interfaces carry (such as data format, meaning, update cycle, and so on).
(3) There is currently no documentation or logging of data quality problems, so there is no simple way to verify them. Because descriptions and event records concerning data quality are lacking, integration testing of data quality has to be handled case by case whenever an application or system is taken over, integrated, or ported.
Embodiments of the present invention provide a unified and extensible solution for flexibly implementing data quality governance (including error detection and correction). Embodiments of the present invention can be integrated into systems, clouds, platform tools, plug-ins, and so on, and can make data quality governance easier and more efficient; in particular, they can effectively support industrial data services and applications.
FIG. 1 is an exemplary flow chart of a data error detection method according to an embodiment of the present invention. As shown in FIG. 1, the method 100 includes:
Step 101: Determine the data type contained in the template data.
Here, the template data is the reference data used to perform data error detection. For example, the template data can be implemented as data contained in the instance fields of an application programming interface (API) design file. The template data can also be implemented as data contained in a real-time file, where the real-time file is obtained by accessing the API in real time.
API design files are common in system construction. An API design file usually contains various instance fields, and the data in these instance fields can serve as the reference data, that is, the template data, in the data error detection process. API design files are usually static files and can be obtained from various data sources (for example, various development documents). The format of an API design file may be, for example, JSON or YML, but is not limited to these formats. API design files can be exported from API design and testing tools such as Postman or OpenAPI.
A user can also access the API in real time to obtain a real-time file. The data in the real-time file can likewise serve as the reference data, that is, the template data, in the data error detection process. Specifically, the protocols used to access the API may include the Hypertext Transfer Protocol (HTTP), the Hypertext Transfer Protocol over Secure Socket Layer (HTTPS), the Message Queuing Telemetry Transport (MQTT) protocol, and so on; the protocols are not limited to these, and protocol extension is supported.
Preferably, whether the template data originates from a static API design file or from a real-time file, it can be converted into a unified format, such as JSON, to facilitate subsequent unified processing. In one embodiment, template data is obtained from the instance fields of the API design file, template data is also obtained from the real-time file by accessing the API in real time, the two sets of template data are merged into overall template data, and the merged template data is then converted into JSON format to facilitate subsequent unified processing.
After the template data is obtained, it is parsed to determine the data types it contains. In one embodiment, step 101 includes: determining a first data type in the template data and an object in the template data; determining a second data type contained in the object; and determining the first data type and the second data type as the data types contained in the template data. When the object itself contains a sub-object, the data types contained in the sub-object can be further determined and added to the data types of the template data.
For example, the data types include: numeric type (such as integer or floating point); character type; string type; Boolean type; queue (array) type; time and date type; and so on.
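The traversal of objects and sub-objects described for step 101 can be sketched as a small recursive function. This is an illustration under stated assumptions rather than the parser of the embodiments: the type names, the single accepted timestamp format, and the dotted-path notation for nested fields are all choices made here, and arrays are classified but not descended into.

```python
from datetime import datetime


def infer_type(value):
    """Classify one value; the type names are assumed for illustration."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        try:
            datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
            return "datetime"
        except ValueError:
            return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "unknown"


def collect_types(data, prefix=""):
    """Walk nested objects and map each dotted path to its data type."""
    types = {}
    for key, value in data.items():
        path = f"{prefix}{key}"
        types[path] = infer_type(value)
        if isinstance(value, dict):  # descend into sub-objects
            types.update(collect_types(value, prefix=path + "."))
    return types


template = {
    "device": "sensor-1",
    "reading": {"temperature": 25.1, "valid": True,
                "timestamp": "2020-07-16 01:00:04"},
}
print(collect_types(template))
```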
以上示范性描述了数据类型的典型实例,本领域技术人员可以意识到,这种描述仅是示范性的,并不用于限定本发明实施方式的保护范围。The above exemplarily describes typical examples of data types. Those skilled in the art can realize that this description is only exemplary and is not used to limit the protection scope of the embodiments of the present invention.
Step 102: Determine the statistical characteristics of the data type based on statistics over the values of the data type in the template data.
Here, the values of each data type in the template data are parsed to determine the statistical characteristics of that data type. For example, the statistical characteristics may include at least one of the following: the maximum length of the data type; the minimum length of the data type; the maximum value of the data type; the minimum value of the data type; the average value of the data type; the maximum time interval of the data type; the minimum time interval of the data type; the average time interval of the data type; and so on.
An exemplary process for determining the statistical characteristics of template data is described below. Assume that the template data contains a time series (timeseries) object, that the time series object contains an array type, and that the array contains multiple objects, each with its own data type: timestamp (timestamp), vibration extreme value (vibmax), vibration mean (vibmean), vibration effective value (vibrms), vibration skewness (vibskewness), vibration kurtosis (vibkurtosis), temperature (temperature) and humidity (humidity). The data type of the timestamp is the date-time type (data time); the data type of the vibration extreme value, vibration mean, vibration effective value, vibration skewness, vibration kurtosis, temperature and humidity is the numeric type (number). The template data also contains the values of each type in the array. Based on the values of each data type, the statistical characteristics of that data type can be computed.
For example, the timestamps (data type: data time) marking the time points at which several measurement tests were performed have the values 2020-07-16 01:00:04, 2020-07-16 01:00:10, 2020-07-16 01:00:16, 2020-07-16 01:00:22 and 2020-07-16 01:00:28. Parsing these values shows that consecutive timestamps are 6 seconds apart, so the statistical characteristics of the timestamp are: the maximum time interval is 6 seconds; the minimum time interval is 6 seconds; the average time interval is 6 seconds.
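The interval statistics in this example can be reproduced with a short sketch. The function name `interval_features` and the key `aveInterval` are assumptions made for illustration; the timestamp layout is taken from the values listed above.

```python
from datetime import datetime


def interval_features(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Compute max/min/average gap, in seconds, between consecutive timestamps."""
    times = [datetime.strptime(t, fmt) for t in timestamps]
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
    return {
        "maxInterval": max(gaps),
        "minInterval": min(gaps),
        "aveInterval": sum(gaps) / len(gaps),  # hypothetical key name
    }


stamps = [
    "2020-07-16 01:00:04",
    "2020-07-16 01:00:10",
    "2020-07-16 01:00:16",
    "2020-07-16 01:00:22",
    "2020-07-16 01:00:28",
]
print(interval_features(stamps))  # all three intervals are 6.0 seconds
```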
As another example, the vibration extreme values (data type: number) from several measurement tests are 72.0, 80.0, 82.0, 78.0 and 79.0. Parsing these values gives the following statistical characteristics: the maximum of the vibration extreme value is 82.0; the minimum is 72.0; the average is 78.2.
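The same figures follow from a few lines of arithmetic: the five values sum to 391.0, and 391.0 / 5 = 78.2. A minimal sketch, with the hypothetical helper name `numeric_features`:

```python
def numeric_features(values):
    """Compute the maximum, minimum and average of a list of numeric values."""
    return {
        "max": max(values),
        "min": min(values),
        "mean": sum(values) / len(values),
    }


vibmax_values = [72.0, 80.0, 82.0, 78.0, 79.0]
features = numeric_features(vibmax_values)
print(features["max"], features["min"], round(features["mean"], 1))
```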
Similarly, the statistical characteristics of every data type in the template data can be computed. In general, the more statistical characteristics are available for a data type, the more efficient error correction for that data type is.
Step 103: Determine the structure file based on the data type and the statistical characteristics of the data type.
Here, a correspondence is established between each data type and its statistical characteristics. The file containing all of the correspondences between data types and their statistical characteristics is the structure file.
Continuing the example above, an exemplary structure file is now described. The time series (timeseries) object contains an array, and the array contains multiple objects, each with its own data type: the date-time type of the timestamp, and the numeric type of the vibration extreme value, vibration mean, vibration effective value, vibration skewness, vibration kurtosis, temperature and humidity. Each data type in the array has its own statistical characteristics. Storing each data type in the array in association with its statistical characteristics yields the structure file.
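One possible shape of such a structure file is sketched below as JSON. The layout (the `type`, `items` and `features` keys, and `aveInterval`) is an assumption made for illustration; the embodiment only requires that each data type be stored together with its statistical characteristics. Only two of the eight fields are shown.

```python
import json

# Hypothetical structure-file layout: each field maps to its data type
# and to the statistical characteristics derived from the template data.
structure = {
    "timeseries": {
        "type": "array",
        "items": {
            "timestamp": {
                "type": "datetime",
                "features": {"maxInterval": 6, "minInterval": 6, "aveInterval": 6},
            },
            "vibmax": {
                "type": "number",
                "features": {"max": 82.0, "min": 72.0, "mean": 78.2},
            },
        },
    }
}

structure_file = json.dumps(structure, indent=2)
print(structure_file)
```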
The above is a typical example of a structure file. Those skilled in the art will appreciate that this description is only exemplary and is not intended to limit the protection scope of the embodiments of the present invention.
Once the structure file has been determined in steps 101 to 103, it can be used to perform data error detection and correction.
Step 104: Receive test data.
Here, the test data is the object on which data error detection is performed. For example, when the template data is implemented as data contained in the instance field of an API design file, the test data may be the data output, within a specified test time, by the API interface corresponding to that API design file. As another example, when the template data is implemented as data contained in a real-time file, the test data may be the data output within the test time via the API interface that provides the real-time file.
Step 105: Compare the test data with the structure file.
Here, comparing the test data with the structure file includes performing at least one of the following operations:
(1) comparing the types in the test data with the types in the structure file;
(2) checking whether the values of the types in the test data conform to the statistical characteristics of the corresponding types in the structure file.
Step 106: Based on the result of the comparison, determine whether the test data is erroneous.
In one embodiment, step 106 covers the following cases:
(1) When the data types contained in the test data do not match the data types in the structure file, the test data is determined to be erroneous.
For example, if the test data contains data type A, data type B and data type C, but the structure file contains only data type A and data type B, the data types of the test data are deemed not to match and the test data is deemed erroneous.
(2) When the value of a data type in the test data does not conform to the statistical characteristics in the structure file, the test data is determined to be erroneous.
For example, if the value of data type A in the test data is 100, but the statistical characteristics of data type A in the structure file indicate a maximum value of 60, the test data is deemed erroneous.
(3) When the value of a data type in the test data does not conform to the statistical characteristics in the structure file and the data types contained in the test data do not match the data types in the structure file, the test data is determined to be erroneous.
For example, the test data contains data type A, data type B and data type C, and the value of data type A in the test data is 100. If the structure file contains only data type A and data type B, and the statistical characteristics of data type A in the structure file indicate a maximum value of 60, the test data is deemed erroneous.
(4) When the value of a data type in the test data conforms to the statistical characteristics in the structure file and the data types contained in the test data match the data types in the structure file, the test data is determined to be normal.
For example, the test data contains data type A, data type B and data type C, and the value of data type A in the test data is 60. If the structure file contains data type A, data type B and data type C, and the statistical characteristics of data type A in the structure file indicate a maximum value of 100, the test data is deemed normal.
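The four cases above can be sketched as a single check. This is a minimal illustration, assuming the structure file has been loaded into a dictionary that maps each data type name to its statistical characteristics; the function name `detect_errors`, the `max`/`min` keys and the placeholder values for types B and C are hypothetical.

```python
def detect_errors(test_values, structure):
    """Return a list of error messages; an empty list means the data is normal.

    test_values: {data type name: value} found in the test data.
    structure:   {data type name: {"max": ..., "min": ...}} from the structure file.
    """
    errors = []
    # Comparison (1): the set of data types must match the structure file.
    if set(test_values) != set(structure):
        errors.append("data types do not match the structure file")
    # Comparison (2): each value must respect the recorded characteristics.
    for name, value in test_values.items():
        features = structure.get(name, {})
        if "max" in features and value > features["max"]:
            errors.append(f"{name}={value} exceeds max {features['max']}")
        if "min" in features and value < features["min"]:
            errors.append(f"{name}={value} is below min {features['min']}")
    return errors


structure_ab = {"A": {"max": 60}, "B": {}}
structure_abc = {"A": {"max": 100}, "B": {}, "C": {}}

print(detect_errors({"A": 50, "B": 1, "C": 2}, structure_ab))    # case (1)
print(detect_errors({"A": 100, "B": 1}, structure_ab))           # case (2)
print(detect_errors({"A": 100, "B": 1, "C": 2}, structure_ab))   # case (3)
print(detect_errors({"A": 60, "B": 1, "C": 2}, structure_abc))   # case (4)
```

Returning a list of messages rather than a single flag keeps cases (1) to (3) distinguishable, which is convenient if the error events are later written to a log.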
The above describes, by way of illustration, typical examples of determining whether the test data is erroneous based on the comparison result. Those skilled in the art will appreciate that this description is only exemplary and is not intended to limit the protection scope of the embodiments of the present invention.
In one embodiment, before step 101, the method further includes: parsing the template data out of input data having multiple protocol encapsulation formats; and converting the template data into a predetermined data exchange format. For example, the predetermined data exchange format may be implemented as the JSON format, and the protocols may include the Modbus communication protocol, the RS-232 protocol, the RS-485 protocol, the HART protocol, and so on. By converting the template data into a predetermined data exchange format, data workers can process the template data without a deep understanding of the underlying communication interaction process.
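As a rough illustration of this conversion, the sketch below decodes a payload of two 16-bit big-endian register values and emits JSON. The register layout (register 0 holding temperature × 10, register 1 holding humidity × 10), the sample bytes and the function name `registers_to_json` are all assumptions; a real adapter would also handle framing, addressing and checksums, which are omitted here.

```python
import json
import struct


def registers_to_json(payload):
    """Decode two big-endian 16-bit registers and return a JSON string.

    Hypothetical layout: register 0 = temperature * 10,
                         register 1 = humidity * 10.
    """
    raw_temperature, raw_humidity = struct.unpack(">HH", payload)
    return json.dumps({
        "temperature": raw_temperature / 10,
        "humidity": raw_humidity / 10,
    })


sample_payload = bytes([0x00, 0xF5, 0x01, 0xC2])  # illustrative bytes only
print(registers_to_json(sample_payload))
```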
In one embodiment, the method further includes: when the test data is erroneous, performing a correction operation on the test data based on the structure file; and recording the data error event and the correction operation in a log file. The embodiment of the present invention can thus correct erroneous test data and improve data quality. Moreover, recording data error events and correction operations provides a documented log of data quality, so that those who later take over, integrate or migrate the data can quickly understand its condition.
In one embodiment, performing a correction operation on the test data based on the structure file includes at least one of the following: correcting the value of the corresponding data type in the test data based on the statistical characteristics of the data type contained in the structure file; and correcting the data type in the test data based on the data type contained in the structure file. For example, when the time format in the test data does not match the time format in the structure file, the time format in the test data is corrected based on the time format in the structure file. As another example, when a value in the test data is wrong (for example, a parameter that should be fixed has the wrong value), the value in the test data is corrected based on the correct value in the structure file. The embodiment of the present invention can therefore apply several kinds of accurate correction to erroneous data, improving data quality.
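The two kinds of correction, together with logging, might look like the sketch below. The embodiment does not prescribe a particular correction strategy; clamping an out-of-range value to the recorded bounds is an assumption chosen for illustration, as are the function names and the logger name `data_quality`.

```python
import logging
from datetime import datetime

logger = logging.getLogger("data_quality")  # hypothetical logger name


def correct_value(name, value, features):
    """Clamp a value to the [min, max] range recorded in the structure file."""
    corrected = min(max(value, features["min"]), features["max"])
    if corrected != value:
        # Record both the error event and the correction that was applied.
        logger.warning("%s=%r outside [%r, %r]; corrected to %r",
                       name, value, features["min"], features["max"], corrected)
    return corrected


def correct_time_format(text, found_format, expected_format):
    """Rewrite a timestamp from the format found in the test data
    into the format recorded in the structure file."""
    corrected = datetime.strptime(text, found_format).strftime(expected_format)
    logger.warning("time format corrected: %r -> %r", text, corrected)
    return corrected


print(correct_value("A", 100, {"min": 0, "max": 60}))
print(correct_time_format("16/07/2020 01:00:04",
                          "%d/%m/%Y %H:%M:%S", "%Y-%m-%d %H:%M:%S"))
```

To write the events to a log file rather than to the console, a `logging.FileHandler` can be attached to the logger.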
Figure 2 is a flow chart of a method for determining a structure file according to an embodiment of the present invention.
As shown in Figure 2, the method includes:
Step 201: Obtain input data, where the input data may have multiple protocol encapsulation formats.
Step 202: Determine whether the input data comes from a real-time file obtained by accessing an API interface in real time. If so (the "Y" branch), execute step 203 and the subsequent steps; otherwise (the "N" branch), execute step 210 and the subsequent steps.
Step 203: Start protocol adaptation processing on the input data from the real-time file.
Step 204: In the protocol adaptation processing, parse the template data out of the input data having multiple protocol encapsulation formats.
Step 205: Convert the template data into JSON format.
Step 206: Extract the data types contained in the JSON-format template data.
Step 207: Compute statistics over the values of each data type in the template data to determine the statistical characteristics of each data type.
Step 208: Write the data types and their statistical characteristics into the structure file in association with each other.
Step 209: Obtain the structure file, which serves as the comparison target for data error detection and data correction, and end the process.
Step 210: Start protocol adaptation processing on the input data from the non-real-time file, parse out the template data, and jump to step 205.
The exemplary process of determining the structure file has been described above. Figure 3 is a schematic diagram of an exemplary process of determining a structure file from a real-time file according to an embodiment of the present invention.
In Figure 3, the template data 40 obtained from the real-time file includes a paths (paths) object 41 and a servers (servers) object 42. By way of example, the template data 40 may also include other objects, such as an API name (api) object.
For example, parsing the path object 41 shows that it contains multiple sub-objects: a first endpoint (endpoint) sub-object 411, a second endpoint sub-object 412, a third endpoint sub-object 413, and so on, and that each endpoint sub-object in turn points to a path item (path item) object 50. The path item object 50 includes sub-objects such as a get operation (get) 51, a put operation (put) 52, a post operation (post) 53 and a delete operation (delete) 54. Each sub-object of the path item object 50 in turn points to an operation object 60. The operation object 60 includes a description object 61, an operation object identifier 62, and so on. The structures of the remaining objects in the template data 40, other than the path object 41, are analyzed in the same way. Based on the object structure topology of the template data, a corresponding object structure topology is established in the structure file.
The data types contained in each object in the template data 40 are then determined. For example, the variable table object 30 is a sub-object of the server object 42. The variable table object 30 contains a variable object 31. The variable object 31 contains multiple data types: string (string) 311, number (number) 312, array (array) 313 and date-time 314. Based on statistics over the values of these data types, the statistical characteristics of each data type can be obtained. For the date-time type 314, the statistical characteristics 414 may include the maximum time interval (maxInterval), the minimum time interval (minInterval), and so on. For the array 313, the statistical characteristics 413 may include the maximum array length (maxNum), the minimum array length (minNum), the average array length (aveNum), and so on. For the string 311, the statistical characteristics 411 may include the maximum string length (maxLen), the minimum string length (minLen), whether values overlap (OverLap), and so on. For the number 312, the statistical characteristics 412 may include the numeric length (numLen), whether the value is zero (isZero), whether the value is null (isNul), and so on.
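The length-based characteristics named here (maxLen, minLen, maxNum, minNum, aveNum) can be computed as in the sketch below. The key names follow the description; the function names and sample values are hypothetical, and OverLap, numLen, isZero and isNul are left out because the text does not define them precisely.

```python
def string_features(strings):
    """Maximum and minimum string length over the observed string values."""
    lengths = [len(s) for s in strings]
    return {"maxLen": max(lengths), "minLen": min(lengths)}


def array_features(arrays):
    """Maximum, minimum and average element count over the observed arrays."""
    sizes = [len(a) for a in arrays]
    return {
        "maxNum": max(sizes),
        "minNum": min(sizes),
        "aveNum": sum(sizes) / len(sizes),
    }


print(string_features(["ab", "abcd"]))
print(array_features([[1, 2, 3], [1]]))
```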
Figure 4 is an exemplary schematic diagram of a data correction process according to an embodiment of the present invention. As shown in Figure 4, the data correction process includes:
Step 401: Obtain the input data within the test time via the API interface.
Step 402: Start protocol adaptation processing on the input data.
Step 403: In the protocol adaptation processing, parse the test data out of the input data having multiple protocol encapsulation formats.
Step 404: Convert the test data into JSON format.
Step 405: Compare the JSON-format test data with the structure file.
Step 406: Based on the comparison result, determine whether the test data is erroneous. If it is (the "Y" branch), execute step 407 and the subsequent steps; otherwise (the "N" branch), execute step 409.
Step 407: Correct the test data based on the structure file.
Step 408: Record the error event and the correction operation.
Step 409: Output the test data.
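The control flow of steps 402 to 409 can be sketched as one function that receives the individual stages as callables. The function name `run_correction_flow` and the toy stages in the usage example are hypothetical; they only illustrate the branching, not the actual adaptation, detection or correction logic.

```python
import json


def run_correction_flow(raw_input, adapt, detect, correct, event_log):
    """Run the correction flow on one piece of input data."""
    test_data = adapt(raw_input)        # steps 402-404: adaptation, JSON conversion
    errors = detect(test_data)          # steps 405-406: compare with structure file
    if errors:                          # "Y" branch
        corrected = correct(test_data)  # step 407: correct the test data
        event_log.append({"errors": errors,
                          "before": test_data,
                          "after": corrected})  # step 408: record event
        test_data = corrected
    return test_data                    # step 409: output the test data


log = []
result = run_correction_flow(
    '{"A": 100}',
    adapt=json.loads,
    detect=lambda d: ["A exceeds max 60"] if d["A"] > 60 else [],
    correct=lambda d: {**d, "A": 60},
    event_log=log,
)
print(result, len(log))
```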
Figure 5 is an exemplary schematic diagram of a process of generating a structure file and performing data correction according to an embodiment of the present invention.
As shown in Figure 5, the real-time file 80 received via the API interface and the instance field 81 of the API design file are provided to the data processing device 90 as input data. The protocol adapter 82 in the data processing device 90 performs protocol adaptation processing on the input data. The data processing device 90 contains a structure file generator 83. The structure parser 831 in the structure file generator 83 determines the data types in the input data. The feature parser 832 in the structure file generator 83 determines the statistical characteristics of the data types. The writer 84 in the data processing device 90 writes the data types determined by the structure parser 831 and the statistical characteristics determined by the feature parser 832 into the structure file in association with each other. The structure file processing layer 92 includes display processing 85 and adjustment processing 91. In the display processing 85, the structure file is displayed. In the adjustment processing 91, the structure file is adjusted based on user-triggered operations.
This completes the generation and adjustment of the structure file. The structure file can now be used to perform data error detection and data correction.
The test data 86 is provided to the data processing device 90. The protocol adapter 87 in the data processing device 90 performs protocol adaptation processing on the test data 86. The data processing device 90 includes a data processor 88. The data processor 88 includes an error detector 881 and a corrector 882. The error detector 881 compares the test data 86 with the structure file obtained from the structure file processing layer 92 to check whether the test data 86 is erroneous. The corrector 882 corrects the erroneous test data 86. When an error occurs in the test data 86, the alarm 89 performs alarm processing and stores a record of the data error event and the correction operation.
Figure 6 is an exemplary structural diagram of a data error detection device according to an embodiment of the present invention. As shown in Figure 6, the data error detection device 600 includes:
a first determination module 601, configured to determine the data types contained in the template data;
a second determination module 602, configured to determine the statistical characteristics of the data type based on statistics over the values of the data type in the template data;
a third determination module 603, configured to determine the structure file based on the data type and the statistical characteristics of the data type;
a receiving module 604, configured to receive test data;
a comparison module 605, configured to compare the test data with the structure file;
a fourth determination module 606, configured to determine, based on the result of the comparison, whether the test data is erroneous.
In an exemplary embodiment, the fourth determination module 606 is configured to perform at least one of the following: when the data types contained in the test data do not match the data types in the structure file, determining that the test data is erroneous; when the value of a data type in the test data does not conform to the statistical characteristics in the structure file, determining that the test data is erroneous; when the value of a data type in the test data does not conform to the statistical characteristics in the structure file and the data types contained in the test data do not match the data types in the structure file, determining that the test data is erroneous; when the value of a data type in the test data conforms to the statistical characteristics in the structure file and the data types contained in the test data match the data types in the structure file, determining that the test data is normal; and so on.
In an exemplary embodiment, the device further includes a correction module 607, configured to: when the test data is erroneous, perform a correction operation on the test data based on the structure file; and record the data error event and the correction operation in a log file.
In an exemplary embodiment, the correction module 607 is configured to perform at least one of the following: correcting the value of the corresponding data type in the test data based on the statistical characteristics of the data type contained in the structure file; and correcting the data type in the test data based on the data type contained in the structure file.
In an exemplary embodiment, the first determination module 601 is configured to determine the objects in the template data and to determine the data types contained in the objects.
In an exemplary embodiment, the device further includes an adjustment module 608, configured to: display the structure file on a human-computer interaction interface; receive a structure file adjustment instruction triggered via the human-computer interaction interface; and, based on the structure file adjustment instruction, adjust the data types in the structure file and/or the statistical characteristics in the structure file.
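Applying an adjustment instruction to the structure file could look like the sketch below. The instruction format (a dictionary with `type`, `feature` and `value` keys) and the function name `apply_adjustment` are assumptions; the embodiment does not specify how the instruction is encoded.

```python
import copy


def apply_adjustment(structure, instruction):
    """Return a copy of the structure with one statistical characteristic changed.

    instruction: {"type": data type name, "feature": key, "value": new value}
    (hypothetical encoding of a user-triggered adjustment).
    """
    updated = copy.deepcopy(structure)  # leave the original structure untouched
    features = updated.setdefault(instruction["type"], {})
    features[instruction["feature"]] = instruction["value"]
    return updated


original = {"vibmax": {"max": 82.0, "min": 72.0}}
adjusted = apply_adjustment(
    original, {"type": "vibmax", "feature": "max", "value": 90.0}
)
print(adjusted)
```

Working on a copy means the previous structure file remains available, for example if the user cancels the adjustment.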
Embodiments of the present invention reduce the effort and time needed to resolve data quality issues and make it easy to automate data governance. Data characteristics are used to abstract a standard schema quickly, which reduces the burden of data quality inspection, records data errors and correction operations effectively, and supports data quality and efficient team collaboration. Moreover, because the structure file is generated automatically, data governance can be carried out flexibly, and the modular architecture is easy to extend and to integrate with existing products, systems or services. Embodiments of the present invention can be used to implement and extend digital systems, cloud services and platform services, improving stability and integration capability.
An embodiment of the present invention also provides a data error detection device with a processor-memory architecture. Figure 7 is an exemplary structural diagram of a data error detection device with a processor-memory architecture according to an embodiment of the present invention.
As shown in Figure 7, the data error detection device 700 includes a processor 701, a memory 702, and a computer program stored in the memory 702 and executable on the processor 701. When executed by the processor 701, the computer program implements any of the data error detection methods described above. The memory 702 may be implemented as any of various storage media, such as electrically erasable programmable read-only memory (EEPROM), flash memory, or programmable read-only memory (PROM). The processor 701 may be implemented to include one or more central processing units or one or more field programmable gate arrays, where a field programmable gate array integrates one or more central processing unit cores. Specifically, the central processing unit or central processing unit core may be implemented as a CPU, an MCU, a DSP, and so on.
It should be noted that not all of the steps and modules in the above processes and structural diagrams are required; some steps or modules may be omitted according to actual needs. The order in which the steps are executed is not fixed and may be adjusted as needed. The division into modules is only a functional division adopted for ease of description. In an actual implementation, one module may be split across multiple modules, and the functions of multiple modules may be implemented by a single module. These modules may be located in the same device or in different devices.
The hardware modules in the various embodiments may be implemented mechanically or electronically. For example, a hardware module may include specially designed permanent circuits or logic devices (such as a dedicated processor, for example an FPGA or ASIC) for performing specific operations. A hardware module may also include programmable logic devices or circuits (for example, including a general-purpose processor or other programmable processor) that are temporarily configured by software to perform specific operations. Whether to implement a hardware module mechanically, with dedicated permanent circuits, or with temporarily configured circuits (for example, configured by software) can be decided based on cost and time considerations.
The above are only preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (18)
- 一种数据检错方法,其特征在于,所述方法包括:A data error detection method, characterized in that the method includes:确定(101)模板数据包含的数据类型;Determine (101) the data type contained in the template data;基于统计所述模板数据中的、所述数据类型的值,确定(102)所述数据类型的统计特征;Determine (102) statistical characteristics of the data type based on statistics of values of the data type in the template data;基于所述数据类型及所述数据类型的统计特征,确定(103)结构文件;Based on the data type and the statistical characteristics of the data type, determine (103) a structural file;接收(104)测试数据;Receive (104) test data;将所述测试数据与所述结构文件进行比较(105);Compare the test data with the structure file (105);基于所述比较的结果,确定(106)所述测试数据是否出错。Based on the results of the comparison, it is determined (106) whether the test data is in error.
- 根据权利要求1所述的数据检错方法,其特征在于,The data error detection method according to claim 1, characterized in that:所述基于所述比较的结果,确定(106)所述测试数据是否出错包括下列中的至少一个:Determining (106) whether the test data is in error based on the result of the comparison includes at least one of the following:当所述测试数据包含的数据类型与所述结构文件中的数据类型不匹配时,确定所述测试数据出错;When the data type contained in the test data does not match the data type in the structure file, it is determined that the test data is in error;当所述测试数据中的、所述数据类型的值不符合所述结构文件中的所述统计特征时,确定所述测试数据出错;When the value of the data type in the test data does not comply with the statistical characteristics in the structure file, it is determined that the test data is in error;当所述测试数据中的、所述数据类型的值不符合所述结构文件中的统计特征以及所述测试数据包含的数据类型与所述结构文件中的所述数据类型不匹配时,确定所述测试数据出错;When the value of the data type in the test data does not conform to the statistical characteristics in the structure file and the data type contained in the test data does not match the data type in the structure file, it is determined that the The above test data is wrong;当所述测试数据中的、所述数据类型的值符合所述结构文件中的统计特征以及所述测试数据包含的数据类型与所述结构文件中的所述数据类型匹配时,确定所述测试数据正常。The test is determined when the value of the data type in the test data conforms to the statistical characteristics in the structure file and the data type contained in the test data matches the data type in the structure file. The data is normal.
- The data error detection method according to claim 2, further comprising: when the test data is erroneous, performing a correction operation on the test data based on the structure file; recording the data error event and the correction operation in a log file.
- The data error detection method according to claim 3, characterized in that performing the correction operation on the test data based on the structure file comprises at least one of the following: correcting the value of the corresponding data type in the test data based on the statistical characteristics of the data type contained in the structure file; correcting the data type in the test data based on the data type contained in the structure file.
- The data error detection method according to claim 1, characterized in that determining (101) the data type contained in the template data comprises: determining a first data type in the template data and an object in the template data; determining a second data type contained in the object; determining the first data type and the second data type as the data types contained in the template data.
- The data error detection method according to any one of claims 1-5, further comprising: displaying the structure file on a human-computer interaction interface; receiving a structure file adjustment instruction triggered via the human-computer interaction interface; adjusting, based on the structure file adjustment instruction, the data type in the structure file and/or the statistical characteristics in the structure file.
- The data error detection method according to any one of claims 1-5, characterized in that the statistical characteristics comprise at least one of the following: the maximum length of the data type; the minimum length of the data type; the maximum value of the data type; the minimum value of the data type; the average value of the data type; the maximum time interval of the data type; the minimum time interval of the data type; the average time interval of the data type.
- The data error detection method according to any one of claims 1-5, characterized in that the template data is contained in an instance field of an application programming interface design file, or is contained in a real-time file obtained by accessing an application programming interface in real time.
- The data error detection method according to any one of claims 1-5, characterized in that, before determining (101) the data type contained in the template data, the method further comprises: parsing the template data out of input data having multiple protocol encapsulation formats; converting the template data into a predetermined data exchange format.
- A data processing device, characterized in that the device comprises: a first determination module (601) configured to determine a data type contained in template data; a second determination module (602) configured to determine statistical characteristics of the data type based on statistics of the values of the data type in the template data; a third determination module (603) configured to determine a structure file based on the data type and the statistical characteristics of the data type; a receiving module (604) configured to receive test data; a comparison module (605) configured to compare the test data with the structure file; a fourth determination module (606) configured to determine, based on a result of the comparison, whether the test data is erroneous.
- The data processing device according to claim 10, characterized in that the fourth determination module (606) is configured to perform at least one of the following: when the data type contained in the test data does not match the data type in the structure file, determining that the test data is erroneous; when a value of the data type in the test data does not conform to the statistical characteristics in the structure file, determining that the test data is erroneous; when a value of the data type in the test data does not conform to the statistical characteristics in the structure file and the data type contained in the test data does not match the data type in the structure file, determining that the test data is erroneous; when a value of the data type in the test data conforms to the statistical characteristics in the structure file and the data type contained in the test data matches the data type in the structure file, determining that the test data is normal.
- The data processing device according to claim 11, further comprising: a correction module (607) configured to, when the test data is erroneous, perform a correction operation on the test data based on the structure file, and to record the data error event and the correction operation in a log file.
- The data processing device according to claim 12, characterized in that the correction module (607) is configured to perform at least one of the following: correcting the value of the corresponding data type in the test data based on the statistical characteristics of the data type contained in the structure file; correcting the data type in the test data based on the data type contained in the structure file.
- The data error detection device according to claim 10, characterized in that the first determination module (701) is configured to: determine a first data type in the template data and an object in the template data; determine a second data type contained in the object; determine the first data type and the second data type as the data types contained in the template data.
- The data error detection device according to any one of claims 10-14, further comprising: an adjustment module (608) configured to display the structure file on a human-computer interaction interface; receive a structure file adjustment instruction triggered via the human-computer interaction interface; and adjust, based on the structure file adjustment instruction, the data type in the structure file and/or the statistical characteristics in the structure file.
- An electronic device, characterized by comprising: a processor (701); a memory (702) for storing executable instructions of the processor (701); the processor (701) being configured to read the executable instructions from the memory (702) and to execute the executable instructions to implement the data error detection method according to any one of claims 1-9.
- A computer-readable storage medium having computer instructions stored thereon, characterized in that the computer instructions, when executed by a processor, implement the data error detection method according to any one of claims 1-9.
- A computer program product, characterized by comprising a computer program that, when executed by a processor, implements the data error detection method according to any one of claims 1-9.
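The flow recited in claims 1-5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names (`flatten`, `build_structure`, `check`, `correct`), the use of a plain dictionary as the "structure file", the dotted-path naming of fields inside nested objects, and the clamping policy used for correction are all assumptions made for illustration. The sketch also assumes that every template record uses a consistent type for each field, and it covers only length and value statistics, not time intervals.

```python
from statistics import mean


def flatten(record, prefix=""):
    """Yield (path, value) pairs, descending into nested objects (dicts)."""
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


def build_structure(template_records):
    """Steps 101-103: derive a hypothetical 'structure file' (here a dict)
    mapping each field to its type name and simple statistics."""
    samples = {}
    for record in template_records:
        for path, value in flatten(record):
            samples.setdefault(path, []).append(value)
    structure = {}
    for path, values in samples.items():
        entry = {"type": type(values[0]).__name__}
        if isinstance(values[0], str):
            lengths = [len(v) for v in values]
            entry.update(min_len=min(lengths), max_len=max(lengths))
        elif isinstance(values[0], (int, float)):
            entry.update(min=min(values), max=max(values), mean=mean(values))
        structure[path] = entry
    return structure


def check(test_record, structure):
    """Steps 105-106: return a list of error strings; empty means normal."""
    errors = []
    fields = dict(flatten(test_record))
    for path, spec in structure.items():
        if path not in fields:
            errors.append(f"{path}: missing")
            continue
        value = fields[path]
        actual = type(value).__name__
        if actual != spec["type"]:
            errors.append(f"{path}: type {actual} != {spec['type']}")
        elif "max_len" in spec and not spec["min_len"] <= len(value) <= spec["max_len"]:
            errors.append(f"{path}: length {len(value)} out of range")
        elif "max" in spec and not spec["min"] <= value <= spec["max"]:
            errors.append(f"{path}: value {value} out of range")
    for path in sorted(fields.keys() - structure.keys()):
        errors.append(f"{path}: unexpected field")
    return errors


def correct(test_record, structure):
    """Assumed correction policy: clamp numeric values into the template range.
    Returns the corrected (flattened) fields and a list of log entries."""
    fields, log = dict(flatten(test_record)), []
    for path, spec in structure.items():
        value = fields.get(path)
        if "max" in spec and type(value).__name__ == spec["type"]:
            clamped = min(max(value, spec["min"]), spec["max"])
            if clamped != value:
                log.append(f"{path}: {value} -> {clamped}")
                fields[path] = clamped
    return fields, log
```

As a usage illustration, a structure built from two template records such as `{"id": "A1", "temp": 20.5, "pos": {"x": 1, "y": 2}}` and `{"id": "B22", "temp": 23.0, "pos": {"x": 3, "y": 5}}` would flag a test record whose `temp` is 99.0 (value outside the observed range) or whose `pos.x` is the string `"2"` (type mismatch). Types are compared by name, so an integer supplied for a field learned as `float` is also reported as a mismatch; a real system might choose a more lenient rule.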
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/108403 WO2024020898A1 (en) | 2022-07-27 | 2022-07-27 | Data error detection method, apparatus, electronic device, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/108403 WO2024020898A1 (en) | 2022-07-27 | 2022-07-27 | Data error detection method, apparatus, electronic device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024020898A1 (en) | 2024-02-01 |
Family
ID=89704928
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/108403 WO2024020898A1 (en) | 2022-07-27 | 2022-07-27 | Data error detection method, apparatus, electronic device, and storage medium |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024020898A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109726103A (en) * | 2018-05-14 | 2019-05-07 | 平安科技(深圳)有限公司 | Generation method, device, equipment and the storage medium of test report |
CN112232881A (en) * | 2020-10-22 | 2021-01-15 | 腾讯科技(深圳)有限公司 | Data detection method and device, electronic equipment and storage medium |
CN113836038A (en) * | 2021-10-21 | 2021-12-24 | 中国平安人寿保险股份有限公司 | Test data construction method, device, equipment and storage medium |
CN113886242A (en) * | 2021-09-29 | 2022-01-04 | 平安银行股份有限公司 | Data processing method, device, terminal and storage medium |
CN114328274A (en) * | 2022-03-07 | 2022-04-12 | 深圳开源互联网安全技术有限公司 | Test template generation method and device, computer equipment and storage medium |
- 2022-07-27 WO PCT/CN2022/108403 patent/WO2024020898A1/en unknown
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106656536B (en) | | Method and equipment for processing service calling information |
US9576037B2 (en) | | Self-analyzing data processing job to determine data quality issues |
US11297144B2 (en) | | Systems and methods for operation management and monitoring of bots |
US8473916B2 (en) | | Method and system for providing a testing framework |
US9697104B2 (en) | | End-to end tracing and logging |
CN109471890A (en) | | Generation method, terminal device and the medium of report file |
US10963634B2 (en) | | Cross-platform classification of machine-generated textual data |
CN111083225A (en) | | Data processing method and device in Internet of things platform and Internet of things platform |
US8615526B2 (en) | | Markup language based query and file generation |
CN110908890A (en) | | Automatic test method and device for interface |
US20200210401A1 (en) | | Proactive automated data validation |
US20200257698A1 (en) | | Data array of objects indexing |
CN110019116B (en) | | Data tracing method, device, data processing equipment and computer storage medium |
US10831647B2 (en) | | Flaky test systems and methods |
US10922207B2 (en) | | Method, apparatus, and computer-readable medium for maintaining visual consistency |
CN114840213A (en) | | Service instance configuration management method and device |
US20200210389A1 (en) | | Profile-driven data validation |
WO2024020898A1 (en) | | Data error detection method, apparatus, electronic device, and storage medium |
CN116700778B (en) | | Interface difference analysis method, device, storage medium and apparatus |
US12072838B2 (en) | | Method for generating a coherent representation for at least two log files |
Zuo et al. | | Temporal relations extraction and analysis of log events for micro-service framework |
US20240192970A1 (en) | | Automated user interface generation for an application programming interface (api) |
Abreu | | Development of a centralized log management system |
US20200192902A1 (en) | | Subscription handling and in-memory alignment of unsynchronized real-time data streams |
CN116931965B (en) | | Integrated stream processing method, device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22952360; Country of ref document: EP; Kind code of ref document: A1 |