CN116010388A - Data verification method, data acquisition server and data verification system - Google Patents

Publication number
CN116010388A
Authority
CN
China
Prior art keywords
data
attribute information
batch
reported
acquisition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211698617.6A
Other languages
Chinese (zh)
Inventor
陈天宇
沈汪洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Bilibili Technology Co Ltd
Original Assignee
Shanghai Bilibili Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Bilibili Technology Co Ltd filed Critical Shanghai Bilibili Technology Co Ltd
Priority to CN202211698617.6A priority Critical patent/CN116010388A/en
Publication of CN116010388A publication Critical patent/CN116010388A/en
Pending legal-status Critical Current

Abstract

The application provides a data verification method, a data acquisition server and a data verification system, wherein the data verification method comprises the following steps: the data acquisition server acquires attribute information to be checked from the service database at intervals of set time, wherein the attribute information to be checked is unchecked reported attribute information stored in the service database, and the reported attribute information is determined based on batch data reported by the service module; verifying the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, wherein the target acquisition attribute information is attribute information recorded in statistics when a data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified; and updating the verification state of the attribute information to be verified in the service database according to the verification result. Therefore, the data acquisition server can timely sense the abnormality occurring in the process of acquiring the target batch data, and the integrity, accuracy and high availability of the data are ensured.

Description

Data verification method, data acquisition server and data verification system
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data verification method. The application also relates to a data acquisition server, a data verification system, a computing device and a computer readable storage medium.
Background
With the rapid development of computer and internet technology, a wide variety of online services have emerged to meet people's work and daily-life needs, covering an ever broader range of fields and scenarios. These online services generate large amounts of service data. To collect the data of all service parties, so that they can use a big data platform to search the data and view the associations among the data, each service party is expected to report its data to a system that manages the data of all service parties in a unified manner.
In the prior art, a service module actively reports data through message middleware in order to collect the data of each service party. Once the service party's system or the message middleware becomes unstable, the reported data may be lost. The system cannot detect this data loss in time and can only locate and backfill the missing data after a user discovers that the data cannot be found, so the integrity, accuracy, and high availability of the data in the system cannot be guaranteed.
Disclosure of Invention
In view of this, embodiments of the present application provide a data verification method. The application also relates to a data acquisition server, a data verification system, a computing device, and a computer-readable storage medium, so as to solve the technical problems in the prior art that a system cannot detect data loss in time and that the integrity, accuracy, and high availability of the data cannot be guaranteed.
According to a first aspect of an embodiment of the present application, a data verification method is provided, applied to a data acquisition server, and the method includes:
acquiring attribute information to be checked from a service database at intervals of set time, wherein the attribute information to be checked is unchecked reported attribute information stored in the service database, and the reported attribute information is determined based on batch data reported by a service module;
verifying the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, wherein the target acquisition attribute information is attribute information recorded in statistics when a data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified;
and updating the verification state of the attribute information to be verified in the service database according to the verification result.
According to a second aspect of an embodiment of the present application, there is provided a data acquisition server, including:
the acquisition module is configured to acquire attribute information to be checked from the service database at intervals of set time, wherein the attribute information to be checked is unchecked reported attribute information stored in the service database, and the reported attribute information is determined based on batch data reported by the service module;
the verification module is configured to verify the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, wherein the target acquisition attribute information is attribute information recorded in statistics when a data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified;
and the updating module is configured to update the verification state of the attribute information to be verified in the service database according to the verification result.
According to a third aspect of embodiments of the present application, there is provided a data verification system, including at least one service module, a data collector, and a data collection server;
the data collector is configured to determine reporting attribute information corresponding to the batch data based on the batch data reported by the service module, and store the reporting attribute information into the service database;
the data acquisition server is configured to acquire attribute information to be checked from the service database at intervals of set time, wherein the attribute information to be checked is unchecked reported attribute information stored in the service database; to verify the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, wherein the target acquisition attribute information is attribute information recorded in statistics when the data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified; and to update the verification state of the attribute information to be verified in the service database according to the verification result.
According to a fourth aspect of embodiments of the present application, there is provided a computing device comprising:
a memory and a processor;
the memory is used for storing computer executable instructions, and the processor is used for executing the computer executable instructions to implement the operation steps of the data verification method.
According to a fifth aspect of embodiments of the present application, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the operation steps of the data verification method described above.
According to the data verification method provided by the embodiment of the application, the data acquisition server acquires attribute information to be verified from the service database at intervals of set time, wherein the attribute information to be verified is unchecked reporting attribute information stored in the service database, and the reporting attribute information is determined based on batch data reported by the service module; verifying the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, wherein the target acquisition attribute information is attribute information recorded in statistics when a data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified; and updating the verification state of the attribute information to be verified in the service database according to the verification result.
In this way, the reported attribute information determined from the batch data reported by the service module can be stored in the service database, and the attribute information that the data acquisition server counts and records when acquiring the target batch data can be stored in the cache database. The data acquisition server can acquire the attribute information to be checked from the service database at intervals of a set duration and then check it against the corresponding target acquisition attribute information acquired from the cache database. Once the attribute information to be checked is inconsistent with the corresponding target acquisition attribute information, the data acquisition server can actively detect the inconsistency in time. The data acquisition server can therefore detect anomalies that occur while acquiring the target batch data without waiting for a user to discover them, which helps ensure the integrity, accuracy, and high availability of the data.
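The comparison described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: it assumes plain dictionaries stand in for the service database and the cache database, and that the only attribute compared is the data volume. The names `verify_pending`, `service_db`, `cache_db`, and the state labels are hypothetical.

```python
from typing import Dict

# Hypothetical state labels; the source only distinguishes unverified,
# verification passed, and verification failed.
UNVERIFIED, PASSED, FAILED = "unverified", "passed", "failed"


def verify_pending(service_db: Dict[str, dict], cache_db: Dict[str, int]) -> Dict[str, str]:
    """Check every unverified reported record against the cached acquired count."""
    results = {}
    for batch_id, record in service_db.items():
        if record["state"] != UNVERIFIED:
            continue  # already checked in an earlier pass
        acquired = cache_db.get(batch_id)  # None if the batch never arrived
        ok = acquired == record["reported_count"]
        record["state"] = PASSED if ok else FAILED
        results[batch_id] = record["state"]
    return results


service_db = {
    "batch-1": {"reported_count": 1000, "state": UNVERIFIED},
    "batch-2": {"reported_count": 1000, "state": UNVERIFIED},
    "batch-3": {"reported_count": 500, "state": PASSED},
}
cache_db = {"batch-1": 1000, "batch-2": 998}  # batch-2 lost two items in transit
outcome = verify_pending(service_db, cache_db)
```

In this sketch a batch with no cached count is treated as failed immediately; a real deployment might instead wait for a grace period before deciding.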
Drawings
FIG. 1 is a flow chart of a data verification method according to an embodiment of the present application;
FIG. 2 is a process flow diagram of a data verification method applied to metadata according to an embodiment of the present application;
FIG. 3 is an interactive schematic diagram of a data verification process according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a data acquisition server according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a data verification system according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of a computing device according to one embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. This application is, however, susceptible of embodiment in many other ways than those herein described and similar generalizations can be made by those skilled in the art without departing from the spirit of the application and the application is therefore not limited to the specific embodiments disclosed below.
The terminology used in one or more embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of one or more embodiments of the application. As used in this application in one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present application refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present application. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
First, terms related to one or more embodiments of the present application will be explained.
Incremental data: data captured for synchronization after a certain time (update time) or after a checkpoint, rather than synchronizing the full data set indiscriminately.
Data acquisition: in general, data acquisition is the process in which various parameters of a measured object are converted by sensors and then transmitted to a controller through steps such as signal conditioning, sampling, quantization, encoding, and transmission. In the present service scenario, it refers to collecting various metadata information from each service party and reporting it to the system after processing such as data filtering and conversion.
Bitmap algorithm (bitmap): the core idea of the bitmap algorithm is to record two states of 0-1 with a bit array, and then map specific data to specific positions of the bit array, where a bit set to 0 indicates that data is absent and a bit set to 1 indicates that data is present.
Agent-Pull mode: a centralized service performs unified data acquisition on the service parties whose data needs to be collected. Once the specification of the acquisition interface is set, the service party no longer needs to concern itself with data output and reporting, which are actively controlled entirely by the Agent.
Metadata: one simple definition is data that describes data. In an enterprise, wherever there is data, there is corresponding metadata. Only with complete and accurate metadata can the data be better understood and its value fully utilized. In a big data platform, metadata contains the resource data used inside the platform and the associated lineage data.
Data verification system: a system that implements data acquisition and verification. It collects and verifies the data information of all service modules in the big data platform, cleans the heterogeneous data information, converts it into a unified format for storage, and builds the associations among service metadata from the acquired metadata, which facilitates unified queries by service parties and reduces mutual calls among the service systems in the platform.
To collect the data of all service parties, each service party is expected to report its data to a system that manages the data of all service parties in a unified manner, so that service parties can use the big data platform to search the data and view the associations among the data.
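The bitmap algorithm defined in the terms above can be illustrated with a short sketch. It assumes a fixed-size bit array backed by a `bytearray`; the class name `Bitmap` and its methods are illustrative and do not come from the source.

```python
class Bitmap:
    """Fixed-size bit array backed by a bytearray."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.bits = bytearray((size + 7) // 8)  # all bits start at 0

    def set(self, pos: int) -> None:
        """Mark the data mapped to this position as present."""
        self.bits[pos // 8] |= 1 << (pos % 8)

    def test(self, pos: int) -> bool:
        """Return True if the bit at this position is 1."""
        return bool(self.bits[pos // 8] & (1 << (pos % 8)))

    def count(self) -> int:
        """Number of positions currently set to 1."""
        return sum(bin(b).count("1") for b in self.bits)


bm = Bitmap(16)
for value in (3, 7, 3, 12):  # 3 appears twice but occupies one bit
    bm.set(value)
```

Because setting a bit that is already 1 changes nothing, the count reflects distinct values only.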
In the prior art, a service module actively reports data through message middleware in order to collect the data of each service party. Once the service party's system or the message middleware becomes unstable, the reported data may be lost. The system cannot detect this data loss in time and can only locate and backfill the missing data after a user discovers that the data cannot be found, so the integrity, accuracy, and high availability of the data in the system cannot be guaranteed.
Therefore, embodiments of the present application provide a data verification method in which incremental data is actively acquired from service parties in an Agent-Pull mode, and the integrity of the finally acquired data is ensured by comparing the reported data volume with the consumed data volume during acquisition. Once the data volumes are found to be inconsistent, the system can actively detect the inconsistency, and the missing data can be backfilled through a subsequent automatic backfill flow, which helps ensure high quality and high availability of the data.
In the present application, a data verification method is provided. The present application also relates to a data acquisition server, a data verification system, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.
Fig. 1 shows a flowchart of a data verification method according to an embodiment of the present application, which is applied to a data acquisition server, and specifically includes the following steps:
Step 102: acquire attribute information to be checked from the service database at intervals of a set duration, wherein the attribute information to be checked is unchecked reporting attribute information stored in the service database, and the reporting attribute information is determined based on batch data reported by the service module.
Specifically, the set duration is a preset time interval at which the unchecked reporting attribute information in the service database is checked; for example, the set duration may be 3 seconds or 5 seconds. The service database is a database for storing data information related to each service party, the attribute information to be checked is the unchecked reporting attribute information stored in the service database, and the reporting attribute information is determined based on batch data reported by the service module.
The batch data is one batch of data reported by the service module, and the data may be actual service data or metadata.
It should be noted that the data acquisition server may acquire the attribute information to be checked from the service database at intervals of the set duration through a data quality checking thread, so that the attribute information to be checked can subsequently be checked. In addition, before the attribute information to be checked is acquired, the batch data reported by the service module may be acquired through the data collector, and the reporting attribute information of the reported batch data may be written into the service database.
In one possible implementation, a full update may be adopted: the data collector sends a data acquisition request to the service module, and after receiving the request, the service module reports all of its data to the data collector as batch data. Because the data volume in a full update is large, the data may be reported in batches. In that case the data acquisition request carries an offset, which is the amount of data in one batch. After receiving the request, the service module determines the start identifier and data amount of one batch based on the offset, acquires the corresponding amount of batch data according to the start identifier and data amount, and reports it to the data collector.
In another possible implementation, an incremental update may be adopted to save processing resources: the data collector sends the service module a data acquisition request that carries an incremental update point, and the service module reports the data updated after the incremental update point to the data collector as batch data. Because a large amount of data may suddenly be updated after a given incremental update point, the incremental data volume may be too large, so the data may again be reported in batches. In that case the data acquisition request carries both the incremental update point and the offset. After receiving the request, the service module determines the start identifier and data amount of the batch based on the incremental update point and the offset, acquires the corresponding amount of batch data according to the start identifier and data amount, and reports it to the data collector.
The incremental update point refers to an update time point or a check point where update data needs to be obtained, and the offset is the amount of data that needs to be reported by a batch.
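The two request styles described above can be sketched together. This is only an illustration: the `AcquisitionRequest` class, the `handle_request` function, and the `updated_at` field are hypothetical, and the update point is modeled as an integer timestamp for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AcquisitionRequest:
    # None means full update; otherwise only rows updated after this point.
    incremental_update_point: Optional[int]
    offset: int  # amount of data in one batch, as the term is used in the text


def handle_request(rows: List[dict], req: AcquisitionRequest) -> List[dict]:
    """Service-module side: pick one batch for the request."""
    if req.incremental_update_point is None:
        candidates = rows
    else:
        candidates = [r for r in rows if r["updated_at"] > req.incremental_update_point]
    return candidates[: req.offset]


rows = [{"id": i, "updated_at": i * 10} for i in range(1, 11)]  # updated_at: 10..100
full_batch = handle_request(rows, AcquisitionRequest(None, 4))
incr_batch = handle_request(rows, AcquisitionRequest(70, 4))
```

The incremental request returns fewer rows than the offset here because only three rows were updated after the update point.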
The incremental update point may be a time or a data ID. Different service modules may update data in different ways. If data is updated along the time dimension, new data is added as time passes and its data ID increases, but when historical data is modified it keeps its old ID, so the data IDs do not necessarily increase in sequence; for example, the data updated after a certain time point may be 10, 11, 12, 5, 7, 8, 13, and so on. If data is updated along the data ID dimension, each update adds a new ID, so the data IDs increase in sequence; for example, if the update point is ID 10, the data updated after the update point is 11, 12, 13, and so on.
The data collector can store the update modes of the different service parties together with their historical update start points and end points. When sending a data acquisition request to a service module, the data collector can determine the update mode corresponding to that service module and include the determined incremental update point and offset in the request, so that the service module can acquire the corresponding batch data and report it to the data collector.
For example, assume that updates are made along the time dimension, the incremental update point is 11:00, and the offset is 1000. The service module may determine that the data updated after 11:00 is data 10, 11, 12, 5, 7, 8, 13, 14 … … 1001, … …. It may then determine that the start identifier is 10 and the data amount is 1000, report the 1000 pieces of data starting from data 10 to the data collector as one batch, and record the data identifier of the last piece of data in the batch as the end identifier.
It should be noted that, because the data collector actively collects the data of the service module, the data collector can detect in time when the service module is unavailable, and after the service module recovers, it can collect the incremental data after the specified incremental update point, which prevents data loss caused by instability of the service module.
In practical application, when the service module receives a data acquisition request that carries an incremental update point and an offset, it may first determine the data updated after the incremental update point, then select the corresponding amount of data as batch data based on the offset and report it to the data collector. A start identifier and an end identifier may be recorded for each batch. The next batch is then selected after the end identifier of the current batch, and reporting continues until all the data has been reported.
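The difference between the two update dimensions can be made concrete with a small sketch. The toy records and the helper names `updated_after_time` and `updated_after_id` are assumptions for illustration; times are written as integers such as 1105 for readability.

```python
from typing import List, Tuple

# (data_id, updated_at) pairs; ids 5, 7 and 8 are old records modified later.
records: List[Tuple[int, int]] = [
    (5, 1105), (7, 1106), (8, 1107),
    (9, 1050), (10, 1101), (11, 1102), (12, 1103), (13, 1108),
]


def updated_after_time(recs: List[Tuple[int, int]], point: int) -> List[int]:
    """Time dimension: order by update time, so ids may go backwards."""
    hits = [r for r in recs if r[1] > point]
    return [data_id for data_id, _ in sorted(hits, key=lambda r: r[1])]


def updated_after_id(recs: List[Tuple[int, int]], point: int) -> List[int]:
    """ID dimension: every update creates a new, larger id."""
    return sorted(data_id for data_id, _ in recs if data_id > point)


by_time = updated_after_time(records, 1100)
by_id = updated_after_id(records, 10)
```

With a time-based update point the resulting ID sequence is not monotonic, which is why a batch needs explicit start and end identifiers rather than an ID range.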
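The batch-by-batch reporting loop described above might look like the following sketch. The generator name `report_in_batches` is hypothetical, and the list of updated IDs is assumed to be already ordered by the service module.

```python
from typing import Iterator, List, Tuple


def report_in_batches(updated_ids: List[int], offset: int) -> Iterator[Tuple[int, int, List[int]]]:
    """Yield (start_identifier, end_identifier, batch) until all data is reported."""
    cursor = 0
    while cursor < len(updated_ids):
        batch = updated_ids[cursor : cursor + offset]
        yield batch[0], batch[-1], batch
        cursor += len(batch)  # next batch starts right after the end identifier


updated = [10, 11, 12, 5, 7, 8, 13, 14]  # data updated after the incremental update point
batches = list(report_in_batches(updated, offset=3))
```

The last batch may hold fewer items than the offset when the remaining data runs out.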
The data collector can receive the batch data reported by the service module, determine reporting attribute information corresponding to the batch data based on the batch data reported by the service module, and store the reporting attribute information into the service database. The reporting attribute information is related attributes of batch data reported to the data collector by the service module, and the reporting attribute information can include data quantity, and can also include other attributes capable of identifying data information and data quality, such as data size, data description and the like.
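As an illustration of what the reporting attribute information could contain, the sketch below derives a data count, a serialized size, and the start and end identifiers from one batch. The function `build_reporting_attributes` and all field names are assumptions; the source only states that the attributes may include the data amount and other descriptive properties.

```python
import json
from typing import Dict, List


def build_reporting_attributes(batch_id: str, batch: List[dict]) -> Dict[str, object]:
    """Collector side: summarize one reported batch for the service database."""
    payload = json.dumps(batch, sort_keys=True).encode("utf-8")
    return {
        "batch_id": batch_id,
        "data_count": len(batch),
        "data_size_bytes": len(payload),
        "start_id": batch[0]["id"] if batch else None,
        "end_id": batch[-1]["id"] if batch else None,
        "state": "unverified",  # newly written records have not been checked yet
    }


attrs = build_reporting_attributes("batch-1", [{"id": 10}, {"id": 11}, {"id": 12}])
```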
In this embodiment of the present application, the data collector may send a data acquisition request to the service module. After receiving the request, the service module reports the data updated after the incremental update point to the data collector in batches, and the data collector determines the reporting attribute information of each batch of data and writes it into the service database. The data acquisition server may acquire the attribute information to be checked from the service database at intervals of the set duration and check each piece of reporting attribute information to determine whether an anomaly occurred while the batch data was being acquired. Data is thus collected in a stream of batches, and comparing the reporting attribute information of each batch helps ensure that the data of every batch is complete and available.
Furthermore, the data collector needs to collect the data of multiple service modules and transmit it to the data acquisition server, and the data acquisition server needs to store and process the data of each service module. Before the method is implemented, staff may therefore first coordinate with each service party to unify the format of the data to be reported and to confirm the checkpoint of the data, such as a timestamp or a data ID, so that incremental data can be pulled through the checkpoint. In addition, the service module of each service party needs to develop its interface according to the interface specification that the data collector uses to pull data. Data reporting and data quality verification start only after the interface is confirmed to be callable and to meet the interface specification.
In the embodiment of the application, the service module may provide an HTTP interface for reporting data. The HTTP interface needs to allow the data collector to call it to obtain data, and may support obtaining the incremental data between a specified start identifier and a specified end identifier (generally a time interval). The data collector polls the service module and obtains all the incremental data in the time interval in batches, so that no single HTTP request is large enough to cause a timeout.
In an optional implementation manner of this embodiment, the attribute information to be verified includes a verification state, and obtaining the attribute information to be verified from the service database includes:
traversing each reporting attribute information stored in the service database;
and taking, among the reporting attribute information, the reporting attribute information whose check state is unverified as the attribute information to be checked.
In practical application, each piece of reporting attribute information stored in the service database may include a corresponding check state. The check state is either an unverified state or a verified state, and the verified state is further divided into a verification-failed state and a verification-passed state.
It should be noted that, each piece of reported attribute information stored in the service database may include a corresponding verification state, so that the data quality verification thread in the data acquisition server may traverse each piece of reported attribute information stored in the service database, and use the reported attribute information whose verification state is not verified in each piece of reported attribute information as the attribute information to be verified, so as to facilitate the subsequent verification. Therefore, each piece of reported attribute information in the service database can be recorded with verification conditions, so that the inquiry and management are convenient.
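The traversal and the periodic check can be sketched together. This is a simplified, single-threaded stand-in for the data quality verification thread: the function names are hypothetical, the check itself is passed in as a callable, and the loop runs a bounded number of rounds so that the example terminates.

```python
import time
from typing import Callable, Dict, List


def select_to_verify(service_db: Dict[str, dict]) -> List[str]:
    """Traverse all reporting records and keep the unverified ones."""
    return [bid for bid, rec in service_db.items() if rec["state"] == "unverified"]


def quality_check_loop(
    service_db: Dict[str, dict],
    check: Callable[[str], bool],
    interval_s: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run a bounded number of check rounds, pausing interval_s between them."""
    for _ in range(rounds):
        for bid in select_to_verify(service_db):
            service_db[bid]["state"] = "passed" if check(bid) else "failed"
        sleep(interval_s)


db = {
    "b1": {"state": "unverified"},
    "b2": {"state": "passed"},
    "b3": {"state": "unverified"},
}
pending = select_to_verify(db)
# The sleep function is stubbed so the example finishes immediately.
quality_check_loop(db, check=lambda bid: bid != "b3", interval_s=3, rounds=1, sleep=lambda s: None)
```

Records that were already verified are skipped on every round, so each batch is checked only once.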
In an optional implementation manner of this embodiment, before obtaining the attribute information to be verified from the service database at intervals of a set duration, the method further includes:
Acquiring batch data reported by a service module, wherein the batch data is corresponding quantity of incremental data determined by the service module based on incremental update points and offset;
and determining the acquisition attribute information of the batch data, and writing the acquisition attribute information into a cache database.
It should be noted that, after the data collector receives the batch data reported by the service module, it may also write the batch data into the message middleware. The data acquisition server may then obtain the batch data reported by the service module from the message middleware through a data obtaining thread, determine the acquisition attribute information of the batch data, and write the acquisition attribute information into the cache database.
The message middleware is a component for temporarily storing the batch data reported by the service module, which prevents a large amount of data from backing up on the data collector. It may be a message queue, Kafka (a high-throughput distributed publish-subscribe messaging system that can process all the action stream data of consumers in a website), or another intermediate component that can store data.
In addition, the acquisition attribute information is the data recorded when the data acquisition server acquires the batch data, and it corresponds to the reporting attribute information: if the reporting attribute information includes the reported data amount, the acquisition attribute information includes the acquired data amount. This makes it convenient to compare the two later and determine whether the data acquired by the data acquisition server has been lost, damaged, or otherwise affected by anomalies.
Furthermore, the cache database may refer to a database capable of temporarily storing information, and in this embodiment, the cache database may be a Redis database.
In the embodiment of the application, after the data acquisition server acquires the batch data reported by the service module, the acquisition attribute information of the batch data can be determined, and the acquisition attribute information is written into the cache database, so that the corresponding acquisition attribute information can be acquired from the cache database later, the reported attribute information is checked, the quality of the acquired data is ensured, and the data integrity and availability are improved.
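A rough sketch of this consumer side is shown below, assuming an in-process `queue.Queue` stands in for the message middleware and a `Counter` stands in for the cache database. The message layout and the function name `drain_into_cache` are hypothetical.

```python
import queue
from collections import Counter


def drain_into_cache(middleware: queue.Queue, cache_db: Counter) -> None:
    """Consume every pending message and count items per batch identifier."""
    while True:
        try:
            message = middleware.get_nowait()
        except queue.Empty:
            return
        cache_db[message["batch_id"]] += 1


middleware = queue.Queue()
for data_id in (10, 11, 12):
    middleware.put({"batch_id": "batch-1", "data_id": data_id})
middleware.put({"batch_id": "batch-2", "data_id": 5})

cache_db = Counter()  # stand-in for the cache database
drain_into_cache(middleware, cache_db)
```

A plain counter like this one counts a redelivered message twice, so it is only suitable when the middleware delivers each message once.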
In addition, in practical application, after the data acquisition server acquires the batch data reported by the service module, the acquired batch data can be written into the service database in addition to determining the acquired attribute information of the batch data and writing the acquired attribute information into the cache database; that is, the service database may store data obtained from each service module for unified query and management by the user, so as to reduce the inter-call between each service system in the platform.
In an optional implementation manner of this embodiment, determining the acquisition attribute information of the batch data includes:
Bitmap operation is carried out on batch data, and corresponding acquired attribute information is obtained.
It should be noted that each batch of data has a unique batch identifier. Performing the bitmap operation on one batch of data yields the corresponding acquired attribute information; that is, the data amount of that batch is represented in bitmap form.
In the embodiment of the application, the data amount of each batch is counted with a bitmap algorithm. Because setting a position that is already set leaves the bitmap unchanged, data that is acquired repeatedly when the message middleware behaves abnormally does not interfere with verifying the integrity of the data amount of each batch, so the accuracy of data verification is ensured.
In an optional implementation manner of this embodiment, bitmap operation is performed on batch data to obtain corresponding acquired attribute information, including:
acquiring initial bitmap information, wherein each element in the initial bitmap information is a first numerical value;
determining a corresponding target position of the data in the initial bitmap information for each data in the batch data;
and setting the element at the target position to a second value from the first value, obtaining the acquired attribute information corresponding to the batch data, and carrying the batch identification of the batch data in the acquired attribute information.
Specifically, the first value is an initial value in the initial bitmap information, the second value is a value indicating that there is data at the corresponding position, the first value may be 0, and the second value may be 1.
It should be noted that, the initial bitmap information includes a plurality of elements, each element is a first value, for each data in one lot of data, a target position corresponding to the data in the initial bitmap information may be determined, the element at the target position is set from the first value to a second value, the acquired attribute information corresponding to the lot of data is obtained, and the acquired attribute information carries a lot identifier of the lot of data.
For example, the initial bitmap information is [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]. Assuming that one batch of data contains 10 pieces of data whose target positions in the initial bitmap information are the 3rd, 4th, 5th, 7th, 8th, 10th, 12th, 13th, 15th, and 16th respectively, the resulting acquired attribute information is [0,0,1,1,1,0,1,1,0,1,0,1,1,0,1,1,0,0,0].
In the embodiment of the application, how many elements in the obtained acquired attribute information are the second numerical value indicates how many data are included in the batch data, and the numerical value of the elements in the bitmap information is used for representing the data volume of the batch data, so that even if the message middleware is abnormal, the abnormal condition that the data is repeatedly acquired does not interfere with the verification of the integrity of each batch of data volume, and the accuracy of data verification is ensured.
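The bitmap operation described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the dictionary layout, the batch identifier `batch-001`, and the use of 1-based positions are assumptions made for the example. It reuses the positions from the example above and delivers two of them twice to show that repeated acquisition does not change the count.

```python
def build_bitmap(batch_id, positions, size):
    """Return acquired attribute information for one batch as a bitmap."""
    bitmap = [0] * size            # initial bitmap: every element is the first value (0)
    for pos in positions:          # positions are 1-based, as in the example above
        bitmap[pos - 1] = 1        # set to the second value (1); setting twice changes nothing
    return {"batch_id": batch_id, "bitmap": bitmap}


def acquired_count(info):
    """The acquired data amount is the number of elements equal to the second value."""
    return sum(info["bitmap"])


positions = [3, 4, 5, 7, 8, 10, 12, 13, 15, 16]
# Positions 4 and 7 arrive twice, simulating duplicate delivery by the middleware.
info = build_bitmap("batch-001", positions + [4, 7], size=19)
print(info["bitmap"])
print(acquired_count(info))
```

If the cache database is Redis, as suggested earlier, the `SETBIT` and `BITCOUNT` commands provide the same two operations on a key; using the batch identifier as that key would be one possible design, which the text does not specify.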
Step 104: and verifying the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, wherein the target acquisition attribute information is attribute information recorded in statistics when the data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified.
It should be noted that, after the data acquisition server acquires the batch data reported by the service module, the data acquisition server can determine the acquisition attribute information of the batch data and record the acquisition attribute information in the cache database, so that the data quality verification thread of the data acquisition server acquires each attribute information to be verified from the service database, and for each attribute information to be verified, can acquire the corresponding target acquisition attribute information from the cache database, and then verify the attribute information to be verified based on the target acquisition attribute information to obtain a verification result.
In practical application, each attribute information may carry a corresponding lot identifier, which is used for recording which lot data the attribute information corresponds to. Therefore, after the attribute information to be checked is acquired, the batch identifier of the attribute information to be checked can be acquired, and based on the batch identifier, the corresponding target acquired attribute information is acquired in the cache database, that is, the batch identifier of the target acquired attribute information is the same as the batch identifier of the attribute information to be checked, which indicates that the same batch data corresponds to, and whether the batch data is abnormal during acquisition can be determined based on the check result.
In an optional implementation manner of this embodiment, the attribute information to be checked includes a reported data amount, and the target acquired attribute information includes an acquired data amount; based on the target acquired attribute information acquired from the cache database, verifying the attribute information to be verified, including:
comparing the reported data volume with the acquired data volume;
if the reported data volume is consistent with the acquired data volume, determining that the attribute information to be checked passes the check;
if the reported data volume is inconsistent with the acquired data volume, determining that the attribute information to be checked fails to be checked.
It should be noted that, the attribute information to be checked includes a reported data amount, the target acquired attribute information includes an acquired data amount, the reported data amount and the acquired data amount are compared, if the reported data amount is consistent with the acquired data amount, the data amount of the batch data reported by the service module is the same as the data amount of the batch data acquired by the data acquisition server, the data acquisition server does not lose data when acquiring the batch data reported by the service module, the acquired batch data is complete, and at this time, the attribute information to be checked can be determined to pass the check.
If the reported data volume is inconsistent with the acquired data volume, the data volume of the batch data reported by the service module differs from the data volume of the batch data acquired by the data acquisition server. This indicates that the data acquisition server may have lost data while acquiring the batch data reported by the service module and that the acquired batch data is incomplete; in this case it can be determined that the attribute information to be checked fails the check.
In the embodiment of the application, the attribute information to be checked can be checked against the corresponding target acquisition attribute information obtained from the cache database. As soon as the reported data volume and the acquired data volume are found to be inconsistent, the data acquisition server detects the data loss that occurred while acquiring the target batch data promptly and on its own, without relying on users to discover it, so the integrity, accuracy, and high availability of the data are ensured.
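The comparison can be sketched as below. The sketch assumes, for illustration only, that the cache database is represented by a plain dictionary keyed by batch identifier, that the acquired attribute information is stored as a bitmap, and that a batch with no cache entry is treated as a failed check; the text does not say how a missing entry is handled.

```python
def check(reported, cache):
    """Compare the reported data volume with the acquired data volume of the same batch."""
    bitmap = cache.get(reported["batch_id"])   # look up by the shared batch identifier
    if bitmap is None:
        return "failed"                        # assumption: no acquisition record means failure
    acquired = sum(bitmap)                     # acquired data volume = number of set elements
    return "passed" if acquired == reported["reported_count"] else "failed"


# Hypothetical cache contents: batch-001 is complete, batch-002 lost one record.
cache = {
    "batch-001": [1, 1, 1, 1],
    "batch-002": [1, 0, 1, 1],
}
result_ok = check({"batch_id": "batch-001", "reported_count": 4}, cache)
result_lost = check({"batch_id": "batch-002", "reported_count": 4}, cache)
result_missing = check({"batch_id": "batch-009", "reported_count": 4}, cache)
print(result_ok, result_lost, result_missing)
```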
Step 106: and updating the verification state of the attribute information to be verified in the service database according to the verification result.
It should be noted that, because each piece of reported attribute information stored in the service database includes a corresponding verification state, the data acquisition server may update the verification state of the attribute information to be verified in the service database according to the verification result after determining the verification result.
In an optional implementation manner of this embodiment, updating the verification state of the attribute information to be verified in the service database according to the verification result includes:
updating the verification state of the attribute information to be verified in the service database from unverified to verified, and recording the verification result.
When the reported attribute information is first stored in the service database it has not yet been verified, so its verification state is unverified. After the data acquisition server performs the verification and obtains the verification result, it can update the verification state to verified and record the corresponding verification result as passed or failed; alternatively, the unverified state can be updated directly to the verification result, that is, from unverified to passed or from unverified to failed. In this way, each piece of reported attribute information stored in the service database records its corresponding verification state, which makes management convenient and allows batch data that has not passed verification to be processed in time.
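A minimal sketch of the state update follows, using an in-memory SQLite database as a stand-in for the service database. The table name `report_attr`, its columns, and the state strings are hypothetical; the sketch implements the first variant described above, in which the state moves from unverified to verified and the result is recorded separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE report_attr ("
    "batch_id TEXT PRIMARY KEY, reported_count INTEGER, "
    "verify_state TEXT DEFAULT 'unverified', verify_result TEXT)"
)
conn.executemany(
    "INSERT INTO report_attr (batch_id, reported_count) VALUES (?, ?)",
    [("batch-001", 10), ("batch-002", 10)],
)


def pending(conn):
    """Return the reported attribute rows that are still unverified."""
    rows = conn.execute(
        "SELECT batch_id, reported_count FROM report_attr "
        "WHERE verify_state = 'unverified' ORDER BY batch_id"
    )
    return rows.fetchall()


def record_result(conn, batch_id, passed):
    """Mark one batch as verified and record whether it passed."""
    conn.execute(
        "UPDATE report_attr SET verify_state = 'verified', verify_result = ? "
        "WHERE batch_id = ?",
        ("passed" if passed else "failed", batch_id),
    )


record_result(conn, "batch-001", True)
remaining = pending(conn)
print(remaining)
```

Keeping the state and the result in separate columns lets the periodic query select only unverified rows; the second variant in the text would store `passed` or `failed` in the state column directly.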
In an optional implementation manner of this embodiment, the data collector may query the service database for abnormal reporting attribute information that fails to pass the verification, determine a start identifier and a stop identifier of the abnormal batch data from the abnormal reporting attribute information, generate a retry request according to the start identifier and the stop identifier of the abnormal batch data, and send the retry request to the service module.
It should be noted that reported attribute information that fails the verification indicates that the data acquisition server lost data while acquiring it and that the data is incomplete. In this case, the data collector may determine the abnormal batch data and then send a retry request to the service module, where the retry request may carry the start identifier and the end identifier of the abnormal batch data, so that the service module can report the corresponding batch data again. The data acquisition server can then reacquire the batch data and verify its reported attribute information again. If the verification passes, the reacquired batch data may be used to overwrite the incomplete batch data in the service database, thereby ensuring the integrity of the data stored in the service database. If the verification still fails, an alarm can be raised to indicate that an abnormality has occurred and manual intervention is needed, so that abnormal situations are detected and reported in time.
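The retry flow can be sketched as follows. All field names (`start_id`, `end_id`, `state`), the in-memory containers, and the two helper functions are hypothetical and only illustrate the behavior described above: one retry request per failed batch, overwrite on a successful recheck, alarm otherwise.

```python
def build_retry_requests(report_rows):
    """Collector side: one retry request per batch that failed verification."""
    return [
        {"batch_id": row["batch_id"], "start_id": row["start_id"], "end_id": row["end_id"]}
        for row in report_rows
        if row["state"] == "failed"
    ]


def after_recheck(batch_id, passed, refetched, service_data, alerts):
    """Overwrite the incomplete batch if the recheck passed, otherwise raise an alarm."""
    if passed:
        service_data[batch_id] = refetched
    else:
        alerts.append(batch_id)


rows = [
    {"batch_id": "batch-001", "start_id": 1, "end_id": 10, "state": "passed"},
    {"batch_id": "batch-002", "start_id": 11, "end_id": 20, "state": "failed"},
]
requests = build_retry_requests(rows)

service_data = {"batch-002": ["m11", "m13"]}   # incomplete batch stored earlier
alerts = []
after_recheck("batch-002", True, ["m11", "m12", "m13"], service_data, alerts)
after_recheck("batch-004", False, [], service_data, alerts)
print(requests, service_data, alerts)
```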
According to the data verification method, reported attribute information determined based on batch data reported by the service module can be stored in the service database, attribute information recorded by statistics when the data acquisition server acquires target batch data can be stored in the cache database, the data acquisition server can acquire attribute information to be verified from the service database every set time, then the attribute information to be verified is verified based on corresponding target acquired attribute information acquired from the cache database, once the attribute information to be verified is inconsistent with the corresponding target acquired attribute information, the data acquisition server can actively sense in time, so that the data acquisition server can sense abnormality occurring in the process of acquiring the target batch data in time, user discovery is not needed, and the integrity, accuracy and high availability of the data are ensured.
Fig. 2 shows a process flow chart of a data verification method applied to metadata, which is provided according to an embodiment of the present application, and is applied to a data verification system, where the system includes at least one service module, a metadata collector, a metadata collection server and a message middleware, and the metadata collection server is provided with a data acquisition thread and a data quality verification thread, and specifically includes the following steps:
Step 202: the metadata collector sends a metadata acquisition request to the service module, wherein the metadata acquisition request carries an incremental update point and an offset, and the offset is the data volume of one batch of metadata.
Step 204: the business module determines the initial identification and the data quantity of batch metadata according to the increment updating points and the offset, acquires the batch metadata with the corresponding quantity according to the initial identification and the data quantity, and reports the batch metadata with the corresponding quantity to the metadata collector.
Step 206: the metadata collector receives the batch metadata reported by the service module, determines reporting attribute information corresponding to the batch metadata based on the batch metadata reported by the service module, and stores the reporting attribute information into the service database.
Step 208: the metadata collector writes the batch metadata into the message middleware.
Step 210: the metadata acquisition server acquires batch metadata reported by the service module from the message middleware through a data acquisition thread, determines acquisition attribute information of the batch metadata, and writes the acquisition attribute information into a cache database.
In an optional implementation manner of this embodiment, determining the acquisition attribute information of the batch metadata includes:
And carrying out bitmap operation on the batch metadata to obtain corresponding acquired attribute information.
In an optional implementation manner of this embodiment, bitmap operation is performed on the batch metadata to obtain corresponding acquired attribute information, including:
acquiring initial bitmap information, wherein each element in the initial bitmap information is a first numerical value;
determining a corresponding target position of the metadata in the initial bitmap information for each metadata in the batch metadata;
and setting the element at the target position to a second value from the first value, obtaining the acquired attribute information corresponding to the batch metadata, and carrying the batch identification of the batch metadata in the acquired attribute information.
Step 212: and the metadata acquisition server acquires attribute information to be checked from the service database at intervals of set time by a data quality checking thread, wherein the attribute information to be checked is the unchecked reported attribute information stored in the service database.
In an optional implementation manner of this embodiment, the attribute information to be checked includes a check state; obtaining attribute information to be verified from a service database, including:
traversing each reporting attribute information stored in the service database;
And taking the reporting attribute information with the checking state of each reporting attribute information being unverified as the attribute information to be checked.
Step 214: the metadata acquisition server verifies the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, wherein the target acquisition attribute information is the attribute information recorded by statistics when the metadata acquisition server acquires the target batch metadata, and the target batch metadata is the batch metadata corresponding to the attribute information to be verified.
In an optional implementation manner of this embodiment, the attribute information to be checked includes a reported data amount, and the target acquired attribute information includes an acquired data amount; based on the target acquired attribute information acquired from the cache database, verifying the attribute information to be verified, including:
comparing the reported data volume with the acquired data volume;
if the reported data volume is consistent with the acquired data volume, determining that the attribute information to be checked passes the check;
if the reported data volume is inconsistent with the acquired data volume, determining that the attribute information to be checked fails to be checked.
Step 216: and the metadata acquisition server updates the verification state of the attribute information to be verified in the service database according to the verification result.
Step 218: the metadata collector queries abnormal report attribute information which is not passed by check in the service database, determines the starting identifier and the ending identifier of the abnormal batch metadata from the abnormal report attribute information, generates a retry request according to the starting identifier and the ending identifier of the abnormal batch metadata, and sends the retry request to the service module.
According to the data verification method, reported attribute information determined based on batch metadata reported by the service module can be stored in the service database, attribute information recorded by statistics when the metadata acquisition server acquires target batch metadata can be stored in the cache database, the metadata acquisition server can acquire attribute information to be verified from the service database every set time, then the attribute information to be verified is verified based on the corresponding target acquired attribute information acquired from the cache database, once the attribute information to be verified is inconsistent with the corresponding target acquired attribute information, the metadata acquisition server can timely and actively sense, so that the metadata acquisition server can timely sense abnormality occurring in the process of acquiring target batch metadata, user discovery is not needed, and the integrity, accuracy and high availability of the metadata are ensured.
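Steps 202 to 206 can be sketched as follows. The text does not define how the start identifier is derived from the incremental update point, so this sketch assumes, purely for illustration, that the update point is the index of the first metadata record not yet reported and that the offset is the maximum batch size. The function names and record layout are likewise hypothetical.

```python
def report_batch(records, update_point, offset):
    """Service-module side: return at most `offset` records starting at the update point."""
    return records[update_point:update_point + offset]


def reported_attributes(batch_id, batch):
    """Collector side: derive the reported attribute information from a received batch."""
    ids = [record["id"] for record in batch]
    return {
        "batch_id": batch_id,
        "start_id": ids[0] if ids else None,
        "end_id": ids[-1] if ids else None,
        "reported_count": len(batch),
        "state": "unverified",      # every new entry starts out unverified
    }


records = [{"id": i} for i in range(100, 125)]              # 25 metadata records
batch = report_batch(records, update_point=20, offset=10)   # only 5 records remain
attrs = reported_attributes("batch-003", batch)
print(attrs)
```

Storing the start and end identifiers alongside the reported count is what later allows a retry request to name exactly the range that has to be reported again.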
Fig. 3 is an interaction schematic diagram of a data verification process according to an embodiment of the present application, where the data verification process is applied to a data verification system, and the system includes at least one service module, a metadata collector Agent, a metadata server, and a message middleware, where the message middleware is a message queue, a data collection thread is disposed in the metadata collector, and a data acquisition thread and a data quality verification thread are disposed in the metadata server, as shown in fig. 3, and the data verification process includes the following steps:
Step 1: Through a configured timing task, the data collection thread in the metadata collector actively calls the metadata interface provided by the service module to acquire the metadata information of a specified batch.
Step 2: The data collection thread sends the acquired metadata information of the specified batch to the message queue through a message reporting module.
Step 3: The data collection thread counts the data volume of the acquired metadata information of the specified batch to obtain the reported data volume of the reported metadata, and writes the reported data volume into the service database for recording; in the quality verification flow, the reported data volume is compared with the acquired data volume.
Step 4: The metadata server acquires the metadata information of the specified batch from the message queue through the data acquisition thread, performs a bitmap operation on the metadata information of the specified batch to obtain the corresponding acquired data volume, that is, the data volume actually acquired by the metadata server, records the metadata of the batch that has been consumed, and stores the acquired data volume in the cache database Redis.
In addition, the data acquisition thread can also store the acquired metadata information of the specified batch into the service database for subsequent query and management.
Step 5: The metadata server can run the data quality verification thread periodically, and through this thread it queries the service database for the reported data volumes, recorded by the data collector, that have not yet been verified.
Step 6: After the data quality verification thread obtains a reported data volume, it obtains the corresponding acquired data volume from the cache database Redis and compares the two to check whether they are inconsistent; if they are inconsistent, a subsequent data backfill flow can be performed.
Step 7: The data quality verification thread can write the verification result of each batch into the service database for recording, and the data backfill flow, data tracing, and index statistics on data volume can then be carried out based on these records.
According to the data verification method, reported attribute information determined based on batch metadata reported by the service module can be stored in the service database, attribute information recorded by statistics when the metadata server acquires target batch metadata can be stored in the cache database, the metadata server can acquire attribute information to be verified from the service database every set time period, then the attribute information to be verified is verified based on the corresponding target acquired attribute information acquired from the cache database, once the attribute information to be verified is inconsistent with the corresponding target acquired attribute information, the metadata server can timely and actively sense the abnormality in the process of acquiring the target batch metadata, user discovery is not needed, and the integrity, accuracy and high availability of the metadata are ensured.
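The interaction in Fig. 3 can be sketched end to end as below. This is a simplified, single-process illustration under several assumptions: `queue.Queue` stands in for the message queue, plain dictionaries stand in for the service database and Redis, a set of item positions stands in for the bitmap, and the three functions are called in sequence instead of running as a timing task and separate threads. The `drop` parameter is a hypothetical hook that simulates message loss.

```python
import queue

mq = queue.Queue()   # stand-in for the message queue
service_db = {}      # stand-in for the service database
cache = {}           # stand-in for Redis: batch_id -> set of acquired positions


def collector_report(batch_id, items):
    """Steps 2 and 3: send the batch to the queue and record the reported data volume."""
    for item in items:
        mq.put((batch_id, item))
    service_db[batch_id] = {"reported_count": len(items), "state": "unverified"}


def server_consume(drop=()):
    """Step 4: consume the queue and record what was actually acquired."""
    while not mq.empty():
        batch_id, item = mq.get()
        if (batch_id, item) in drop:   # simulated loss in transit
            continue
        cache.setdefault(batch_id, set()).add(item)


def quality_check():
    """Steps 5 to 7: compare volumes for unverified batches and record the result."""
    for batch_id, row in service_db.items():
        if row["state"] != "unverified":
            continue
        acquired = len(cache.get(batch_id, set()))
        row["state"] = "passed" if acquired == row["reported_count"] else "failed"


collector_report("b1", [0, 1, 2])
collector_report("b2", [0, 1, 2])
server_consume(drop={("b2", 1)})   # one message of batch b2 is lost
quality_check()
print(service_db)
```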
Corresponding to the method embodiment, the present application further provides a data acquisition server embodiment, and fig. 4 shows a schematic structural diagram of a data acquisition server according to an embodiment of the present application. As shown in fig. 4, the data collection server includes:
the acquisition module 402 is configured to acquire attribute information to be checked from the service database at intervals of a set time length, wherein the attribute information to be checked is unchecked reported attribute information stored in the service database, and the reported attribute information is determined based on batch data reported by the service module;
The verification module 404 is configured to verify the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, where the target acquisition attribute information is attribute information recorded by statistics when the data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified;
and the updating module 406 is configured to update the verification state of the attribute information to be verified in the service database according to the verification result.
Optionally, the data acquisition server further includes a writing module configured to:
acquiring batch data reported by a service module, wherein the batch data is corresponding quantity of incremental data determined by the service module based on incremental update points and offset;
and determining the acquisition attribute information of the batch data, and writing the acquisition attribute information into a cache database.
Optionally, the writing module is further configured to:
bitmap operation is carried out on batch data, and corresponding acquired attribute information is obtained.
Optionally, the writing module is further configured to:
acquiring initial bitmap information, wherein each element in the initial bitmap information is a first numerical value;
determining a corresponding target position of the data in the initial bitmap information for each data in the batch data;
And setting the element at the target position to a second value from the first value, obtaining the acquired attribute information corresponding to the batch data, and carrying the batch identification of the batch data in the acquired attribute information.
Optionally, the attribute information to be verified includes a verification state; the acquisition module 402 is further configured to:
traversing each reporting attribute information stored in the service database;
and taking the reporting attribute information with the checking state of each reporting attribute information being unverified as the attribute information to be checked.
Optionally, the attribute information to be checked includes reported data quantity, and the target acquisition attribute information includes acquired data quantity; a verification module 404 further configured to:
comparing the reported data volume with the acquired data volume;
if the reported data volume is consistent with the acquired data volume, determining that the attribute information to be checked passes the check;
if the reported data volume is inconsistent with the acquired data volume, determining that the attribute information to be checked fails to be checked.
According to the data acquisition server, the attribute information to be checked can be acquired from the service database at intervals of set time, then the attribute information to be checked is checked based on the corresponding target acquired attribute information acquired from the cache database, once the attribute information to be checked is inconsistent with the corresponding target acquired attribute information, the data acquisition server can timely and actively sense, so that the data acquisition server can timely sense the abnormality occurring in the process of acquiring the target batch data, user discovery is not needed, and the integrity, accuracy and high availability of the data are guaranteed.
The foregoing is a schematic solution of a data acquisition server in this embodiment. It should be noted that, the technical solution of the data acquisition server side and the technical solution of the data verification method belong to the same concept, and details of the technical solution of the data acquisition server side, which are not described in detail, can be referred to the description of the technical solution of the data verification method.
Corresponding to the above method embodiment, the present application further provides a data verification system embodiment, and fig. 5 shows a schematic structural diagram of a data verification system provided in an embodiment of the present application. As shown in fig. 5, the data verification system includes at least one service module 502, a data collector 504, and a data collection server 506;
the data collector 504 is configured to determine reporting attribute information corresponding to the batch data based on the batch data reported by the service module 502, and store the reporting attribute information into the service database;
the data acquisition server 506 is configured to acquire attribute information to be checked from the service database at intervals of a set time, wherein the attribute information to be checked is unchecked reported attribute information stored in the service database; verifying the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, wherein the target acquisition attribute information is attribute information recorded in statistics when a data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified; and updating the verification state of the attribute information to be verified in the service database according to the verification result.
Optionally, the data collector 504 is further configured to send a data acquisition request to the service module 502, where the data acquisition request carries an incremental update point and an offset, and the offset is the data volume of one batch of data;
a business module 502 configured to determine a start identification and a data amount of the batch data according to the incremental update points and the offset; according to the initial identification and the data quantity, obtaining batch data with corresponding quantity, and reporting the batch data with corresponding quantity to a data collector;
the data collector 504 is further configured to receive batch data reported by the service module 502.
Optionally, the data acquisition server 506 is further configured to:
acquiring batch data reported by a service module;
and determining the acquisition attribute information of the batch data, and writing the acquisition attribute information into a cache database.
Optionally, the system further comprises message middleware;
a data collector 504 further configured to write batch data into the message middleware;
the data collection server 506 is further configured to obtain the batch data reported by the service module from the message middleware.
Optionally, a data acquisition thread and a data quality verification thread are provided in the data acquisition server 506; the data acquisition server 506 is further configured to:
Acquiring batch data reported by a service module from a message middleware through a data acquisition thread;
and acquiring attribute information to be checked from the service database at intervals of set time by a data quality checking thread, checking the attribute information to be checked based on the acquired target acquired attribute information from the cache database, and updating the checking state of the attribute information to be checked in the service database according to the checking result.
Optionally, the data collector 504 is further configured to:
querying the service database for abnormal reporting attribute information that failed the check;
determining a start identifier and a termination identifier of abnormal batch data from the abnormal reporting attribute information;
generating a retry request according to the start identifier and the end identifier of the abnormal batch data, and sending the retry request to the service module.
According to the data verification system, reported attribute information determined based on batch data reported by the service module can be stored in the service database, attribute information recorded by statistics when the data acquisition server acquires target batch data can be stored in the cache database, the data acquisition server can acquire attribute information to be verified from the service database every set time, then the attribute information to be verified is verified based on corresponding target acquired attribute information acquired from the cache database, once the attribute information to be verified is inconsistent with the corresponding target acquired attribute information, the data acquisition server can actively sense in time, so that the data acquisition server can sense abnormality occurring in the process of acquiring the target batch data in time, user discovery is not needed, and the integrity, accuracy and high availability of the data are ensured.
The foregoing is a schematic scheme of a data verification system of this embodiment. It should be noted that, the technical solution of the data verification system and the technical solution of the data verification method belong to the same conception, and details of the technical solution of the data verification system which are not described in detail can be referred to the description of the technical solution of the data verification method.
FIG. 6 illustrates a block diagram of a computing device provided in accordance with an embodiment of the present application. The components of computing device 600 include, but are not limited to, memory 610 and processor 620. The processor 620 is coupled to the memory 610 via a bus 630, and a database 650 is used to hold data.
Computing device 600 also includes an access device 640, which enables computing device 600 to communicate via one or more networks 660. Examples of such networks include a public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 640 may include one or more network interfaces of any type, wired or wireless (e.g., a network interface card (NIC)), such as an IEEE 802.11 wireless local area network (WLAN) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC) interface, and so forth.
In one embodiment of the present application, the above-described components of computing device 600, as well as other components not shown in FIG. 6, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 6 is for exemplary purposes only and is not intended to limit the scope of the present application. Those skilled in the art may add or replace other components as desired.
Computing device 600 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 600 may also be a mobile or stationary server.
The processor 620 is configured to execute computer-executable instructions that implement the operational steps of the data verification method described above.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that the technical solution of the computing device and that of the data verification method belong to the same concept; for details of the computing device that are not described here, refer to the description of the data verification method.
An embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, perform the operational steps of the data verification method described above.
The foregoing is a schematic illustration of a computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and that of the data verification method belong to the same concept; for details of the storage medium that are not described here, refer to the description of the data verification method.
The foregoing describes specific embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, and so on. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combined actions, but those skilled in the art should understand that the present application is not limited by the order of actions described, since some steps may be performed in another order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the actions and modules referred to are not necessarily all required by the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The above-disclosed preferred embodiments of the present application are provided only as an aid to the elucidation of the present application. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the teaching of this application. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. This application is to be limited only by the claims and the full scope and equivalents thereof.

Claims (15)

1. A data verification method, applied to a data acquisition server, the method comprising:
acquiring attribute information to be checked from a service database at intervals of set time, wherein the attribute information to be checked is unchecked reported attribute information stored in the service database, and the reported attribute information is determined based on batch data reported by a service module;
verifying the attribute information to be verified based on target acquisition attribute information acquired from a cache database, wherein the target acquisition attribute information is attribute information recorded in statistics when the data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified;
and updating the verification state of the attribute information to be verified in the service database according to the verification result.
2. The data verification method according to claim 1, wherein before acquiring the attribute information to be checked from the service database at intervals of the set time, the method further comprises:
acquiring batch data reported by the service module, wherein the batch data is a corresponding quantity of incremental data determined by the service module based on an incremental update point and an offset;
and determining the acquisition attribute information of the batch data, and writing the acquisition attribute information into the cache database.
3. The data verification method according to claim 2, wherein the determining the acquisition attribute information of the batch data includes:
and carrying out a bitmap operation on the batch data to obtain the corresponding acquisition attribute information.
4. The data verification method according to claim 3, wherein carrying out the bitmap operation on the batch data to obtain the corresponding acquisition attribute information includes:
acquiring initial bitmap information, wherein each element in the initial bitmap information is a first numerical value;
determining a corresponding target position of the data in the initial bitmap information for each data in the batch data;
and changing the element at the target position from the first value to a second value to obtain the acquisition attribute information corresponding to the batch data, wherein the acquisition attribute information carries the batch identifier of the batch data.
5. The data verification method according to any one of claims 1 to 4, wherein the attribute information to be verified includes a verification state; the obtaining the attribute information to be verified from the service database comprises the following steps:
traversing each piece of reported attribute information stored in the service database;
and taking, as the attribute information to be verified, each piece of reported attribute information whose verification state is unverified.
6. The data verification method according to any one of claims 1 to 4, wherein the attribute information to be verified includes a reported data volume, and the target acquisition attribute information includes an acquired data volume; the verifying the attribute information to be verified based on the target acquisition attribute information acquired from the cache database comprises:
comparing the reported data volume with the acquired data volume;
if the reported data volume is consistent with the acquired data volume, determining that the attribute information to be checked passes the check;
and if the reported data volume is inconsistent with the acquired data volume, determining that the attribute information to be checked fails the check.
7. A data acquisition server, comprising:
the acquisition module is configured to acquire attribute information to be checked from a service database at intervals of set time, wherein the attribute information to be checked is unchecked reported attribute information stored in the service database, and the reported attribute information is determined based on batch data reported by the service module;
the verification module is configured to verify the attribute information to be verified based on the target acquisition attribute information acquired from the cache database, wherein the target acquisition attribute information is the attribute information which is statistically recorded when the data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified;
and the updating module is configured to update the verification state of the attribute information to be verified in the service database according to the verification result.
8. A data verification system, comprising at least one service module, a data collector and a data acquisition server;
the data collector is configured to determine reported attribute information corresponding to batch data based on the batch data reported by the service module, and store the reported attribute information into a service database;
the data acquisition server is configured to acquire attribute information to be checked from the service database at intervals of set time, wherein the attribute information to be checked is unchecked reported attribute information stored in the service database; verifying the attribute information to be verified based on target acquisition attribute information acquired from a cache database, wherein the target acquisition attribute information is attribute information recorded in statistics when the data acquisition server acquires target batch data, and the target batch data is batch data corresponding to the attribute information to be verified; and updating the verification state of the attribute information to be verified in the service database according to the verification result.
9. The data verification system of claim 8, wherein,
the data collector is further configured to send a data acquisition request to the service module, wherein the data acquisition request carries an incremental update point and an offset, and the offset is the data volume of one batch;
the service module is configured to determine the start identifier and the data quantity of the batch data according to the incremental update point and the offset; acquire batch data of the corresponding quantity according to the start identifier and the data quantity, and report the batch data of the corresponding quantity to the data collector;
the data collector is further configured to receive batch data reported by the service module.
10. The data verification system of claim 9, wherein the data acquisition server is further configured to:
acquiring batch data reported by the service module;
and determining the acquisition attribute information of the batch data, and writing the acquisition attribute information into the cache database.
11. The data verification system of claim 10, wherein the system further comprises message middleware;
the data collector is further configured to write the batch data into the message middleware;
the data acquisition server is further configured to acquire batch data reported by the service module from the message middleware.
12. The data verification system according to claim 11, wherein a data acquisition thread and a data quality verification thread are provided in the data acquisition server; the data acquisition server is further configured to:
acquiring batch data reported by the service module from the message middleware through the data acquisition thread;
and acquiring, through the data quality verification thread, attribute information to be checked from the service database at intervals of set time, checking the attribute information to be checked based on the target acquisition attribute information acquired from the cache database, and updating the checking state of the attribute information to be checked in the service database according to the checking result.
13. The data verification system of any one of claims 8-12, wherein the data collector is further configured to:
querying the service database for abnormal reported attribute information that failed verification;
determining a start identifier and an end identifier of abnormal batch data from the abnormal reported attribute information;
generating a retry request according to the start identifier and the end identifier of the abnormal batch data, and sending the retry request to the service module.
14. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions to perform the operational steps of the data verification method of any one of the preceding claims 1-6.
15. A computer readable storage medium, characterized in that it stores computer executable instructions which, when executed by a processor, implement the operational steps of the data verification method of any one of the preceding claims 1-6.
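The bitmap operation recited in claims 3 and 4 can be illustrated with the short sketch below. It is an assumption-laden example rather than the claimed implementation: the first value is taken to be 0 and the second value 1, the target position is assumed to be a record's identifier minus the batch's start identifier, and the names `CollectedAttributes` and `build_bitmap` are hypothetical. Deriving the collected count from the number of set elements is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class CollectedAttributes:
    batch_id: str      # batch identifier carried with the bitmap
    bitmap: List[int]  # 0 = not seen (first value), 1 = seen (second value)

    @property
    def collected_count(self) -> int:
        # Assumption: the collected data volume is the number of set elements.
        return sum(self.bitmap)


def build_bitmap(batch_id: str, start_id: int, size: int,
                 record_ids: Iterable[int]) -> CollectedAttributes:
    """Mark each collected record in a bitmap that starts as all zeros."""
    bitmap = [0] * size  # initial bitmap: every element is the first value
    for record_id in record_ids:
        position = record_id - start_id  # assumed position mapping
        if 0 <= position < size:
            bitmap[position] = 1  # change the element to the second value
    return CollectedAttributes(batch_id, bitmap)
```

With this mapping a record delivered twice is counted once, and the positions still holding 0 identify which records of the batch were never collected.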
CN202211698617.6A 2022-12-28 2022-12-28 Data verification method, data acquisition server and data verification system Pending CN116010388A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211698617.6A CN116010388A (en) 2022-12-28 2022-12-28 Data verification method, data acquisition server and data verification system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211698617.6A CN116010388A (en) 2022-12-28 2022-12-28 Data verification method, data acquisition server and data verification system

Publications (1)

Publication Number Publication Date
CN116010388A (en) 2023-04-25

Family

ID=86018946

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211698617.6A Pending CN116010388A (en) 2022-12-28 2022-12-28 Data verification method, data acquisition server and data verification system

Country Status (1)

Country Link
CN (1) CN116010388A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117009111A (en) * 2023-08-30 2023-11-07 上海南洋宏优智能科技有限公司 Data processing method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN112416708B (en) Asynchronous call link monitoring method and system
CN111400288A (en) Data quality inspection method and system
CN116010388A (en) Data verification method, data acquisition server and data verification system
CN109409948B (en) Transaction abnormity detection method, device, equipment and computer readable storage medium
CN114385378A (en) Active data processing method and device for Internet of things equipment and storage medium
CN107515864B (en) Method and equipment for monitoring workflow
CN111782901A (en) Data acquisition method and device
CN111324583B (en) Service log classification method and device
CN113918636B (en) ETL-based data throughput analysis method
CN113055490B (en) Data storage method and device
CN109901950A (en) A kind of method and device for evading application crash
CN115525392A (en) Container monitoring method and device, electronic equipment and storage medium
CN112506886B (en) Multi-source service operation log acquisition method and system
CN113407491A (en) Data processing method and device
CN115705259A (en) Fault processing method, related device and storage medium
CN114884844B (en) Flow recording method and system
CN111143280B (en) Data scheduling method, system, device and storage medium
CN117289143B (en) Fault prediction method, device, equipment, system and medium
CN110191026B (en) Distributed service link monitoring method and device
CN114356490B (en) Financial information visualization processing method and system based on big data
CN117240925B (en) Flow recording method and device, storage medium and computer equipment
CN114036179A (en) Processing method and device for slow query operation
CN109684158A (en) Method for monitoring state, device, equipment and the storage medium of distributed coordination system
CN115827199A (en) Data scheduling method, device, equipment and medium based on graph database
CN115834332A (en) Fault processing method, server and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination