CN110399428A - Data verification method, apparatus, and electronic device - Google Patents
Data verification method, apparatus, and electronic device
- Publication number
- CN110399428A (application CN201910684893.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- field
- value
- probability
- sample data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/273—Asynchronous replication or reconciliation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The present invention relates to a data verification method, an apparatus, and an electronic device. The method comprises: obtaining preset check fields, where the probability that an error in a preset check field causes a data error meets a preset condition; obtaining, from a second database, data to be verified that was synchronized from a first database; and verifying the preset check fields in the data to be verified. Because only the preset check fields of the data to be verified are checked, the long running time of full-field verification is avoided. Because the preset check fields are the fields most likely to cause data errors, checking them finds erroneous data faster and in a more targeted way, which improves the efficiency of data verification. It also reduces how often the databases are accessed.
Description
Technical field
The present invention relates to the field of data processing, and in particular to a data verification method, an apparatus, and an electronic device.
Background
Synchronizing data among multiple databases raises the problem of data consistency. Although synchronization systems include techniques for guaranteeing consistency, in practice data consistency still needs to be verified so that data problems are found early.
In the past, when the number of users was small and the data volume was low, a full-field scan of the data could be run on a daily schedule. As the user base keeps growing rapidly, however, this full-field verification becomes inefficient and slow, and it increases the access load on the database.
Summary of the invention
To solve the above technical problem, or at least partially solve it, the present invention provides a data verification method, an apparatus, and an electronic device.
In a first aspect, the present invention provides a data verification method, comprising:
obtaining preset check fields, where the probability that an error in a preset check field causes a data error meets a preset condition;
obtaining, from a second database, data to be verified that was synchronized from a first database;
verifying the preset check fields in the data to be verified.
Optionally, before the preset check fields are obtained, the method further comprises:
obtaining the quantity of sample data as a first value;
extracting error sample data from the sample data, where the error sample data is the sample data in which an error occurred, and the quantity of error sample data is a second value;
analyzing the error sample data to determine the field in which the error occurred;
counting the quantity of error sample data corresponding to the field as a third value;
calculating, from the first value, the second value, and the third value, the probability that the field causes an error in the sample data;
determining whether the probability meets the preset condition, where meeting the preset condition includes at least one of the following: the probability is greater than or equal to a preset probability threshold; or the probability is among the first preset number of probabilities when the probabilities are sorted in descending order;
setting the field corresponding to a probability that meets the preset condition as a preset check field.
Optionally, calculating, from the first value, the second value, and the third value, the probability that the field causes an error in the sample data comprises:
calculating, from the first value and the second value, a first ratio: the share of error sample data in all sample data;
calculating, from the second value and the third value, a second ratio: the share of the error sample data corresponding to the field in all error sample data;
calculating, from the first value and the third value, a third ratio: the share of the error sample data corresponding to the field in all sample data;
calculating the probability from the first ratio, the second ratio, and the third ratio based on an improved naive Bayes algorithm as:
where P(a|B) denotes the probability, P(B) denotes the first ratio, P(a) denotes the second ratio, and P(B|a) denotes the third ratio.
Optionally, calculating, from the first value, the second value, and the third value, the probability that the field causes an error in the sample data comprises:
obtaining a check coefficient corresponding to the field;
calculating, from the first value, the second value, the third value, and the check coefficient, the probability that the field causes an error in the sample data as:
where h denotes the check coefficient.
Optionally, the method further comprises:
when the verification result is that a preset check field is wrong, obtaining the data identifier of the data to be verified;
looking up the original data corresponding to the data identifier in the first database;
correcting the preset check field of the data to be verified according to the original data.
In a second aspect, the present invention provides a data verification apparatus, comprising:
a field obtaining module, configured to obtain preset check fields, where the probability that an error in a preset check field causes a data error meets a preset condition;
a data obtaining module, configured to obtain, from a second database, data to be verified that was synchronized from a first database;
a verification module, configured to verify the preset check fields in the data to be verified.
Optionally, the apparatus further comprises:
a quantity obtaining module, configured to obtain the quantity of sample data as a first value before the preset check fields are obtained;
an extraction module, configured to extract error sample data from the sample data, where the error sample data is the sample data in which an error occurred, and the quantity of error sample data is a second value;
an analysis module, configured to analyze the error sample data to determine the field in which the error occurred;
a statistics module, configured to count the quantity of error sample data corresponding to the field as a third value;
a calculation module, configured to calculate, from the first value, the second value, and the third value, the probability that the field causes an error in the sample data;
a judgment module, configured to determine whether the probability meets the preset condition, where meeting the preset condition includes at least one of the following: the probability is greater than or equal to a preset probability threshold; or the probability is among the first preset number of probabilities when the probabilities are sorted in descending order;
a setting module, configured to set the field corresponding to a probability that meets the preset condition as a preset check field.
Optionally, the calculation module is specifically configured to: calculate, from the first value and the second value, a first ratio, which is the share of error sample data in all sample data; calculate, from the second value and the third value, a second ratio, which is the share of the error sample data corresponding to the field in all error sample data; calculate, from the first value and the third value, a third ratio, which is the share of the error sample data corresponding to the field in all sample data; and calculate the probability from the first ratio, the second ratio, and the third ratio based on an improved naive Bayes algorithm as:
where P(a|B) denotes the probability, P(B) denotes the first ratio, P(a) denotes the second ratio, and P(B|a) denotes the third ratio.
Optionally, the calculation module is further configured to: obtain a check coefficient corresponding to the field; and calculate, from the first value, the second value, the third value, and the check coefficient, the probability that the field causes an error in the sample data as:
where h denotes the check coefficient.
Optionally, the apparatus further comprises:
an identifier obtaining module, configured to obtain the data identifier of the data to be verified when the verification result is that a preset check field is wrong;
a lookup module, configured to look up the original data corresponding to the data identifier in the first database;
a correction module, configured to correct the preset check field of the data to be verified according to the original data.
In a third aspect, the present invention provides an electronic device, comprising: a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the steps of the above method when executing the computer program.
Compared with the prior art, the technical solution provided by the embodiments of the present invention has the following advantages:
Only the preset check fields of the data to be verified are checked, which avoids the long running time of full-field verification. Because the preset check fields are the fields most likely to cause data errors, checking them finds erroneous data faster and in a more targeted way, which improves the efficiency of data verification. It also reduces how often the databases are accessed.
Brief description of the drawings
The accompanying drawings, which are incorporated in and form part of this specification, show embodiments consistent with the invention and, together with the specification, serve to explain its principles.
Fig. 1 is a flowchart of a data verification method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of a data verification method provided by another embodiment of the present invention;
Fig. 3 is a flowchart of a data verification method provided by another embodiment of the present invention;
Fig. 4 is a flowchart of a data verification method provided by another embodiment of the present invention;
Fig. 5 is a block diagram of a data verification apparatus provided by an embodiment of the present invention;
Fig. 6 is a block diagram of a data verification apparatus provided by another embodiment of the present invention;
Fig. 7 is a block diagram of a data verification apparatus provided by another embodiment of the present invention;
Fig. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings.
Full-field verification in the prior art takes a long time on huge data volumes. Moreover, because erroneous data makes up only a small share of all data, that is, most data is correct, full-field verification is inefficient and of little value.
In the technical solution of the embodiments of the present invention, preset check fields are selected in advance according to the probability that an error in a field causes a data error, and during data verification only the data in the preset check fields is verified.
A data verification method provided by an embodiment of the present invention is introduced first.
The method provided by the embodiments of the present invention can be applied to any electronic device that needs to verify data, for example a server or a terminal; no specific limitation is imposed here. For convenience, it is referred to below simply as the electronic device.
Fig. 1 is a flowchart of a data verification method provided by an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S11, obtain preset check fields, where the probability that an error in a preset check field causes a data error meets a preset condition.
The preset condition can be that the probability is greater than or equal to a preset probability threshold, or that the probability is among the first preset number of probabilities when the probabilities are sorted in descending order.
Step S12, obtain, from the second database, data to be verified that was synchronized from the first database.
Step S13, verify the preset check fields in the data to be verified.
When data is synchronized from the first database to the second database, the synchronized data in the second database may contain errors and become inconsistent with the data in the first database. The synchronized data in the second database therefore needs to be verified. During verification, only the preset check fields need to be checked.
For example, user data contains many fields, such as user identifier, phone number, login information, account information, authentication state, and history records. Of these, the account information field and the authentication state field have a relatively high probability of causing errors in user data, so they can be set as the preset check fields. Only the account information field and the authentication state field of each user record are then verified.
A preset check field can be verified by comparing the corresponding data in the two databases. Optionally, the verification result can also be sent to the terminal of the database maintenance staff so that they can correct the data according to the result.
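The comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: both databases are modeled as plain dictionaries keyed by a record identifier, and the names verify_preset_fields, first_db, second_db, and the sample field names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def verify_preset_fields(
    source_db: Dict[str, Record],
    target_db: Dict[str, Record],
    preset_fields: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return (record_id, field) pairs whose values differ between the databases.

    Only the preset check fields are compared; all other fields are ignored.
    """
    fields = list(preset_fields)
    mismatches: List[Tuple[str, str]] = []
    for record_id, target_record in target_db.items():
        source_record = source_db.get(record_id)
        if source_record is None:
            continue  # no original to compare against (hypothetical policy)
        for field in fields:
            if source_record.get(field) != target_record.get(field):
                mismatches.append((record_id, field))
    return mismatches


# Hypothetical data: "history" differs too, but it is not a preset check field.
first_db = {"u1": {"account": "A-1", "auth_state": "verified", "history": "x"}}
second_db = {"u1": {"account": "A-1", "auth_state": "pending", "history": "y"}}
print(verify_preset_fields(first_db, second_db, ["account", "auth_state"]))
# prints [('u1', 'auth_state')]
```

The returned list is the kind of verification result that could be forwarded to maintenance staff.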
In this embodiment of the present invention, only the preset check fields of the data to be verified are checked, which avoids the long running time of full-field verification. Because the preset check fields are the fields most likely to cause data errors, checking them finds erroneous data faster and in a more targeted way, which improves the efficiency of data verification. It also reduces how often the databases are accessed.
Fig. 2 is a flowchart of a data verification method provided by another embodiment of the present invention. As shown in Fig. 2, before the preset check fields are obtained, the method further includes a process of analyzing and determining the preset check fields, which can be implemented by the following steps:
Step S21, obtain the quantity of sample data as the first value.
The sample data can be historical data in the database.
Step S22, extract error sample data from the sample data, where the error sample data is the sample data in which an error occurred, and the quantity of error sample data is the second value.
Step S23, analyze the error sample data to determine the field in which the error occurred.
Step S24, count the quantity of error sample data corresponding to the field as the third value.
Step S25, calculate, from the first value, the second value, and the third value, the probability that the field causes an error in the sample data.
The probability can be calculated with an improved Bayes algorithm.
Step S26, determine whether the probability meets the preset condition; if so, perform step S27; if not, end.
Meeting the preset condition includes at least one of the following: the probability is greater than or equal to a preset probability threshold; or the probability is among the first preset number of probabilities when the probabilities are sorted in descending order.
Step S27, set the field corresponding to a probability that meets the preset condition as a preset check field.
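Steps S26 and S27 can be sketched as a small selection function. The function name select_preset_fields, its parameters, and the probability values in the usage lines are hypothetical; the sketch treats "at least one of" as a union of the threshold condition and the top-N condition, which is one reading of the text.

```python
from typing import Dict, List, Optional


def select_preset_fields(
    probabilities: Dict[str, float],
    threshold: Optional[float] = None,
    top_n: Optional[int] = None,
) -> List[str]:
    """Pick fields whose probability meets at least one of the two conditions."""
    selected = set()
    if threshold is not None:
        # Condition 1: probability greater than or equal to the preset threshold.
        selected.update(f for f, p in probabilities.items() if p >= threshold)
    if top_n is not None:
        # Condition 2: among the first top_n probabilities in descending order.
        ranked = sorted(probabilities, key=lambda f: probabilities[f], reverse=True)
        selected.update(ranked[:top_n])
    # Return the chosen fields in descending order of probability.
    return sorted(selected, key=lambda f: probabilities[f], reverse=True)


# Hypothetical probabilities, for illustration only.
probs = {"account": 0.25, "auth_state": 0.09, "history": 0.01}
print(select_preset_fields(probs, threshold=0.05))  # ['account', 'auth_state']
print(select_preset_fields(probs, top_n=1))         # ['account']
```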
Step S25 calculates the probability as follows:
Step A1, calculate, from the first value and the second value, the first ratio: the share of error sample data in all sample data;
Step A2, calculate, from the second value and the third value, the second ratio: the share of the error sample data corresponding to the field in all error sample data;
Step A3, calculate, from the first value and the third value, the third ratio: the share of the error sample data corresponding to the field in all sample data;
Step A4, calculate the probability from the first ratio, the second ratio, and the third ratio based on the improved naive Bayes algorithm as:
where P(a|B) denotes the probability, P(B) denotes the first ratio, P(a) denotes the second ratio, and P(B|a) denotes the third ratio.
Steps A1, A2, and A3 need not be performed in any particular order and can also be performed simultaneously.
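Written out, the three ratios of steps A1 to A3 follow directly from the three counts. The symbols N_1, N_2, N_3 (first, second, and third values) and r_1, r_2, r_3 (first, second, and third ratios) are shorthand introduced here for illustration. The formula of step A4 does not appear in this text; the second line below shows only the standard form of Bayes' theorem that the symbols P(a|B), P(B|a), P(a), and P(B) suggest, as an assumption. The "improved" algorithm may differ from it.

```latex
% Ratios of steps A1-A3, from the first, second, and third values.
\[
r_1 = \frac{N_2}{N_1}, \qquad
r_2 = \frac{N_3}{N_2}, \qquad
r_3 = \frac{N_3}{N_1}
\]
% Assumed form only: standard Bayes' theorem, not confirmed by the text.
\[
P(a \mid B) = \frac{P(B \mid a)\, P(a)}{P(B)}
\]
```

In this assumed form the two field-specific ratios enter only as a product, so it makes no difference which of them is labeled P(a) and which P(B|a).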
The calculation of the probability is illustrated below with a specific example.
For example, there are 10,000 sample data records, of which 100 are error sample data for which fault reports were received. Analysis shows that the fields in which errors occurred are a1, a2, and a3: 20 error sample records correspond to a1, 30 to a2, and 50 to a3.
The share of error sample data in all sample data is: P(B) = 1%.
The shares of each field's error sample data in all error sample data are: P(B|a1) = 20%, P(B|a2) = 30%, P(B|a3) = 50%.
The shares of each field's error sample data in all sample data are: P(a1) = 0.2%, P(a2) = 0.3%, P(a3) = 0.5%.
Therefore, based on the improved naive Bayes algorithm, the probabilities that the erroneous fields cause errors in the sample data are calculated respectively as:
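The three ratios in this example can be reproduced from the raw counts. The sketch below covers only the ratios of steps A1 to A3; it does not attempt the final probability, whose formula is not reproduced in this text. The function name field_ratios and its parameter names are hypothetical.

```python
from typing import Dict, Tuple


def field_ratios(total: int, errors: int, field_errors: int) -> Tuple[float, float, float]:
    """Return the (first, second, third) ratios described in steps A1 to A3."""
    if total <= 0 or errors <= 0:
        raise ValueError("total and errors must be positive")
    first = errors / total          # error samples among all samples
    second = field_errors / errors  # the field's error samples among all error samples
    third = field_errors / total    # the field's error samples among all samples
    return first, second, third


# Counts from the example: 10,000 samples, 100 error samples, 20/30/50 per field.
counts: Dict[str, int] = {"a1": 20, "a2": 30, "a3": 50}
for name, field_errors in counts.items():
    first, second, third = field_ratios(10000, 100, field_errors)
    print(f"{name}: first={first:.1%} second={second:.1%} third={third:.1%}")
```

For a1, a2, and a3 this yields a first ratio of 1%, second ratios of 20%, 30%, and 50%, and third ratios of 0.2%, 0.3%, and 0.5%, matching the figures quoted in the example.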
Further, step S25 can also include: obtaining the check coefficient corresponding to the field; and calculating, from the first value, the second value, the third value, and the check coefficient, the probability that the field causes an error in the sample data as:
where h denotes the check coefficient.
The check coefficient is set in advance according to how much attention the field deserves, and its size is proportional to that attention: the more attention a field deserves, the larger its check coefficient. For example, the check coefficient of a field that needs no attention is set to 0; the check coefficient of a field that deserves little attention is set to a number between 0 and 1; and the check coefficient of a field that deserves much attention is set to a number greater than 1.
The attention a field deserves also depends on the importance of the field and its impact on system business. For example, in user data, the field that reflects the user's authentication state is relatively important; an error in it has a large impact on system business, so its check coefficient can be set to a value greater than 1.
For example, in the specific example above, the check coefficients of the three fields a1, a2, and a3 are h1 = 0, h2 = 0.8, and h3 = 1.5.
The probabilities that these three fields cause errors in the sample data are then respectively:
It can be seen that field a1 is not a field of concern and that an error in it has very little impact on system business, so it is not verified. When the preset check fields are selected, the two fields a2 and a3 can be chosen as the fields to verify.
In another embodiment of the present invention, to further reduce verification time, a threshold can be set on the volume of data to be verified, for example verifying only 2,000,000 records per day. Step S12 above then includes: obtaining a first preset threshold; and selecting, from the second database, data to be verified that was synchronized from the first database, up to the first preset threshold.
Fig. 3 is a flowchart of a data verification method provided by another embodiment of the present invention. As shown in Fig. 3, the data verification method includes the following steps:
Step S31, fetch data to be verified from the second database according to the preset check fields.
Not all data in the database has the preset check fields, so when data to be verified is extracted, data that has the preset check fields can be selected. In addition, a preset check field that has been modified is more likely to contain an error, so data whose preset check fields have been modified can be selected as the data to be verified.
Step S32, determine whether the volume of data to be verified exceeds the first preset threshold; if so, perform step S33; if not, perform step S34.
Step S33, truncate the data to be verified that is currently being verified.
Step S34, verify the preset check fields of the data to be verified.
Step S35, output the verification result.
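Steps S31 to S33 can be sketched as a selection loop with a cap. The function name pick_records_to_verify and its parameters are hypothetical, records are modeled as dictionaries, and the sketch assumes that a record qualifies only if it carries every preset check field; the text does not say how records with some but not all of the fields are handled.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def pick_records_to_verify(
    records: Iterable[Record],
    preset_fields: List[str],
    max_records: int,
) -> List[Record]:
    """Keep records that carry every preset check field, capped at max_records."""
    picked: List[Record] = []
    for record in records:
        if len(picked) >= max_records:
            break  # truncate: the first preset threshold has been reached
        if all(field in record for field in preset_fields):
            picked.append(record)
    return picked


# Hypothetical records; the second one lacks the "account" field and is skipped.
rows = [
    {"id": 1, "account": "a"},
    {"id": 2},
    {"id": 3, "account": "b"},
    {"id": 4, "account": "c"},
]
print([r["id"] for r in pick_records_to_verify(rows, ["account"], 2)])  # [1, 3]
```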
In addition, there can be multiple preset check fields, and verifying each field requires one scan of the data, so verifying 2 fields requires checking the data 2 times. The volume of data actually verified equals the number of records to be verified multiplied by the number of preset check fields.
In another embodiment of the present invention, a threshold can also be set on the verified data volume. Step S13 above includes:
obtaining a second preset threshold;
counting the verified data volume, where, when there is more than one preset check field, the verified data volume is increased by 1 each time one preset check field is verified;
stopping verification when the verified data volume is greater than the second preset threshold.
For example, the second preset threshold is set to 5,000,000, and the verified data volume is increased by 1 each time a preset check field is verified. If the verified data volume reaches 5,000,000 while a preset check field of some record is being verified, verification is truncated and subsequent verification stops.
In this embodiment, limiting the verified data volume prevents large-scale data verification from occupying system resources and affecting the system's other work.
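The counting rule above can be sketched as a verification loop with a budget. The names verify_with_budget, check, and max_checks are hypothetical, and each record is assumed to carry an "id" key. Following the 5,000,000 example, the sketch stops as soon as the count reaches the threshold.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


def verify_with_budget(
    records: List[Record],
    preset_fields: List[str],
    check: Callable[[Record, str], bool],
    max_checks: int,
) -> Tuple[List[Tuple[Any, str, bool]], int]:
    """Verify field by field and stop once max_checks field checks were made."""
    results: List[Tuple[Any, str, bool]] = []
    checks_done = 0
    for record in records:
        for field in preset_fields:
            if checks_done >= max_checks:
                return results, checks_done  # truncate: skip all later checks
            results.append((record["id"], field, check(record, field)))
            checks_done += 1  # one verified field adds 1 to the verified volume
    return results, checks_done


# Hypothetical data: 3 records x 2 fields would need 6 checks; the budget is 5.
data = [{"id": i, "account": "x", "auth_state": "ok"} for i in range(3)]
results, done = verify_with_budget(data, ["account", "auth_state"], lambda r, f: True, 5)
print(done, results[-1][:2])  # 5 (2, 'account')
```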
In another embodiment, erroneous data can be corrected automatically. Fig. 4 is a flowchart of a data verification method provided by another embodiment of the present invention. As shown in Fig. 4, the method further includes the following steps:
Step S41, when the verification result is that a preset check field is wrong, obtain the data identifier of the data to be verified;
Step S42, look up the original data corresponding to the data identifier in the first database;
Step S43, correct the preset check field of the data to be verified according to the original data.
In this embodiment, erroneous data is corrected automatically without manual modification, which improves the efficiency of data correction and reduces manual work.
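Steps S41 to S43 can be sketched as follows, again modeling both databases as dictionaries keyed by the data identifier. The function name correct_record and the sample data are hypothetical; a real system would issue database reads and writes instead.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def correct_record(
    record_id: str,
    wrong_fields: List[str],
    source_db: Dict[str, Record],
    target_db: Dict[str, Record],
) -> bool:
    """Overwrite the wrong preset check fields in target_db with the original values.

    Returns False when the original record cannot be found in source_db.
    """
    original = source_db.get(record_id)  # S42: look up the original data
    if original is None or record_id not in target_db:
        return False
    for field in wrong_fields:           # S43: correct only the failing fields
        target_db[record_id][field] = original[field]
    return True


# Hypothetical data: the synchronized copy has a wrong authentication state.
src = {"u1": {"account": "A-1", "auth_state": "verified"}}
dst = {"u1": {"account": "A-1", "auth_state": "pending"}}
print(correct_record("u1", ["auth_state"], src, dst), dst["u1"]["auth_state"])
# prints: True verified
```

Only the fields that failed verification are rewritten, so the remaining fields of the synchronized record are left untouched.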
The following are apparatus embodiments of the present disclosure, which can be used to carry out the method embodiments of the present disclosure.
Fig. 5 is a block diagram of a data verification apparatus provided by an embodiment of the present invention. The apparatus can be implemented as part or all of an electronic device through software, hardware, or a combination of both. As shown in Fig. 5, the data verification apparatus includes:
a field obtaining module 501, configured to obtain preset check fields, where the probability that an error in a preset check field causes a data error meets a preset condition;
a data obtaining module 502, configured to obtain, from the second database, data to be verified that was synchronized from the first database;
a verification module 503, configured to verify the preset check fields in the data to be verified.
Fig. 6 is a block diagram of a data verification apparatus provided by another embodiment of the present invention. As shown in Fig. 6, optionally, the apparatus further includes:
a quantity obtaining module 504, configured to obtain the quantity of sample data as the first value before the preset check fields are obtained;
an extraction module 505, configured to extract error sample data from the sample data, where the error sample data is the sample data in which an error occurred, and the quantity of error sample data is the second value;
an analysis module 506, configured to analyze the error sample data to determine the field in which the error occurred;
a statistics module 507, configured to count the quantity of error sample data corresponding to the field as the third value;
a calculation module 508, configured to calculate, from the first value, the second value, and the third value, the probability that the field causes an error in the sample data;
a judgment module 509, configured to determine whether the probability meets the preset condition, where meeting the preset condition includes at least one of the following: the probability is greater than or equal to a preset probability threshold; or the probability is among the first preset number of probabilities when the probabilities are sorted in descending order;
a setting module 510, configured to set the field corresponding to a probability that meets the preset condition as a preset check field.
Optionally, the calculation module 508 is specifically configured to: calculate, from the first value and the second value, the first ratio, which is the share of error sample data in all sample data; calculate, from the second value and the third value, the second ratio, which is the share of the error sample data corresponding to the field in all error sample data; calculate, from the first value and the third value, the third ratio, which is the share of the error sample data corresponding to the field in all sample data; and calculate the probability from the first ratio, the second ratio, and the third ratio based on the improved naive Bayes algorithm as:
where P(a|B) denotes the probability, P(B) denotes the first ratio, P(a) denotes the second ratio, and P(B|a) denotes the third ratio.
Optionally, the calculation module 508 is further configured to: obtain the check coefficient corresponding to the field; and calculate, from the first value, the second value, the third value, and the check coefficient, the probability that the field causes an error in the sample data as:
where h denotes the check coefficient.
Fig. 7 is a block diagram of a data verification apparatus provided by another embodiment of the present invention. As shown in Fig. 7, the apparatus further includes:
an identifier obtaining module 511, configured to obtain the data identifier of the data to be verified when the verification result is that a preset check field is wrong;
a lookup module 512, configured to look up the original data corresponding to the data identifier in the first database;
a correction module 513, configured to correct the preset check field of the data to be verified according to the original data.
An embodiment of the present invention also provides an electronic device. As shown in Fig. 8, the electronic device may include a processor 1501, a communication interface 1502, a memory 1503, and a communication bus 1504, where the processor 1501, the communication interface 1502, and the memory 1503 communicate with one another through the communication bus 1504.
The memory 1503 is configured to store a computer program;
The processor 1501 is configured to implement the steps of the above method embodiments when executing the computer program stored in the memory 1503.
The communication bus mentioned for the above electronic device can be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, it is drawn as a single thick line in the figure, which does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the above electronic device and other devices.
The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The present invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the above method embodiments.
It should be noted that the apparatus, electronic device, and computer-readable storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for related details, refer to the corresponding parts of the description of the method embodiments.
It should be further noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The above are only specific embodiments of the present invention, provided so that those skilled in the art can understand or implement the invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features claimed herein.
Claims (11)
1. A data verification method, characterized by comprising:
obtaining a preset check field, wherein the probability that an error in the preset check field causes an error in the data satisfies a preset condition;
obtaining, from a second database, data to be verified that has been synchronized from a first database;
verifying the preset check field in the data to be verified.
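As an illustration of checking only the preset fields rather than every field, here is a minimal sketch. It assumes that verification means comparing the synchronized record with its original in the first database, which this claim does not spell out; the function name `verify_preset_fields` and the sample records are hypothetical.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def verify_preset_fields(
    source: Record, synced: Record, preset_fields: Iterable[str]
) -> List[str]:
    """Return the preset check fields whose values differ between the records.

    Assumption: a field counts as erroneous when its synchronized value
    differs from the value held in the first (source) database.
    """
    return [f for f in preset_fields if source.get(f) != synced.get(f)]


# Hypothetical records: "note" also differs but is not a preset check field,
# so it is never compared.
source_row = {"id": 7, "amount": 120, "status": "paid", "note": "a"}
synced_row = {"id": 7, "amount": 210, "status": "paid", "note": "b"}

mismatched = verify_preset_fields(source_row, synced_row, ["amount", "status"])
print(mismatched)
```

Only the listed fields are read, which is where the saving over a full-field comparison would come from.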
2. The method according to claim 1, characterized in that, before obtaining the preset check field, the method further comprises:
obtaining the number of sample data items as a first value;
extracting erroneous sample data from the sample data, the erroneous sample data being sample data in which an error has occurred, and the number of erroneous sample data items being a second value;
analyzing the erroneous sample data to determine the field in which the error occurred;
counting the number of erroneous sample data items corresponding to the field as a third value;
calculating, from the first value, the second value and the third value, the probability that the field causes an error in the sample data;
judging whether the probability satisfies the preset condition, wherein the probability satisfying the preset condition comprises at least one of the following: the probability is greater than or equal to a preset probability threshold; the probability is among the first preset number of probabilities after the probabilities are sorted in descending order;
setting the field corresponding to a probability that satisfies the preset condition as the preset check field.
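The selection step at the end of this claim can be sketched as follows. This is only an illustration: the per-field probabilities are taken as given, the names `select_check_fields`, `threshold` and `top_k` are hypothetical, and a field is kept when it meets at least one of the two conditions that are enabled.

```python
from typing import Dict, List, Optional


def select_check_fields(
    probabilities: Dict[str, float],
    threshold: Optional[float] = None,
    top_k: Optional[int] = None,
) -> List[str]:
    """Pick fields whose error probability meets at least one preset condition."""
    # Sort field names by probability, largest first.
    ranked = sorted(probabilities, key=lambda f: probabilities[f], reverse=True)
    selected = set()
    if threshold is not None:
        # Condition 1: probability >= preset probability threshold.
        selected.update(f for f in ranked if probabilities[f] >= threshold)
    if top_k is not None:
        # Condition 2: among the first top_k probabilities in descending order.
        selected.update(ranked[:top_k])
    return [f for f in ranked if f in selected]


# Hypothetical per-field probabilities.
probs = {"amount": 0.45, "status": 0.30, "address": 0.05, "phone": 0.20}
by_threshold = select_check_fields(probs, threshold=0.3)
by_rank = select_check_fields(probs, top_k=3)
print(by_threshold, by_rank)
```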
3. The method according to claim 2, characterized in that calculating, from the first value, the second value and the third value, the probability that the field causes an error in the sample data comprises:
calculating, from the first value and the second value, a first ratio, namely the proportion of erroneous sample data among all sample data;
calculating, from the second value and the third value, a second ratio, namely the proportion of the erroneous sample data corresponding to the field among all erroneous sample data;
calculating, from the first value and the third value, a third ratio, namely the proportion of the erroneous sample data corresponding to the field among all sample data;
calculating, based on an improved naive Bayes algorithm, the probability from the first ratio, the second ratio and the third ratio as:
wherein P(a|B) denotes the probability, P(B) denotes the first ratio, P(a) denotes the second ratio, and P(B|a) denotes the third ratio.
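The three ratios named in this claim follow directly from the three counts, as the sketch below shows. The formula that combines them is not reproduced in this text, so the sketch stops at the ratios; the names `Ratios` and `compute_ratios` and the sample counts are hypothetical.

```python
from typing import NamedTuple


class Ratios(NamedTuple):
    first: float   # erroneous samples / all samples, denoted P(B) in the claim
    second: float  # field's erroneous samples / all erroneous samples, P(a)
    third: float   # field's erroneous samples / all samples, P(B|a)


def compute_ratios(n_samples: int, n_errors: int, n_field_errors: int) -> Ratios:
    """Derive the three ratios from the first, second and third values."""
    if n_samples <= 0 or n_errors <= 0:
        raise ValueError("sample and error counts must be positive")
    if not 0 <= n_field_errors <= n_errors <= n_samples:
        raise ValueError("expected 0 <= field errors <= errors <= samples")
    return Ratios(
        first=n_errors / n_samples,
        second=n_field_errors / n_errors,
        third=n_field_errors / n_samples,
    )


# Hypothetical counts: 1000 samples, 50 erroneous, 20 of those due to one field.
ratios = compute_ratios(1000, 50, 20)
print(ratios)
```

With these definitions the third ratio always equals the product of the first and second ratios, which gives a quick sanity check on the counts.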
4. The method according to claim 3, characterized in that calculating, from the first value, the second value and the third value, the probability that the field causes an error in the sample data further comprises:
obtaining a check coefficient corresponding to the field;
calculating, from the first value, the second value, the third value and the check coefficient, the probability that the field causes an error in the sample data as:
wherein h denotes the check coefficient.
5. The method according to claim 1, characterized in that the method further comprises:
when the verification result indicates that the preset check field is erroneous, obtaining the data identifier of the data to be verified;
looking up, in the first database, the original data corresponding to the data identifier;
correcting the preset check field of the data to be verified according to the original data.
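The correction flow in this claim can be sketched as below. The sketch uses an in-memory dictionary keyed by data identifier as a stand-in for the first database; that stand-in, the function name `correct_record`, and the `id_key` parameter are assumptions made for illustration only.

```python
from typing import Any, Dict, Iterable

Record = Dict[str, Any]


def correct_record(
    first_db: Dict[Any, Record],
    synced: Record,
    preset_fields: Iterable[str],
    id_key: str = "id",
) -> Record:
    """Return a copy of the synced record with preset fields restored.

    first_db is a hypothetical stand-in for the first database, mapping each
    data identifier to its original record.
    """
    data_id = synced[id_key]          # obtain the data identifier
    original = first_db[data_id]      # look up the original data
    corrected = dict(synced)          # leave the input record untouched
    for field in preset_fields:
        corrected[field] = original[field]
    return corrected


first_db = {7: {"id": 7, "amount": 120, "status": "paid", "note": "a"}}
synced_row = {"id": 7, "amount": 210, "status": "paid", "note": "b"}
fixed = correct_record(first_db, synced_row, ["amount", "status"])
print(fixed)
```

Fields outside the preset list, such as `note` here, are deliberately left as they were in the synchronized record.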
6. A data verification device, characterized by comprising:
a field obtaining module, configured to obtain a preset check field, wherein the probability that an error in the preset check field causes an error in the data satisfies a preset condition;
a data obtaining module, configured to obtain, from a second database, data to be verified that has been synchronized from a first database;
a verification module, configured to verify the preset check field in the data to be verified.
7. The device according to claim 6, characterized in that the device further comprises:
a quantity obtaining module, configured to obtain, before the preset check field is obtained, the number of sample data items as a first value;
an extraction module, configured to extract erroneous sample data from the sample data, the erroneous sample data being sample data in which an error has occurred, and the number of erroneous sample data items being a second value;
an analysis module, configured to analyze the erroneous sample data and determine the field in which the error occurred;
a statistics module, configured to count the number of erroneous sample data items corresponding to the field as a third value;
a calculation module, configured to calculate, from the first value, the second value and the third value, the probability that the field causes an error in the sample data;
a judgment module, configured to judge whether the probability satisfies the preset condition, wherein the probability satisfying the preset condition comprises at least one of the following: the probability is greater than or equal to a preset probability threshold; the probability is among the first preset number of probabilities after the probabilities are sorted in descending order;
a setting module, configured to set the field corresponding to a probability that satisfies the preset condition as the preset check field.
8. The device according to claim 7, characterized in that the calculation module is specifically configured to: calculate, from the first value and the second value, a first ratio, namely the proportion of erroneous sample data among all sample data; calculate, from the second value and the third value, a second ratio, namely the proportion of the erroneous sample data corresponding to the field among all erroneous sample data; calculate, from the first value and the third value, a third ratio, namely the proportion of the erroneous sample data corresponding to the field among all sample data; and calculate, based on an improved naive Bayes algorithm, the probability from the first ratio, the second ratio and the third ratio as:
wherein P(a|B) denotes the probability, P(B) denotes the first ratio, P(a) denotes the second ratio, and P(B|a) denotes the third ratio.
9. The device according to claim 8, characterized in that the calculation module is further configured to: obtain a check coefficient corresponding to the field; and calculate, from the first value, the second value, the third value and the check coefficient, the probability that the field causes an error in the sample data as:
wherein h denotes the check coefficient.
10. The device according to claim 6, characterized in that the device further comprises:
an identifier obtaining module, configured to obtain the data identifier of the data to be verified when the verification result indicates that the preset check field is erroneous;
a lookup module, configured to look up, in the first database, the original data corresponding to the data identifier;
a correction module, configured to correct the preset check field of the data to be verified according to the original data.
11. An electronic device, characterized by comprising: a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another via the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1-5 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910684893.9A CN110399428B (en) | 2019-07-26 | 2019-07-26 | Data verification method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910684893.9A CN110399428B (en) | 2019-07-26 | 2019-07-26 | Data verification method and device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110399428A (en) | 2019-11-01 |
CN110399428B (en) | 2022-02-11 |
Family
ID=68326253
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201910684893.9A | Active | CN110399428B (en) | 2019-07-26 | 2019-07-26 | Data verification method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110399428B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100299314A1 (en) * | 2005-03-18 | 2010-11-25 | Arijit Sengupta | Identifying and using critical fields in quality management |
US20130279516A1 (en) * | 2010-12-17 | 2013-10-24 | Zte Corporation | Method and Device for Improving Robustness of Context Update Message in Robust Header Compression |
CN106802898A (en) * | 2015-11-26 | 2017-06-06 | 北大方正集团有限公司 | Data entry method and device |
CN109635300A (en) * | 2018-12-14 | 2019-04-16 | 泰康保险集团股份有限公司 | Data verification method and device |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112241328A (en) * | 2020-09-10 | 2021-01-19 | 长沙市到家悠享网络科技有限公司 | Data processing method, device and system |
CN112241328B (en) * | 2020-09-10 | 2024-01-23 | 长沙市到家悠享网络科技有限公司 | Data processing method, device and system |
Also Published As
Publication number | Publication date |
---|---|
CN110399428B (en) | 2022-02-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||