CN108280096A - Data cleaning method and data cleansing device - Google Patents

Publication number
CN108280096A
Authority
CN
China
Prior art keywords
sample data
data
sample
raw
screening
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710011044.8A
Other languages
Chinese (zh)
Inventor
赵强
杨敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710011044.8A
Publication of CN108280096A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a data cleaning method and a data cleaning device. The data cleaning method includes: obtaining raw sample data to be cleaned; determining at least one data screening mechanism for cleaning the raw sample data, and obtaining the screening value that a user sets for each data screening mechanism according to the raw sample data; and screening the raw sample data according to the at least one data screening mechanism and the screening value set by the user, so as to clean the raw sample data. The technical solution of the present invention enables comprehensive cleaning of the raw sample data, reduces the dependence of the data cleaning process on operators, ensures the accuracy and stability of the cleaning results, and effectively shortens the time required for data cleaning.

Description

Data cleaning method and data cleansing device
Technical Field
The present invention relates to the technical field of data processing, and in particular to a data cleaning method and a data cleaning device.
Background
In quantitative user research and in the processing of lightweight data, the data must be cleaned to remove abnormal data and to ensure the reliability and validity of the results. At present, because survey data and lightweight data vary widely, data is usually cleaned manually, and a unified, standardized cleaning process is lacking. Manual cleaning has the following main problems:
1. Data cleaning is time-consuming: manual cleaning relies on operators to judge the data, and after the judgment the cleaning must be completed step by step, which takes a great deal of time;
2. Data cleaning is prone to omissions: when handling large volumes of data, operators may overlook certain conditions, leaving some samples uncleaned;
3. Data cleaning results are unstable: the results may be inconsistent from one operator to another;
4. The data cleaning process cannot be traced back: when a cleaning error occurs, it cannot be looked up and corrected;
5. Verifying data cleaning results is laborious: after cleaning is complete, the data must be tallied again to verify the cleaning results.
Therefore, a new data cleaning scheme is needed.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the present invention, and therefore may include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the Invention
The purpose of the present invention is to provide a data cleaning method and a data cleaning device that overcome, at least to some extent, one or more problems caused by the limitations and defects of the related art.
Other features and advantages of the present invention will become apparent from the following detailed description, or will be learned in part through practice of the present invention.
According to one aspect of the present invention, a data cleaning method is provided, including:
Obtaining raw sample data to be cleaned;
Determining at least one data screening mechanism for cleaning the raw sample data, and obtaining the screening value that a user sets for each data screening mechanism according to the raw sample data;
Screening the raw sample data according to the at least one data screening mechanism and the screening value set by the user, so as to clean the raw sample data.
In an exemplary embodiment of the present invention, where the at least one data screening mechanism includes a sample rejection mechanism and the screening value includes a target sample feature, the step of screening the raw sample data includes:
Analyzing the raw sample data to obtain at least one sample feature in the raw sample data and the sample data corresponding to each sample feature;
Taking the sample data corresponding to the target sample feature as the selected sample data, and deleting the other sample data from the raw sample data.
In an exemplary embodiment of the present invention, where the at least one data screening mechanism includes rating matrix screening and the screening value includes the start and end positions of the rating matrix questions, the step of screening the raw sample data includes:
For any sample data in the raw sample data, counting the number of rating matrix questions answered in that sample data;
Judging whether the number of answered questions equals the total number of rating matrix questions in that sample data;
If the number of answered questions equals the total number, calculating the variance of the rating matrix corresponding to that sample data, and determining according to this variance whether to delete that sample data from the raw sample data;
If the number of answered questions does not equal the total number, deleting that sample data from the raw sample data.
In an exemplary embodiment of the present invention, the step of determining, according to the variance of the rating matrix corresponding to the sample data, whether to delete that sample data from the raw sample data includes:
If the variance of the rating matrix corresponding to the sample data is 0, deleting that sample data from the raw sample data;
If the variance of the rating matrix corresponding to the sample data is not 0, retaining that sample data in the raw sample data.
In an exemplary embodiment of the present invention, where the at least one data screening mechanism includes answering-time screening and the screening value includes the storage location of the answering time, the step of screening the raw sample data includes:
For any sample data in the raw sample data, obtaining the answering time of that sample data from the storage location of the answering time;
Judging whether the answering time of that sample data matches the standard answering time corresponding to that sample data;
If the answering time of that sample data does not match the standard answering time, deleting that sample data from the raw sample data;
If the answering time of that sample data matches the standard answering time, retaining that sample data in the raw sample data.
In an exemplary embodiment of the present invention, the data cleaning method further includes:
After the raw sample data is obtained, grouping the sample data with the same number of answered questions into one group, so as to obtain at least one group of sample data;
For any group of sample data in the at least one group, calculating the average answering time of the group and the standard deviation of the answering times of the group;
Calculating the standard answering time corresponding to each piece of sample data according to the group's average answering time, the standard deviation of the group's answering times, and the answering time of each piece of sample data in the group.
In an exemplary embodiment of the present invention, the standard answering time corresponding to each piece of sample data in any group of sample data is calculated according to the following formula:
$$Z = \frac{x - \bar{x}}{\delta}$$
where $Z$ denotes the standard answering time corresponding to each piece of sample data, $x$ denotes the answering time of each piece of sample data, $\bar{x}$ denotes the average answering time of the group of sample data, and $\delta$ denotes the standard deviation of the answering times of the group of sample data.
In an exemplary embodiment of the present invention, where the at least one data screening mechanism includes logic-jump screening and the screening value includes the start and end questions of the logic jump, the step of screening the raw sample data includes:
For any sample data in the raw sample data, calculating, according to the start and end questions of the logic jump, the sum of the answer data between the start and end questions of the logic jump in that sample data;
If the sum of the answer data between the start and end questions of the logic jump is not 0, deleting that sample data from the raw sample data;
If the sum of the answer data between the start and end questions of the logic jump is 0, retaining that sample data in the raw sample data.
In an exemplary embodiment of the present invention, where the at least one data screening mechanism includes regular logic screening and the screening value includes a question selected by the user and a logic standard value set by the user, the step of screening the raw sample data includes:
For any sample data in the raw sample data, judging whether the regular logic corresponding to the selected question matches the logic standard value;
If the regular logic corresponding to the selected question does not match the logic standard value, deleting that sample data from the raw sample data;
If the regular logic corresponding to the selected question matches the logic standard value, retaining that sample data in the raw sample data.
In an exemplary embodiment of the present invention, the data cleaning method further includes: backing up the sample data deleted from the raw sample data.
In an exemplary embodiment of the present invention, the data cleaning method further includes:
After cleaning of the raw sample data is complete, compiling statistics on how each question was answered in the resulting clean sample data;
Generating a statistical chart from the per-question answer statistics of the clean sample data, and displaying the statistical chart.
According to another aspect of the present invention, a data cleaning device is provided, including:
An acquiring unit, configured to obtain raw sample data to be cleaned;
A determination unit, configured to determine at least one data screening mechanism for cleaning the raw sample data and the screening value that the user sets for each data screening mechanism;
A processing unit, configured to screen the raw sample data according to the at least one data screening mechanism and the screening value set by the user, so as to clean the raw sample data.
In the technical solutions provided by some embodiments of the present invention, at least one data screening mechanism for cleaning the raw sample data is determined, and the screening value that the user sets for each data screening mechanism according to the raw sample data is obtained, so that the raw sample data is screened based on the at least one data screening mechanism and the user-set screening values. Data screening mechanisms can thus be combined when screening the data, which enables comprehensive cleaning of the raw sample data and avoids the problem of some sample data going uncleaned because conditions were overlooked during cleaning. Meanwhile, because the data cleaning device can clean the raw sample data automatically according to the determined data screening mechanisms and the user-set screening values, the dependence of the data cleaning process on operators is reduced, the accuracy and stability of the cleaning results are ensured, and the time required for data cleaning is effectively shortened.
In the technical solutions provided by some embodiments of the present invention, the sample data deleted from the raw sample data is backed up, so that errors in data cleaning can be looked up and corrected in time, which ensures that the data cleaning process can be traced back.
In addition, in the technical solutions provided by some embodiments of the present invention, after cleaning of the raw sample data is complete, statistics are compiled on how each question was answered in the resulting clean sample data, and a statistical chart is generated from the statistics, so that the user can check the data cleaning results faster and more intuitively.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the present invention.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the specification, serve to explain the principles of the present invention. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 schematically shows a flowchart of the data cleaning method according to the first embodiment of the present invention;
Fig. 2 schematically shows a flowchart of the data cleaning method according to the second embodiment of the present invention;
Fig. 3 schematically shows a processing flowchart of the sample rejection mechanism according to an embodiment of the present invention;
Fig. 4 schematically shows a processing flowchart of rating matrix screening according to an embodiment of the present invention;
Fig. 5 schematically shows a processing flowchart of answering-time screening according to an embodiment of the present invention;
Fig. 6 schematically shows a processing flowchart of logic-jump screening according to an embodiment of the present invention;
Fig. 7 schematically shows a block diagram of the data cleaning device according to an embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the example embodiments can be implemented in a variety of forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present invention will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art.
In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of the present invention. However, those skilled in the art will appreciate that the technical solution of the present invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so on. In other cases, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the present invention.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts shown in the drawings are merely illustrative; they need not include all contents and operations/steps, nor must the steps be executed in the order described. For example, some operations/steps can be decomposed, and some can be merged or partially merged, so the actual order of execution may change according to the actual situation.
Fig. 1 schematically shows a flowchart of the data cleaning method according to the first embodiment of the present invention.
Specifically, as shown in Fig. 1, in step S102, raw sample data to be cleaned is obtained.
According to an example embodiment, obtaining the raw sample data to be cleaned may mean obtaining raw sample data uploaded by the user, or obtaining the raw sample data from a storage location specified by the user.
In step S104, at least one data screening mechanism for cleaning the raw sample data is determined, and the screening value that the user sets for each data screening mechanism according to the raw sample data is obtained.
According to an example embodiment, the at least one data screening mechanism for cleaning the raw sample data may be determined according to the user's selection. Specifically, all data screening mechanisms can be presented to the user (for example, shown on a display screen) for selection, and the at least one data screening mechanism for cleaning the raw sample data is determined from the user's selection operation. After the at least one data screening mechanism has been determined, the screening value that the user sets for each data screening mechanism can be obtained. Because the user knows the raw sample data to be cleaned, the user can set a screening value for each data screening mechanism according to the raw sample data.
In step S106, the raw sample data is screened according to the at least one data screening mechanism and the screening value set by the user, so as to clean the raw sample data.
According to an example embodiment, when the raw sample data is screened according to the at least one data screening mechanism and the user-set screening values, the screening processes may be performed simultaneously or in a predetermined order.
The following describes in detail how the raw sample data is screened, taking different data screening mechanisms and user-set screening values as examples:
Screening mechanism one:
According to an example embodiment of the present invention, where the at least one data screening mechanism includes a sample rejection mechanism and the screening value includes a target sample feature, the step of screening the raw sample data includes:
Analyzing the raw sample data to obtain at least one sample feature in the raw sample data and the sample data corresponding to each sample feature;
Taking the sample data corresponding to the target sample feature as the selected sample data, and deleting the other sample data from the raw sample data.
It should be noted that the purpose of the sample rejection mechanism is to clean out sample data that does not meet the survey requirements; specifically, sample data that does not match the target sample feature is deleted from the raw sample data.
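The sample rejection step might look like the sketch below. It assumes that a sample feature is simply the value stored under one key of a dictionary; `reject_samples`, `feature_key` and the `platform` field are hypothetical names used only for illustration.

```python
from collections import defaultdict


def reject_samples(raw_samples, feature_key, target_feature):
    """Keep the samples whose feature equals the target; return (kept, deleted)."""
    # Analyze the raw data: group the samples by the feature they carry.
    by_feature = defaultdict(list)
    for sample in raw_samples:
        by_feature[sample[feature_key]].append(sample)

    kept = by_feature.get(target_feature, [])
    deleted = [s for feature, group in by_feature.items()
               if feature != target_feature
               for s in group]
    return kept, deleted


raw = [
    {"id": 1, "platform": "target"},
    {"id": 2, "platform": "other"},
    {"id": 3, "platform": "target"},
]
kept, deleted = reject_samples(raw, "platform", "target")
```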
Screening mechanism two:
According to an example embodiment of the present invention, where the at least one data screening mechanism includes rating matrix screening and the screening value includes the start and end positions of the rating matrix questions, the step of screening the raw sample data includes:
For any sample data in the raw sample data, counting the number of rating matrix questions answered in that sample data;
Judging whether the number of answered questions equals the total number of rating matrix questions in that sample data;
If the number of answered questions equals the total number, calculating the variance of the rating matrix corresponding to that sample data, and determining according to this variance whether to delete that sample data from the raw sample data;
If the number of answered questions does not equal the total number, deleting that sample data from the raw sample data.
It should be noted that the purpose of rating matrix screening is to clean out sample data with missed or careless answers in the rating matrix. If the number of rating matrix questions answered in a piece of sample data differs from the total number of rating matrix questions, some rating matrix questions in that sample data were probably left unanswered, so the sample data needs to be deleted.
According to an example embodiment, if the variance of the rating matrix corresponding to the sample data is 0, that sample data is deleted from the raw sample data; if the variance is not 0, that sample data is retained in the raw sample data.
It should be noted that if the variance of the rating matrix corresponding to a piece of sample data is 0, all answers to the rating matrix questions in that sample data are identical, which may be the result of careless answering. Sample data whose rating matrix variance is 0 therefore needs to be deleted from the raw sample data.
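The two checks of rating matrix screening (completeness, then zero variance) can be sketched as follows. The sketch assumes that the answers of one respondent are a list with `None` for an unanswered question, that `start` and `end` are inclusive 0-based positions, and that the population variance is used; none of these details are fixed by the patent.

```python
from statistics import pvariance


def keep_by_rating_matrix(answers, start, end):
    """Return True if the sample passes rating matrix screening.

    answers: one entry per question, None meaning "not answered" (assumed encoding).
    start, end: inclusive positions of the rating matrix questions (the screening value).
    """
    matrix = answers[start:end + 1]
    answered = [a for a in matrix if a is not None]
    if len(answered) != len(matrix):
        return False                    # missed answers: delete
    return pvariance(answered) != 0     # identical ratings everywhere: delete
```

For example, `keep_by_rating_matrix([1, 3, 3, 3, 2], 1, 3)` rejects the sample because questions 1 to 3 all carry the same rating.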
Screening mechanism three:
According to an example embodiment of the present invention, where the at least one data screening mechanism includes answering-time screening and the screening value includes the storage location of the answering time, the step of screening the raw sample data includes:
For any sample data in the raw sample data, obtaining the answering time of that sample data from the storage location of the answering time;
Judging whether the answering time of that sample data matches the standard answering time corresponding to that sample data;
If the answering time of that sample data does not match the standard answering time, deleting that sample data from the raw sample data;
If the answering time of that sample data matches the standard answering time, retaining that sample data in the raw sample data.
It should be noted that the purpose of answering-time screening is to clean out sample data whose answering time is too short or too long.
The standard answering time corresponding to any sample data is determined as follows:
After the raw sample data is obtained, grouping the sample data with the same number of answered questions into one group, so as to obtain at least one group of sample data;
For any group of sample data in the at least one group, calculating the average answering time of the group and the standard deviation of the answering times of the group;
Calculating the standard answering time corresponding to each piece of sample data according to the group's average answering time, the standard deviation of the group's answering times, and the answering time of each piece of sample data in the group.
According to an example embodiment of the present invention, the standard answering time corresponding to each piece of sample data in any group of sample data can be calculated according to the following formula:
$$Z = \frac{x - \bar{x}}{\delta}$$
where $Z$ denotes the standard answering time corresponding to each piece of sample data, $x$ denotes the answering time of each piece of sample data, $\bar{x}$ denotes the average answering time of the group of sample data, and $\delta$ denotes the standard deviation of the answering times of the group of sample data.
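The grouping and standardization can be sketched as below, reading the standard answering time as the standard score of the answering time within its group. Several details are assumptions rather than part of the patent: the dictionary keys `id`, `answered` and `seconds`, the use of the population standard deviation, treating a group with zero spread as Z = 0, and above all the cut-off `max_abs_z`, since the text does not say what counts as a "match".

```python
from collections import defaultdict
from statistics import mean, pstdev


def standard_times(samples):
    """Return {sample id: Z}, standardizing within groups of equal answer count."""
    groups = defaultdict(list)
    for s in samples:
        groups[s["answered"]].append(s)

    z = {}
    for group in groups.values():
        times = [s["seconds"] for s in group]
        avg, sd = mean(times), pstdev(times)
        for s in group:
            # Assumption: a group with no spread has no outliers, so Z is 0.
            z[s["id"]] = 0.0 if sd == 0 else (s["seconds"] - avg) / sd
    return z


def keep_by_time(samples, max_abs_z=2.0):
    """Keep samples whose |Z| stays within a hypothetical cut-off."""
    z = standard_times(samples)
    return [s for s in samples if abs(z[s["id"]]) <= max_abs_z]


samples = [{"id": i, "answered": 20, "seconds": 100} for i in range(1, 6)]
samples.append({"id": 6, "answered": 20, "seconds": 1000})  # far slower than the rest
samples.append({"id": 7, "answered": 18, "seconds": 40})    # alone in its group
kept = keep_by_time(samples)
```

A symmetric cut-off removes both very short and very long answering times, which is the stated purpose of this screening mechanism.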
Screening mechanism four:
According to an example embodiment of the present invention, where the at least one data screening mechanism includes logic-jump screening and the screening value includes the start and end questions of the logic jump, the step of screening the raw sample data includes:
For any sample data in the raw sample data, calculating, according to the start and end questions of the logic jump, the sum of the answer data between the start and end questions of the logic jump in that sample data;
If the sum of the answer data between the start and end questions of the logic jump is not 0, deleting that sample data from the raw sample data;
If the sum of the answer data between the start and end questions of the logic jump is 0, retaining that sample data in the raw sample data.
It should be noted that if the logic jump worked normally, the answer data between the start and end questions of the logic jump should be 0. If the answer data between the start and end questions of the logic jump in a piece of sample data is not 0, the logic jump in that sample data is abnormal, and the sample data can be deleted from the raw sample data.
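A minimal sketch of the logic-jump check follows. It assumes that answers are stored as non-negative numeric codes with 0 meaning "not answered", and that "between" means strictly between the start and end questions; both are interpretations, and `keep_by_logic_jump` is a hypothetical name.

```python
def keep_by_logic_jump(answers, start, end):
    """Return True if the questions skipped by the jump were left unanswered.

    answers: numeric answer codes, 0 meaning "not answered" (assumed encoding).
    start, end: 0-based positions of the start and end questions of the jump.
    """
    skipped = answers[start + 1:end]   # questions strictly between start and end
    return sum(skipped) == 0
```

Here `keep_by_logic_jump([2, 0, 3, 0, 1], 0, 4)` returns `False`, because the respondent answered a question that the jump should have skipped.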
Screening mechanism five:
According to an example embodiment of the present invention, where the at least one data screening mechanism includes regular logic screening and the screening value includes a question selected by the user and a logic standard value set by the user, the step of screening the raw sample data includes:
For any sample data in the raw sample data, judging whether the regular logic corresponding to the selected question matches the logic standard value;
If the regular logic corresponding to the selected question does not match the logic standard value, deleting that sample data from the raw sample data;
If the regular logic corresponding to the selected question matches the logic standard value, retaining that sample data in the raw sample data.
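The text does not define what form the logic standard value takes. One possible reading, shown here purely as an assumption, is a set of permitted answers for the selected question; `keep_by_regular_logic`, `allowed` and the `age_band` question are hypothetical.

```python
def keep_by_regular_logic(sample, question, allowed):
    """Return True if the answer to the selected question is a permitted value.

    allowed: the user-set logic standard value, modelled here as a set of answers.
    """
    return sample.get(question) in allowed


# Hypothetical screening value: only respondents in these age bands are valid.
allowed_bands = {"18-25", "26-35", "36-45"}
samples = [
    {"id": 1, "age_band": "26-35"},
    {"id": 2, "age_band": "under 18"},
    {"id": 3},                          # question missing from the record
]
kept = [s for s in samples if keep_by_regular_logic(s, "age_band", allowed_bands)]
```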
On the basis of the above data cleaning method, the sample data deleted from the raw sample data can be backed up, so that errors in data cleaning can be looked up and corrected in time and the data cleaning process can be traced back.
In addition, according to an example embodiment of the present invention, so that the user can check the data cleaning results faster and more intuitively, statistics can be compiled, after cleaning of the raw sample data is complete, on how each question was answered in the resulting clean sample data; a statistical chart is then generated from these statistics and displayed.
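The per-question statistics can be sketched with a counter per question. The text chart is only a stand-in for the statistical chart, which the patent does not specify; `answer_distribution` and `text_chart` are hypothetical names, and each clean sample is assumed to be a dictionary mapping a question to its answer.

```python
from collections import Counter


def answer_distribution(clean_samples):
    """Return {question: Counter of answers} over the clean sample data."""
    stats = {}
    for sample in clean_samples:
        for question, answer in sample.items():
            stats.setdefault(question, Counter())[answer] += 1
    return stats


def text_chart(counter):
    """Render one question's distribution as a simple text bar chart."""
    return "\n".join(f"{answer}: {'#' * count}"
                     for answer, count in sorted(counter.items()))


clean_samples = [
    {"q1": "A", "q2": "yes"},
    {"q1": "B", "q2": "yes"},
    {"q1": "A", "q2": "no"},
]
stats = answer_distribution(clean_samples)
chart = text_chart(stats["q1"])
```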
Fig. 2 schematically shows a flowchart of the data cleaning method according to the second embodiment of the present invention.
With reference to Fig. 2, the raw sample data is read in step S20. For example, the raw sample data can be read from a storage location specified by the user, or raw sample data uploaded by the user can be read directly.
In step S22, the screening mechanisms are selected. It should be noted that the various screening mechanisms can be implemented as programs and encapsulated, so that when performing data cleaning the user can select the screening mechanisms to be used. Of course, in some embodiments of the present invention, the raw sample data can also be cleaned directly using default data screening mechanisms, without user selection.
In step S24, data cleaning is performed using the selected screening mechanisms. The screening mechanisms include any one or a combination of the following: the sample rejection mechanism, the rating matrix screening mechanism, the answering-time screening mechanism, the logic-jump screening mechanism, and the regular logic screening mechanism.
In step S26, the clean data obtained after screening is output.
The flow of each data screening mechanism is described in detail below:
1. Sample rejection mechanism.
The purpose of the sample rejection mechanism is to clean out sample data that does not meet the survey requirements. Specifically, the raw sample data can be labeled, and then, according to the purpose of the survey, the sample population that matches the analysis target is selected from the sample data. The screening criteria can be based on the population proportions in the existing data, so that sample populations with distinct characteristics are extracted from the data. Screening the raw sample data on the basis of the data itself in this way makes the screening of survey data faster and better.
According to an example embodiment of the present invention, when conducting a user survey, for example, the raw sample data that is read may be survey data from users across the whole network, while the data analysis is meant to analyze and optimize for the users of a target platform. In that case not all of the data can be included in the analysis, so checking the target population in the collected data is the first step of the survey analysis. Specifically, the proportion of each stratum of data can be calculated for the raw sample data, labels can then be defined for the sample features, and finally the screening is performed.
Detailed process includes the following steps with reference to Fig. 3:
Step S302 reads sample data.
Step S304 sets screening value, i.e., is set up for sample rejecting machine and set screening value.
Step S306 carries out sample data judgement.
Step S308, whether judgement sample data match with screening value, if so, thening follow the steps S310;Otherwise, step is executed Rapid S312.
Step S310 reads lower a data, and executes step S306 and continue to judge.
Step S312 deletes sample data when sample data is matched with screening value.
Step S314: count the number of deleted data items.
Step S316: when the number of deleted data items exceeds the limit, restore the deleted data.
It should be noted that, in the technical solution of the present invention, deleted data needs to be backed up so that it can be recovered if an abnormality occurs during data cleaning. When too much sample data is deleted, the remaining sample data cannot meet the survey requirements. Steps S314 and S316 therefore restore the deleted data in that case, so that the data screening can be performed again.
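The delete, count and restore steps of Fig. 3 can be sketched roughly as follows. This is a minimal illustration rather than the patented implementation: the function name `reject_samples`, the `label` field and the `max_deleted_ratio` threshold are assumptions introduced here, because the text does not state how the limit on deletions is defined.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, object]


def reject_samples(
    samples: List[Sample],
    target_label: str,
    max_deleted_ratio: float = 0.5,  # assumed limit; the source gives no value
) -> Tuple[List[Sample], List[Sample], bool]:
    """Keep samples whose label equals the screening value; back up the rest.

    Returns (kept, backup, restored). If too many samples were deleted,
    the deletion is undone and `restored` is True.
    """
    kept: List[Sample] = []
    backup: List[Sample] = []  # deleted samples are backed up, not discarded
    for sample in samples:
        if sample.get("label") == target_label:
            kept.append(sample)
        else:
            backup.append(sample)

    # Count the deletions and restore everything if the limit is exceeded.
    if samples and len(backup) / len(samples) > max_deleted_ratio:
        return list(samples), [], True
    return kept, backup, False
```

Returning the backup alongside the kept samples lets the caller adjust the screening value and rerun the screening without reloading the raw data.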
2. Rating matrix screening.
The purpose of rating matrix screening is to clean out sample data in which rating-matrix questions were left unanswered or were answered carelessly. The main flow is as follows: first, calculate the number of rating-matrix questions answered in the sample data; second, judge whether the answers are complete; if they are complete, calculate the variance of the rating matrix corresponding to the sample data; finally, use that variance to judge whether the sample data was answered carelessly.
That is, based on the interval-scale nature of rating-matrix questions, the rating matrix screening mechanism groups the consecutively rated question items, automatically calculates the variance of those consecutive items, and finally rejects the sample data whose variance is 0.
Referring to Fig. 4, the detailed process includes the following steps:
Step S402: read the addresses of the start and end questions of the rating matrix.
Step S404: calculate the total number of questions between the start and end questions of the rating matrix.
Step S406: judge whether the number of answered rating-matrix questions equals the total number of questions; if so, execute step S408; otherwise, execute step S412.
Step S408: calculate the variance of the rating matrix.
Step S410: judge whether the variance of the rating matrix is 0; if so, execute step S412.
Step S412: delete the data.
Step S414: count the number of deleted data items.
Step S416: when the number of deleted data items exceeds the limit, restore the deleted data.
It should be noted that, in the technical solution of the present invention, deleted data needs to be backed up so that it can be recovered if an abnormality occurs during data cleaning. When too much sample data is deleted, the remaining sample data cannot meet the survey requirements. Steps S414 and S416 therefore restore the deleted data in that case, so that the data screening can be performed again.
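The completeness check and the variance check of Fig. 4 can be illustrated with the short sketch below. It assumes, for illustration only, that each respondent is a list of answers in which `None` marks an unanswered question and that `start` and `stop` are the column positions of the first and last rating-matrix questions; the function name is hypothetical.

```python
from statistics import pvariance
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[int]]  # one respondent's answers; None = unanswered


def screen_rating_matrix(
    rows: List[Row], start: int, stop: int
) -> Tuple[List[Row], List[Row]]:
    """Split rows into (kept, deleted) using columns start..stop inclusive."""
    total = stop - start + 1  # number of rating-matrix questions
    kept: List[Row] = []
    deleted: List[Row] = []
    for row in rows:
        answers = [a for a in row[start : stop + 1] if a is not None]
        if len(answers) != total:
            deleted.append(row)  # incomplete: some questions were left blank
        elif pvariance(answers) == 0:
            deleted.append(row)  # identical ratings throughout
        else:
            kept.append(row)
    return kept, deleted
```

A variance of 0 only detects respondents who gave the same rating to every item; other careless patterns would need additional rules that the text does not describe.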
3. Answering time screening.
The purpose of answering time screening is to delete sample data whose answering time is too short or too long. The main flow is as follows: first, set the time column of the data, that is, specify where the answering time is stored; second, set the answering time standard; finally, screen the sample data by comparing its answering time with the answering time standard.
Specifically, after the sample data is read, the number of answered questions is first counted for each collected sample, and sample data with the same number of answered questions is placed in one group, yielding at least one group of sample data. Then, for each group, the average answering time and the standard deviation of the answering time are calculated, and the answering-time standard score of each sample in the group is calculated according to the formula $Z = \frac{x - \bar{x}}{\delta}$, where $x$ is the answering time of the sample, $\bar{x}$ is the average answering time of its group, and $\delta$ is the standard deviation of the answering time of that group. Finally, according to the 3δ rule, sample data whose standard score lies outside ±3 is deleted. In other words, this data screening mechanism groups the data first and then uses the standard score to delete the samples that do not fall within ±3.
Referring to Fig. 5, the detailed process includes the following steps:
Step S502: read the time column to determine where the answering time of the sample data is stored.
Step S504: set the time standard.
Step S506: read one piece of sample data.
Step S508: judge whether the answering time of the sample data that was read matches the set time standard; if so, execute step S510; otherwise, execute step S512.
Step S510: retain the sample data.
Step S512: delete the data.
It should be noted that, in the technical solution of the present invention, deleted data needs to be backed up so that it can be recovered if an abnormality occurs during data cleaning.
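The grouping and standard-score screening described for this mechanism can be sketched as follows. The field names `answered` and `seconds` are hypothetical, the use of the population standard deviation (`pstdev`) is an assumption because the text does not say which estimator is used, and treating a score of exactly ±3 as acceptable is likewise an assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Sample = Dict[str, float]


def screen_by_answer_time(
    samples: List[Sample], limit: float = 3.0
) -> Tuple[List[Sample], List[Sample]]:
    """Group by number of answered questions, then drop samples with |Z| > limit."""
    groups: Dict[float, List[Sample]] = defaultdict(list)
    for sample in samples:
        groups[sample["answered"]].append(sample)

    kept: List[Sample] = []
    deleted: List[Sample] = []
    for group in groups.values():
        times = [s["seconds"] for s in group]
        average, delta = mean(times), pstdev(times)
        for s in group:
            # A group with no spread has no outliers; avoid dividing by zero.
            z = 0.0 if delta == 0 else (s["seconds"] - average) / delta
            (kept if abs(z) <= limit else deleted).append(s)
    return kept, deleted
```

Note that with a population standard deviation a group of n samples cannot produce a score larger than the square root of n - 1, so very small groups never yield a sample outside ±3.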
4. Skip-logic screening.
The purpose of skip-logic screening is to delete sample data that did not follow the skip logic of the questionnaire system. The main flow is as follows: first, set the start and end questions of the skip logic; second, calculate the sum of the answer data between the start and end questions of the skip logic; finally, use that sum to judge whether the sample contains a skip-logic error.
This data screening logic starts from the logical transitions in the survey data and extracts the skip logic of the survey data. It then checks the data for each skip option in the skip logic one by one and deletes the samples that contain answer data between the skip questions. This scheme avoids repeated logical judgments and separates the work of data analysis from that of data cleaning: the logical judgment is performed directly on the survey data, and samples are then deleted according to what the data shows.
Referring to Fig. 6, the detailed process includes the following steps:
Step S602: read the start and end questions of the logic.
Step S604: read the skip-logic questions.
Step S606: judge whether answering needs to continue after the start question; if so, execute step S608; otherwise, execute step S614.
Step S608: set the number of questions that continue to be answered.
Step S610: calculate the new start question number.
Step S612: judge whether the answer data between the new start question and the end question is 0; if so, execute step S616; otherwise, execute step S618.
Step S614: judge whether the answer data between the start and end questions is 0; if so, execute step S616; otherwise, execute step S618.
Step S616: retain the sample data.
Step S618: delete the sample data.
It should be noted that, in the technical solution of the present invention, deleted data needs to be backed up so that it can be recovered if an abnormality occurs during data cleaning. When too much sample data is deleted, the remaining sample data cannot meet the survey requirements, so the deleted data can be restored in that case and the data screening performed again.
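A rough sketch of the check in Fig. 6 follows. Several details are assumptions made for illustration: answers are stored as a list with `None` for unanswered questions, the "sum of answer data" is interpreted as the count of answered questions between the two positions, and the check is applied only to respondents who chose a hypothetical `trigger` option at the start question.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[int]]  # None = unanswered


def screen_skip_logic(
    rows: List[Row], start: int, stop: int, trigger: int, extra: int = 0
) -> Tuple[List[Row], List[Row]]:
    """Check respondents who chose `trigger` at question `start`.

    Such respondents should jump to question `stop`, optionally after
    answering `extra` further questions, so everything in between must
    be unanswered. Other respondents are not subject to this skip.
    """
    new_start = start + extra  # new start question number
    kept: List[Row] = []
    deleted: List[Row] = []
    for row in rows:
        if row[start] != trigger:
            kept.append(row)
            continue
        answered_between = sum(1 for a in row[new_start + 1 : stop] if a is not None)
        if answered_between == 0:
            kept.append(row)
        else:
            deleted.append(row)
    return kept, deleted
```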
5. Common-sense logic screening.
The purpose of common-sense logic screening is to delete sample data that does not conform to common-sense logic. The main flow is as follows: first, select the common-sense logic questions to be judged; second, set the standard value of the common-sense logic; finally, screen the samples by judging the gap between the values of the common-sense logic questions.
This logic screening mechanism consolidates everyday basic logic. When sample data is analyzed, the data involved in common-sense logic is extracted from the sample data and compared one by one against the basic logic stored in the system, and data that does not conform to the basic logic is judged to be false data and deleted. The mechanism unifies the existing common-sense logic into a stored judgment pool of everyday basic logic, ensuring comprehensive coverage of that logic. This avoids the risk of omissions that exists in current processing and ensures data accuracy. In addition, during the logic comparison, the everyday logic in every piece of data is checked by traversal, ensuring that the data finally retained is true and reliable.
For example, when the selected common-sense logic questions are age and education, if the age in a given piece of sample data is 15 and the education is postgraduate, the sample can be determined not to conform to common-sense logic and can therefore be deleted.
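The age and education example can be expressed as one rule in a small rule pool, as in the sketch below. The minimum ages in `EDU_MIN_AGE` and all function names are hypothetical values chosen for illustration; the text does not specify the standard values stored in the system.

```python
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, object]
Rule = Callable[[Sample], bool]  # returns True when the sample violates the rule

# Hypothetical standard values: the youngest plausible age per education level.
EDU_MIN_AGE = {"primary": 6, "bachelor": 17, "postgraduate": 20}


def violates_age_education(sample: Sample) -> bool:
    min_age = EDU_MIN_AGE.get(str(sample["education"]))
    return min_age is not None and int(sample["age"]) < min_age


RULE_POOL: List[Rule] = [violates_age_education]  # the stored judgment pool


def screen_common_sense(
    samples: List[Sample], rules: List[Rule] = RULE_POOL
) -> Tuple[List[Sample], List[Sample]]:
    """Traverse every sample and every rule; delete samples that break any rule."""
    kept: List[Sample] = []
    deleted: List[Sample] = []
    for sample in samples:
        (deleted if any(rule(sample) for rule in rules) else kept).append(sample)
    return kept, deleted
```

Keeping the rules in one list mirrors the idea of a unified judgment pool: adding a new common-sense check means appending one function rather than editing the screening loop.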
In summary, the embodiments of the present invention mainly provide a standardized, visual data cleaning scheme. The data cleaning mechanisms are encapsulated as standardized programs using an existing mature computer language, which achieves at least the following technical effects:
1. The dependence of the data cleaning process on operators is reduced and the duration of data cleaning is shortened. After the raw sample data is read, the data screening logic can be selected, which enables data cleaning under multiple conditions and improves the efficiency of data cleaning;
2. The data cleaning mechanisms can be classified exhaustively, so that the raw sample data is screened comprehensively, from general logic through to questionnaire logic, which avoids the omission of cleaning conditions during data cleaning;
3. The data cleaning mechanisms are standardized and the data cleaning standard is unified. The cleaning standard no longer depends on the experience of operators, which ensures standardized and stable data cleaning results;
4. The result of each data cleaning step can be cached during the cleaning process, which ensures that the data can be traced back and that changes can be rolled back promptly after a screening condition is set incorrectly;
5. Frequency statistics charts can be generated directly after data cleaning is complete, so that the result of data screening can be viewed more quickly and intuitively.
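The caching and rollback described in effect 4 could look roughly like the following. The class `CleaningSession` and its methods are hypothetical names; the text only states that each step's result is cached so that a wrongly set screening condition can be undone.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


class CleaningSession:
    """Caches the result of every cleaning step so that steps can be undone."""

    def __init__(self, raw: List[T]) -> None:
        self._history: List[List[T]] = [list(raw)]  # index 0 is the raw data

    @property
    def current(self) -> List[T]:
        return self._history[-1]

    def apply(self, keep: Callable[[T], bool]) -> List[T]:
        """Run one screening step and cache its result."""
        self._history.append([s for s in self.current if keep(s)])
        return self.current

    def rollback(self) -> List[T]:
        """Undo the most recent step; the raw data itself is never discarded."""
        if len(self._history) > 1:
            self._history.pop()
        return self.current
```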
Fig. 7 schematically shows a block diagram of a data cleaning device according to an embodiment of the present invention.
Referring to Fig. 7, a data cleaning device 700 according to an embodiment of the present invention includes an acquiring unit 702, a determination unit 704 and a processing unit 706.
Specifically, the acquiring unit 702 is used to obtain the raw sample data to be cleaned; the determination unit 704 is used to determine at least one data screening mechanism for cleaning the raw sample data and the screening value that the user sets for each data screening mechanism; and the processing unit 706 is used to screen the raw sample data according to the at least one data screening mechanism and the screening value set by the user, so as to clean the raw sample data.
According to an example embodiment, at least one data screening mechanism for cleaning the raw sample data is determined, and the screening value that the user sets for each data screening mechanism according to the raw sample data is obtained, so that the raw sample data is screened on the basis of the at least one data screening mechanism and the user-set screening values. The data screening mechanisms can thus be integrated when the data is screened, which enables comprehensive cleaning of the raw sample data and avoids the problem of part of the sample data going uncleaned because a cleaning condition was omitted. At the same time, because the data cleaning device can clean the raw sample data automatically according to the determined data screening mechanisms and the user-set screening values, the dependence of the data cleaning process on operators is reduced, the accuracy and stability of the data cleaning results are ensured, and the duration of data cleaning can be effectively shortened.
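How several selected mechanisms and their user-set screening values might be combined in one pass can be sketched as follows. The two mechanism functions, the field names `label` and `seconds`, and the function `clean_samples` are hypothetical and stand in for whatever the determination unit and processing unit actually use.

```python
from typing import Any, Callable, Dict, List, Tuple

Sample = Dict[str, Any]
Mechanism = Callable[[Sample, Any], bool]  # returns True to keep the sample


def matches_target_feature(sample: Sample, target: Any) -> bool:
    return sample.get("label") == target


def within_time_range(sample: Sample, limits: Any) -> bool:
    low, high = limits
    return low <= sample["seconds"] <= high


def clean_samples(
    raw: List[Sample], selected: List[Tuple[Mechanism, Any]]
) -> Tuple[List[Sample], List[Sample]]:
    """Apply every selected (mechanism, screening value) pair to every sample."""
    clean: List[Sample] = []
    backup: List[Sample] = []  # deleted samples, kept for recovery
    for sample in raw:
        if all(mechanism(sample, value) for mechanism, value in selected):
            clean.append(sample)
        else:
            backup.append(sample)
    return clean, backup
```

Passing the mechanisms as a list of pairs keeps the choice of mechanisms and screening values with the user, while the loop itself stays unchanged.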
Taking different data screening mechanisms and user-set screening values as examples, the following describes in detail how the processing unit 706 screens the raw sample data:
Screening mechanism one:
According to an example embodiment of the present invention, when the at least one data screening mechanism includes the sample rejection mechanism and the screening value includes a target sample feature, the processing unit 706 is configured to:
Analyze the raw sample data to obtain at least one sample feature in the raw sample data and the sample data corresponding to each sample feature;
Take the sample data corresponding to the target sample feature as the screened sample data, and delete the other sample data from the raw sample data.
Screening mechanism two:
According to an example embodiment of the present invention, when the at least one data screening mechanism includes rating matrix screening and the screening value includes the start and end positions of the rating-matrix questions, the processing unit 706 is configured to:
For any piece of sample data in the raw sample data, calculate the number of rating-matrix questions answered in that piece of sample data;
Judge whether the number of answered questions equals the total number of rating-matrix questions in that piece of sample data;
If the number of answered questions equals the total number, calculate the variance of the rating matrix corresponding to that piece of sample data, and determine according to that variance whether to delete the piece of sample data from the raw sample data;
If the number of answered questions does not equal the total number, delete that piece of sample data from the raw sample data.
According to an example embodiment of the present invention, determining according to the variance of the rating matrix corresponding to a piece of sample data whether to delete it from the raw sample data includes:
If the variance of the rating matrix corresponding to the piece of sample data is 0, delete it from the raw sample data;
If the variance of the rating matrix corresponding to the piece of sample data is not 0, retain it in the raw sample data.
It should be noted that the purpose of rating matrix screening is to clean out sample data in which rating-matrix questions were left unanswered or were answered carelessly. If the number of answered rating-matrix questions in a piece of sample data differs from the total number of rating-matrix questions, questions have probably been left unanswered, so the sample data needs to be deleted. If the variance of the rating matrix corresponding to a piece of sample data is 0, all the answers to the rating-matrix questions in that sample data are identical, which may be the result of careless answering, so sample data whose rating-matrix variance is 0 needs to be deleted from the raw sample data.
Screening mechanism three:
According to an example embodiment of the present invention, when the at least one data screening mechanism includes answering time screening and the screening value includes the storage location of the answering time, the processing unit 706 is configured to:
For any piece of sample data in the raw sample data, obtain the answering time of that piece of sample data according to the storage location of the answering time;
Judge whether the answering time of that piece of sample data matches the standard answering time corresponding to it;
If the answering time of the piece of sample data does not match the standard answering time, delete it from the raw sample data;
If the answering time of the piece of sample data matches the standard answering time, retain it in the raw sample data.
It should be noted that the purpose of answering time screening is to clean out sample data whose answering time is too short or too long.
When determining the standard answering time corresponding to a piece of sample data, the processing unit 706 is configured to:
After the raw sample data is obtained, place the sample data with the same number of answered questions in the same group, to obtain at least one group of sample data;
For any group of sample data in the at least one group, calculate the average answering time of the group and the standard deviation of the answering time of the group;
Calculate the standard answering time corresponding to each piece of sample data in the group from the average answering time of the group, the standard deviation of the answering time of the group, and the answering time of each piece of sample data in the group.
According to an example embodiment of the present invention, the standard answering time corresponding to each piece of sample data in any group of sample data is calculated according to the following formula:
$$Z = \frac{x - \bar{x}}{\delta}$$
where $Z$ denotes the standard answering time corresponding to each piece of sample data, $x$ denotes the answering time of each piece of sample data, $\bar{x}$ denotes the average answering time of the group of sample data, and $\delta$ denotes the standard deviation of the answering time of the group of sample data.
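As a worked illustration of this standard score, with numbers that are purely hypothetical, suppose a group has an average answering time of 300 seconds and a standard deviation of 60 seconds. For a sample answered in 90 seconds and a sample answered in 240 seconds:

```latex
Z_{1} = \frac{x_{1} - \bar{x}}{\delta} = \frac{90 - 300}{60} = -3.5,
\qquad
Z_{2} = \frac{x_{2} - \bar{x}}{\delta} = \frac{240 - 300}{60} = -1
```

Under the rule that scores outside ±3 are deleted, the first sample would be removed as answered too quickly and the second would be retained.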
Screening mechanism four:
According to an example embodiment of the present invention, when the at least one data screening mechanism includes skip-logic screening and the screening value includes the start and end questions of the skip logic, the processing unit 706 is configured to:
For any piece of sample data in the raw sample data, calculate, according to the start and end questions of the skip logic, the sum of the answer data between the start and end questions of the skip logic in that piece of sample data;
If the answer data between the start and end questions of the skip logic in the piece of sample data is not 0, delete it from the raw sample data;
If the answer data between the start and end questions of the skip logic in the piece of sample data is 0, retain it in the raw sample data.
It should be noted that, if the skip logic was followed correctly, the answer data between the start and end questions of the skip logic should be 0. Therefore, if the answer data between the start and end questions of the skip logic in a piece of sample data is not 0, the skip logic of that sample data is abnormal and the sample data can be deleted from the raw sample data.
Screening mechanism five:
According to an example embodiment of the present invention, when the at least one data screening mechanism includes common-sense logic screening and the screening value includes the questions selected by the user and the logic standard value that has been set, the processing unit 706 is configured to:
For any piece of sample data in the raw sample data, judge whether the common-sense logic corresponding to the selected questions matches the logic standard value;
If the common-sense logic corresponding to the selected questions does not match the logic standard value, delete that piece of sample data from the raw sample data;
If the common-sense logic corresponding to the selected questions matches the logic standard value, retain that piece of sample data in the raw sample data.
According to an example embodiment of the present invention, in addition to the acquiring unit 702, the determination unit 704 and the processing unit 706 shown in Fig. 7, the data cleaning device may further include a backup unit for backing up the sample data deleted from the raw sample data.
According to an example embodiment of the present invention, in addition to the acquiring unit 702, the determination unit 704 and the processing unit 706 shown in Fig. 7, the data cleaning device may further include a statistics unit and a display unit.
Specifically, the statistics unit is used to count, after the processing unit 706 has finished cleaning the raw sample data, the answers given to each question in the resulting clean sample data; the display unit is used to generate a statistical chart from the per-question answers counted by the statistics unit and to display the statistical chart.
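The work of the statistics unit and the display unit can be approximated with a per-question frequency count and a plain-text bar chart, as sketched below. The function names and the text rendering are assumptions; the text does not describe the chart format.

```python
from collections import Counter
from typing import List, Optional, Sequence

Row = Sequence[Optional[int]]  # None = unanswered


def answer_frequencies(rows: List[Row], n_questions: int) -> List[Counter]:
    """Count how often each option was chosen for every question."""
    return [
        Counter(row[q] for row in rows if row[q] is not None)
        for q in range(n_questions)
    ]


def render_text_chart(freq: Counter) -> str:
    """Render one question's frequencies as a simple text bar chart."""
    return "\n".join(
        f"option {option}: {'#' * count} ({count})"
        for option, count in sorted(freq.items())
    )
```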
It should be noted that, although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by multiple modules or units.
From the description of the embodiments above, those skilled in the art will readily understand that the example embodiments described here can be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to embodiments of the present invention can be embodied in the form of a software product, which can be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, a touch terminal or a network device) to execute the method according to embodiments of the present invention.
Other embodiments of the present invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This application is intended to cover any variations, uses or adaptations of the present invention that follow its general principles and include common knowledge or conventional technical means in the art that are not disclosed here. The description and the embodiments are to be regarded as illustrative only, and the true scope and spirit of the invention are indicated by the following claims.
It should be understood that the invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (12)

1. A data cleaning method, characterized by comprising:
Obtaining raw sample data to be cleaned;
Determining at least one data screening mechanism for cleaning the raw sample data, and obtaining the screening value that a user sets for each data screening mechanism according to the raw sample data;
Screening the raw sample data according to the at least one data screening mechanism and the screening value set by the user, so as to clean the raw sample data.
2. The data cleaning method according to claim 1, characterized in that, when the at least one data screening mechanism includes a sample rejection mechanism and the screening value includes a target sample feature, the step of screening the raw sample data comprises:
Analyzing the raw sample data to obtain at least one sample feature in the raw sample data and the sample data corresponding to each sample feature;
Taking the sample data corresponding to the target sample feature as the screened sample data, and deleting the other sample data from the raw sample data.
3. The data cleaning method according to claim 1, characterized in that, when the at least one data screening mechanism includes rating matrix screening and the screening value includes the start and end positions of the rating-matrix questions, the step of screening the raw sample data comprises:
For any piece of sample data in the raw sample data, calculating the number of rating-matrix questions answered in that piece of sample data;
Judging whether the number of answered questions equals the total number of rating-matrix questions in that piece of sample data;
If the number of answered questions equals the total number, calculating the variance of the rating matrix corresponding to that piece of sample data, and determining according to that variance whether to delete the piece of sample data from the raw sample data;
If the number of answered questions does not equal the total number, deleting that piece of sample data from the raw sample data.
4. The data cleaning method according to claim 3, characterized in that the step of determining, according to the variance of the rating matrix corresponding to the piece of sample data, whether to delete it from the raw sample data comprises:
If the variance of the rating matrix corresponding to the piece of sample data is 0, deleting it from the raw sample data;
If the variance of the rating matrix corresponding to the piece of sample data is not 0, retaining it in the raw sample data.
5. The data cleaning method according to claim 1, characterized in that, when the at least one data screening mechanism includes answering time screening and the screening value includes the storage location of the answering time, the step of screening the raw sample data comprises:
For any piece of sample data in the raw sample data, obtaining the answering time of that piece of sample data according to the storage location of the answering time;
Judging whether the answering time of that piece of sample data matches the standard answering time corresponding to it;
If the answering time of the piece of sample data does not match the standard answering time, deleting it from the raw sample data;
If the answering time of the piece of sample data matches the standard answering time, retaining it in the raw sample data.
6. The data cleaning method according to claim 5, characterized by further comprising:
After the raw sample data is obtained, placing the sample data with the same number of answered questions in the same group, to obtain at least one group of sample data;
For any group of sample data in the at least one group, calculating the average answering time of the group and the standard deviation of the answering time of the group;
Calculating the standard answering time corresponding to each piece of sample data in the group from the average answering time of the group, the standard deviation of the answering time of the group, and the answering time of each piece of sample data in the group.
7. The data cleaning method according to claim 6, characterized in that the standard answering time corresponding to each piece of sample data in the group of sample data is calculated according to the following formula:
$$Z = \frac{x - \bar{x}}{\delta}$$
where $Z$ denotes the standard answering time corresponding to each piece of sample data, $x$ denotes the answering time of each piece of sample data, $\bar{x}$ denotes the average answering time of the group of sample data, and $\delta$ denotes the standard deviation of the answering time of the group of sample data.
8. The data cleaning method according to claim 1, characterized in that, when the at least one data screening mechanism includes skip-logic screening and the screening value includes the start and end questions of the skip logic, the step of screening the raw sample data comprises:
For any piece of sample data in the raw sample data, calculating, according to the start and end questions of the skip logic, the sum of the answer data between the start and end questions of the skip logic in that piece of sample data;
If the answer data between the start and end questions of the skip logic in the piece of sample data is not 0, deleting it from the raw sample data;
If the answer data between the start and end questions of the skip logic in the piece of sample data is 0, retaining it in the raw sample data.
9. The data cleaning method according to claim 1, characterized in that, when the at least one data screening mechanism includes common-sense logic screening and the screening value includes the questions selected by the user and the logic standard value that has been set, the step of screening the raw sample data comprises:
For any piece of sample data in the raw sample data, judging whether the common-sense logic corresponding to the selected questions matches the logic standard value;
If the common-sense logic corresponding to the selected questions does not match the logic standard value, deleting that piece of sample data from the raw sample data;
If the common-sense logic corresponding to the selected questions matches the logic standard value, retaining that piece of sample data in the raw sample data.
10. The data cleaning method according to any one of claims 2 to 9, characterized by further comprising:
Backing up the sample data deleted from the raw sample data.
11. The data cleaning method according to any one of claims 1 to 9, characterized by further comprising:
After cleaning of the raw sample data is complete, counting the answers given to each question in the resulting clean sample data;
Generating a statistical chart from the counted answers to each question in the clean sample data, and displaying the statistical chart.
12. A data cleaning device, characterized by comprising:
An acquiring unit for obtaining raw sample data to be cleaned;
A determination unit for determining at least one data screening mechanism for cleaning the raw sample data and the screening value that a user sets for each data screening mechanism;
A processing unit for screening the raw sample data according to the at least one data screening mechanism and the screening value set by the user, so as to clean the raw sample data.
CN201710011044.8A 2017-01-06 2017-01-06 Data cleaning method and data cleansing device Pending CN108280096A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710011044.8A CN108280096A (en) 2017-01-06 2017-01-06 Data cleaning method and data cleansing device

Publications (1)

Publication Number Publication Date
CN108280096A true CN108280096A (en) 2018-07-13

Family

ID=62801005

Country Status (1)

Country Link
CN (1) CN108280096A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101256654A (en) * 2007-02-28 2008-09-03 陈国飞 Method for automatically inviting and tracing answer sheet of on-line questionnaire investigation system
CN102122294A (en) * 2011-01-29 2011-07-13 安徽工业大学 Survey research platform and method for psychology of college student for course selection based on data mining
CN104699798A (en) * 2015-03-18 2015-06-10 腾讯科技(深圳)有限公司 Sample data processing method and device
CN106294492A (en) * 2015-06-08 2017-01-04 深圳中兴网信科技有限公司 Data cleaning method and cleaning engine

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
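王瑜: "Quality evaluation of sensitive-question data in computer-assisted questionnaire surveys", Statistics and Decision (《统计与决策》) *
聂风华 et al.: "Application of data screening in customer satisfaction evaluation", Modern Management Science (《现代管理科学》) *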

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109726199A (en) * 2018-12-28 2019-05-07 杭州铭智云教育科技有限公司 Data cleaning method
CN110046151A (en) * 2019-03-05 2019-07-23 努比亚技术有限公司 Data cleaning method, server and computer readable storage medium
CN110046151B (en) * 2019-03-05 2023-08-11 努比亚技术有限公司 Data cleaning method, server and computer readable storage medium
CN110716512A (en) * 2019-09-02 2020-01-21 华电电力科学研究院有限公司 Environmental protection equipment performance prediction method based on coal-fired power plant operation data
CN110910231A (en) * 2019-11-06 2020-03-24 上海百事通信息技术股份有限公司 Debt clearing and collecting management platform
CN111427873A (en) * 2020-03-12 2020-07-17 无码科技(杭州)有限公司 Data cleaning method and system
CN111427873B (en) * 2020-03-12 2023-03-14 无码科技(杭州)有限公司 Data cleaning method and system
CN115203180A (en) * 2022-05-16 2022-10-18 北京航空航天大学 Data blood relationship generation method

Similar Documents

Publication Publication Date Title
CN108280096A (en) Data cleaning method and data cleansing device
US20080306715A1 (en) Detecting Method Over Network Intrusion
CN111008640A (en) Image recognition model training and image recognition method, device, terminal and medium
CN110533654A (en) Method and device for detecting abnormality of components
TW201710991A (en) Analytics system and method
CN111709371A (en) Artificial intelligence based classification method, device, server and storage medium
JP2020071665A (en) Behavior recognition method, behavior recognition program, and behavior recognition device
CN108241853A (en) Video monitoring method, system and terminal device
CN110262919A (en) Abnormal data analysis method, device, equipment and computer readable storage medium
CN112101572A (en) Model optimization method, device, equipment and medium
CN112488716A (en) Abnormal event detection system
CN114862832A (en) Method, device and equipment for optimizing defect detection model and storage medium
CN110287767A (en) Attack-resistant liveness detection method, device, computer equipment and storage medium
CN111652259B (en) Method and system for cleaning data
CN107783890A (en) Software defect data processing method and device
CN111582722B (en) Risk identification method and device, electronic equipment and readable storage medium
CN112365269A (en) Risk detection method, apparatus, device and storage medium
CN110716778B (en) Application compatibility testing method, device and system
CN108476147A (en) Automated method for managing computing system
CN110415779A (en) Insulation validation checking method, apparatus, equipment and storage medium
CN106294173A (en) Data processing method, device and server
CN110309737A (en) Information processing method, apparatus and system applied to cigarette sales counter
CN110460711A (en) Method and device for displaying pictures
CN112906805A (en) Image training sample screening and task model training method and device and electronic equipment
CN114971110A (en) Method for determining root combination, related device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180713
