CN109697537A - The method and apparatus of data audit - Google Patents

Info

Publication number
CN109697537A
Authority
CN
China
Prior art keywords
sample
labeler
task
mark
mark task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710985704.2A
Other languages
Chinese (zh)
Inventor
刘愉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710985704.2A
Publication of CN109697537A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06311 Scheduling, planning or task assignment for a person or group
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management

Abstract

The invention discloses a method and apparatus for data auditing and relates to the field of computer technology. One specific embodiment of the method includes: distributing labeling tasks to labelers according to a predetermined rule, where the labeling tasks include sample labeling tasks and every labeler is assigned sample labeling tasks; after the labelers complete their assigned labeling tasks, obtaining the annotation results of the labeling tasks; for each labeler, obtaining the sample annotation results contained in that labeler's annotation results, auditing them, and then calculating the labeler's error rate; and, if the error rate does not exceed a preset threshold, passing the audit of that labeler's annotation results. This embodiment can greatly reduce the cost of data auditing, improve return on investment, and ensure the audit quality of the labeled data.

Description

The method and apparatus of data audit
Technical field
The present invention relates to the field of computer technology, and in particular to a method and apparatus for data auditing.
Background art
In today's era of rapidly developing artificial intelligence, technologies such as speech recognition, image recognition, natural language processing, and video analysis have become the core competitiveness of artificial intelligence (AI) companies. At present, when most companies perform artificial intelligence analysis, the usual practice is to train algorithm models on labeled data and to optimize the models by continually improving the algorithms, so as to better simulate human consciousness and the information processes of thinking. When labeled data is used for model training, the higher its quality and the greater its quantity, the better the algorithm model can be trained. The quality of labeled data therefore needs to be strictly controlled.
At present, the data to be labeled for artificial intelligence analysis includes, for example, sentences, images, video, and audio. Common labeling methods are as follows: for sentences, labeling mainly consists of word segmentation, classification, and marking words and phrases that carry a particular meaning; for images, labeling mainly marks the main subject of the picture and content such as recognized text; for audio and video, labeling mainly marks content such as meaningful frame information.
To ensure high-quality labeled data, in addition to giving labelers pre-established labeling guidelines and sample images for reference, the labelers' labeled data also needs to be audited.
Existing approaches to auditing labeled data rely entirely on manual work. Either every item of labeled data is audited manually one by one, or the data is audited manually by sampling: a sampling rate is set, the labeled data is sampled at that rate, and the sampled data is audited manually.
When labeled data is audited: for sentences, the word segmentation data must be checked item by item against the labeling guidelines to see whether the segmentation is correct; for images, all subject labels must be checked against the labeling guidelines for missed labels, wrong labels, too few labels, and the like; for audio, the audio data must be listened to item by item to check whether the output text is correct.
In the course of realizing the present invention, the inventor found at least the following problems in the prior art:
1. Auditing labeled data manually one by one is very inefficient, has a very high time cost, and consumes considerable human resources;
2. Auditing labeled data by manual sampling cannot guarantee that the labeled data of every labeler is sampled, and cannot guarantee the audit quality of the labeled data.
Summary of the invention
In view of this, embodiments of the present invention provide a method and apparatus for data auditing that can greatly reduce the cost of data auditing, improve return on investment, and ensure the audit quality of labeled data.
To achieve the above object, according to one aspect of the embodiments of the present invention, a method for data auditing is provided.
A method for data auditing, comprising: distributing labeling tasks to labelers according to a predetermined rule, wherein the labeling tasks include sample labeling tasks and each labeler is assigned the sample labeling tasks; after the labelers complete the assigned labeling tasks, obtaining the annotation results of the labeling tasks; for each labeler, obtaining the sample annotation results included in the labeler's annotation results, auditing the sample annotation results, and then calculating the error rate of the labeler; and, if the error rate does not exceed a preset threshold, passing the audit of the labeler's annotation results.
Optionally, the labeling tasks further include non-sample labeling tasks, and the step of distributing labeling tasks to labelers according to a predetermined rule comprises: for sample labeling tasks, if the number S1 of sample labeling tasks is less than the number R of labelers, first distributing the S1 sample labeling tasks to S1 of the R labelers, and then, for each of the remaining (R-S1) labelers, selecting one sample labeling task from the S1 sample labeling tasks and assigning it; if S1 is not less than R, dividing S1 by R to obtain quotient M1 and remainder N1, first assigning M1 sample labeling tasks to each labeler, and then distributing the remaining N1 sample labeling tasks to N1 of the R labelers; for non-sample labeling tasks, if the number S2 of non-sample labeling tasks is less than R, distributing the S2 non-sample labeling tasks to S2 of the R labelers; if S2 is not less than R, dividing S2 by R to obtain quotient M2 and remainder N2, first assigning M2 non-sample labeling tasks to each labeler, and then distributing the remaining N2 non-sample labeling tasks to N2 of the R labelers.
Optionally, the step of auditing the sample annotation results comprises: comparing preset sample labeled data with the sample annotation results so as to audit the sample annotation results; wherein the preset sample labeled data is generated by: distributing the sample labeling tasks to sample labelers; after the sample labelers complete the sample labeling tasks, obtaining the sample annotation results of the sample labeling tasks; and, when the sample annotation results pass the audit, determining the sample annotation results as the sample labeled data.
Optionally, the step of distributing the sample labeling tasks to sample labelers comprises: if the number S1 of sample labeling tasks is less than the number R0 of sample labelers, distributing the S1 sample labeling tasks to S1 of the R0 sample labelers; if S1 is not less than R0, dividing S1 by R0 to obtain quotient M0 and remainder N0, first assigning M0 sample labeling tasks to each sample labeler, and then distributing the remaining N0 sample labeling tasks to N0 of the R0 sample labelers.
According to another aspect of the embodiments of the present invention, an apparatus for data auditing is provided.
An apparatus for data auditing, comprising: a task allocation module for distributing labeling tasks to labelers according to a predetermined rule, wherein the labeling tasks include sample labeling tasks and each labeler is assigned the sample labeling tasks; a result acquisition module for obtaining the annotation results of the labeling tasks after the labelers complete the assigned labeling tasks; an audit statistics module for obtaining, for each labeler, the sample annotation results included in the labeler's annotation results, auditing the sample annotation results, and then calculating the error rate of the labeler; and a result determination module for passing the audit of the labeler's annotation results if the error rate does not exceed a preset threshold.
Optionally, the labeling tasks further include non-sample labeling tasks, and the task allocation module is further configured to: for sample labeling tasks, if the number S1 of sample labeling tasks is less than the number R of labelers, first distribute the S1 sample labeling tasks to S1 of the R labelers, and then, for each of the remaining (R-S1) labelers, select one sample labeling task from the S1 sample labeling tasks and assign it; if S1 is not less than R, divide S1 by R to obtain quotient M1 and remainder N1, first assign M1 sample labeling tasks to each labeler, and then distribute the remaining N1 sample labeling tasks to N1 of the R labelers; for non-sample labeling tasks, if the number S2 of non-sample labeling tasks is less than R, distribute the S2 non-sample labeling tasks to S2 of the R labelers; if S2 is not less than R, divide S2 by R to obtain quotient M2 and remainder N2, first assign M2 non-sample labeling tasks to each labeler, and then distribute the remaining N2 non-sample labeling tasks to N2 of the R labelers.
Optionally, the audit statistics module is further configured to compare preset sample labeled data with the sample annotation results so as to audit the sample annotation results. The apparatus further comprises a sample data determination module for generating the preset sample labeled data, the generation process comprising: distributing the sample labeling tasks to sample labelers; after the sample labelers complete the sample labeling tasks, obtaining the sample annotation results of the sample labeling tasks; and, when the sample annotation results pass the audit, determining the sample annotation results as the sample labeled data.
Optionally, the sample data determination module is further configured to: if the number S1 of sample labeling tasks is less than the number R0 of sample labelers, distribute the S1 sample labeling tasks to S1 of the R0 sample labelers; if S1 is not less than R0, divide S1 by R0 to obtain quotient M0 and remainder N0, first assign M0 sample labeling tasks to each sample labeler, and then distribute the remaining N0 sample labeling tasks to N0 of the R0 sample labelers.
According to yet another aspect of the embodiments of the present invention, an electronic device for data auditing is provided.
An electronic device for data auditing, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data auditing method provided by the embodiments of the present invention.
According to still another aspect of the embodiments of the present invention, a computer-readable medium is provided.
A computer-readable medium storing a computer program which, when executed by a processor, implements the data auditing method provided by the embodiments of the present invention.
An embodiment of the above invention has the following advantages or beneficial effects: sample labeled data is used to audit the sample annotation results contained in each labeler's annotation results, and the labeler's error rate is then calculated; when the error rate meets the requirement, all of that labeler's annotation results are deemed to pass the audit. This realizes a data auditing method based on sample labeled data in which every labeler's annotation results are subject to a sampled audit, ensuring the audit quality of the labeled data while greatly reducing the cost of data auditing and improving return on investment.
Further effects of the above non-conventional optional implementations are described below in conjunction with specific embodiments.
Brief description of the drawings
The drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of the main flow of the data auditing method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the implementation process of the data auditing method of an embodiment of the present invention;
Fig. 3 is a schematic diagram of the main modules of the data auditing apparatus according to an embodiment of the present invention;
Fig. 4 is a diagram of an exemplary system architecture to which embodiments of the present invention can be applied;
Fig. 5 is a schematic structural diagram of a computer system suitable for implementing a terminal device or server of an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Those of ordinary skill in the art should therefore recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present invention. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.
Supervised machine learning usually requires a large amount of labeled data to train algorithm models, and the quality of the labeled data directly affects the quality of the model, so reviewing labeled data has become extremely important. However, existing methods of auditing labeled data ensure data quality at the cost of low efficiency and heavy labor costs. To solve this problem, the present invention provides a method for automatically auditing labeled data by machine based on sample data, which can meet the quality requirements for audit results while greatly improving audit efficiency and saving human resources. In the present invention, labeling tasks may be completed by people or by program code; that is, a labeler or sample labeler in the present invention may be a person or may be implemented as a piece of program code.
Fig. 1 is a schematic diagram of the main flow of the data auditing method according to an embodiment of the present invention. As shown in Fig. 1, the data auditing method of the embodiment mainly includes the following steps S101 to S104.
Step S101: distribute labeling tasks to labelers according to a predetermined rule, where the labeling tasks include sample labeling tasks and each labeler is assigned sample labeling tasks;
Step S102: after the labelers complete their assigned labeling tasks, obtain the annotation results of the labeling tasks;
Step S103: for each labeler, obtain the sample annotation results included in the labeler's annotation results, audit the sample annotation results, and then calculate the error rate of the labeler;
Step S104: if the error rate does not exceed a preset threshold, the labeler's annotation results pass the audit.
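Steps S103 and S104 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the function names, the dictionary layout, and the use of plain equality to decide whether a sample result is correct are all hypothetical.

```python
from typing import Dict, Hashable


def error_rate(sample_results: Dict[Hashable, object],
               reference: Dict[Hashable, object]) -> float:
    """Fraction of a labeler's sample tasks whose result differs from the reference.

    Both arguments map a sample task id to an annotation; this layout is an
    assumption made for the sketch.
    """
    if not sample_results:
        # Without sample tasks the error rate cannot be computed, which is why
        # step S101 gives every labeler sample tasks.
        raise ValueError("labeler has no sample tasks")
    wrong = sum(1 for task_id, answer in sample_results.items()
                if reference[task_id] != answer)
    return wrong / len(sample_results)


def audit_passes(sample_results: Dict[Hashable, object],
                 reference: Dict[Hashable, object],
                 threshold: float) -> bool:
    """Step S104: pass when the error rate does not exceed the threshold."""
    return error_rate(sample_results, reference) <= threshold
```

With four sample tasks and one wrong answer, the error rate is 0.25, so the labeler passes at a threshold of 0.25 and fails at 0.2.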
According to the technical solution of the embodiment, the labeling tasks may also include non-sample labeling tasks. In that case, when step S101 distributes labeling tasks to labelers according to the predetermined rule, it uses a different allocation method depending on whether a task is a sample labeling task or a non-sample labeling task.
For sample labeling tasks: if the number S1 of sample labeling tasks is less than the number R of labelers, the S1 sample labeling tasks are first distributed to S1 of the R labelers, and then, for each of the remaining (R-S1) labelers, one sample labeling task is selected from the S1 sample labeling tasks and assigned. If S1 is not less than R, S1 is divided by R to obtain quotient M1 and remainder N1; each labeler is first assigned M1 sample labeling tasks, and the remaining N1 sample labeling tasks are then distributed to N1 of the R labelers. When sample labeling tasks are distributed to labelers in this step, they can be allocated according to a pre-established allocation rule, for example random allocation, sequential allocation, or allocation at equal intervals; the rule is chosen according to the application.
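The two cases for sample tasks can be sketched as below. The source leaves the choice of allocation rule open (random, sequential, or equal intervals); this sketch assumes sequential allocation, and the function name and the use of string identifiers are hypothetical.

```python
from typing import Dict, List, Sequence


def assign_sample_tasks(tasks: Sequence[str],
                        labelers: Sequence[str]) -> Dict[str, List[str]]:
    """Give every labeler at least one sample task (sequential rule assumed)."""
    s1, r = len(tasks), len(labelers)
    if s1 == 0 or r == 0:
        raise ValueError("need at least one sample task and one labeler")
    assignment: Dict[str, List[str]] = {name: [] for name in labelers}
    if s1 < r:
        # The first S1 labelers each receive a distinct task; the remaining
        # R - S1 labelers each receive one task reused from the same S1 tasks.
        for i, name in enumerate(labelers):
            assignment[name].append(tasks[i % s1])
    else:
        m1, n1 = divmod(s1, r)  # quotient M1 and remainder N1
        for j, name in enumerate(labelers):
            assignment[name].extend(tasks[j * m1:(j + 1) * m1])
        # The N1 leftover tasks go to the first N1 labelers, one each.
        for name, task in zip(labelers, tasks[r * m1:]):
            assignment[name].append(task)
    return assignment
```

For example, seven tasks and three labelers give M1 = 2 and N1 = 1: each labeler receives two tasks and the first labeler receives the one left over.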
For non-sample labeling tasks: if the number S2 of non-sample labeling tasks is less than the number R of labelers, the S2 non-sample labeling tasks are distributed to S2 of the R labelers. If S2 is not less than R, S2 is divided by R to obtain quotient M2 and remainder N2; each labeler is first assigned M2 non-sample labeling tasks, and the remaining N2 non-sample labeling tasks are then distributed to N2 of the R labelers. When non-sample labeling tasks are distributed to labelers in this step, they can be allocated according to a pre-established allocation rule, for example random allocation, sequential allocation, or allocation at equal intervals; the rule is chosen according to the application.
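For non-sample tasks, both cases reduce to one integer division: when S2 < R the quotient M2 is 0 and the remainder N2 equals S2, so only the "remaining N2 tasks" step does any work. The sketch below relies on that observation; the sequential rule and the names used are assumptions, not taken from the source.

```python
from itertools import islice
from typing import Dict, List, Sequence


def assign_plain_tasks(tasks: Sequence[str],
                       labelers: Sequence[str]) -> Dict[str, List[str]]:
    """Distribute non-sample tasks; some labelers may receive none."""
    r = len(labelers)
    if r == 0:
        raise ValueError("need at least one labeler")
    m2, n2 = divmod(len(tasks), r)  # quotient M2 and remainder N2
    assignment: Dict[str, List[str]] = {name: [] for name in labelers}
    remaining = iter(tasks)
    for name in labelers:
        # Each labeler first receives M2 tasks (none when S2 < R).
        assignment[name].extend(islice(remaining, m2))
    for name in labelers[:n2]:
        # The N2 leftover tasks go to the first N2 labelers, one each.
        assignment[name].append(next(remaining))
    return assignment
```

Unlike the sample case, no task is reused, so with two tasks and three labelers the third labeler receives nothing.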
In addition, according to one embodiment of the present invention, auditing the sample annotation results in step S103 may specifically consist of comparing preset sample labeled data with the sample annotation results. The preset sample labeled data may be generated mainly by the following steps:
Distribute the sample labeling tasks to sample labelers;
After the sample labelers complete the sample labeling tasks, obtain the sample annotation results of the sample labeling tasks;
When the sample annotation results pass the audit, determine the sample annotation results as the sample labeled data.
According to the technical solution of the embodiment, the preset sample labeled data may be determined before the labeling tasks are distributed to labelers for labeling, or it may be determined after the labelers complete the labeling tasks and before the sample annotation results in the annotation results are audited. Either way, the sample annotation results can then be audited by comparing the preset sample labeled data with the sample annotation results in the annotation results.
When the sample labeling tasks are distributed to sample labelers, the following two cases are handled separately:
If the number S1 of sample labeling tasks is less than the number R0 of sample labelers, the S1 sample labeling tasks are distributed to S1 of the R0 sample labelers;
If the number S1 of sample labeling tasks is not less than the number R0 of sample labelers, S1 is divided by R0 to obtain quotient M0 and remainder N0; each sample labeler is first assigned M0 sample labeling tasks, and the remaining N0 sample labeling tasks are then distributed to N0 of the R0 sample labelers.
When sample labeling tasks are distributed to sample labelers in the above step, they can be allocated according to a pre-established allocation rule, for example random allocation, sequential allocation, or allocation at equal intervals; the rule is chosen according to the application.
Through steps S101 to S104 above, sample labeled data is used to audit the sample annotation results contained in each labeler's annotation results, and the labeler's error rate is then calculated; when the error rate meets the requirement, all of that labeler's annotation results are deemed to pass the audit. This realizes a data auditing method based on sample labeled data in which every labeler's annotation results are subject to a sampled audit, ensuring the audit quality of the labeled data while greatly reducing the cost of data auditing and improving return on investment.
Fig. 2 is a schematic diagram of the implementation process of the data auditing method of an embodiment of the present invention. In this embodiment, the labelers are people, and the sample labeled data is determined before the labeling tasks are distributed to labelers for labeling. As shown in Fig. 2, the implementation process of this embodiment mainly includes the following steps.
Step S201: create the labeling tasks: receive a number of labeling tasks uploaded by the task manager, comprising S1 sample labeling tasks and S2 non-sample labeling tasks, and save the parameters set by the task manager, such as the number R0 of sample labelers, the number R of labelers, and the fault tolerance rate m% (the preset error threshold), to complete creation of the labeling tasks. After the labeling tasks are created, the task status may be, for example, "sample labeling in progress".
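The parameters saved in step S201 could be held in a small record like the one below. The class name, field names, status string, and validation rules are all hypothetical; the source only lists which parameters exist (S1 and S2 tasks, R0, R, and m%).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LabelingJob:
    sample_tasks: List[str]       # the S1 sample tasks
    plain_tasks: List[str]        # the S2 non-sample tasks
    sample_labeler_count: int     # R0
    labeler_count: int            # R
    tolerance: float              # m% expressed as a fraction, e.g. 0.05
    status: str = "sample_labeling"  # assumed name for the initial status

    def __post_init__(self) -> None:
        # Basic sanity checks; these rules are assumptions for the sketch.
        if not self.sample_tasks:
            raise ValueError("at least one sample task is required")
        if self.sample_labeler_count < 1 or self.labeler_count < 1:
            raise ValueError("R0 and R must be positive")
        if not 0.0 <= self.tolerance <= 1.0:
            raise ValueError("tolerance must be between 0 and 1")
```

Storing m% as a fraction keeps it directly comparable with an error rate computed as wrong answers divided by total sample tasks.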
Step S202: distribute the sample labeling tasks: if the number S1 of sample labeling tasks is less than the number R0 of sample labelers, the S1 sample labeling tasks only need to be distributed to S1 of the R0 people, one sample labeling task each; the remaining (R0-S1) people are not assigned a sample labeling task and do not take part in labeling the sample tasks. If S1 ≥ R0, S1 is divided by R0 to obtain quotient M0 and remainder N0; everyone is first assigned M0 sample labeling tasks, and the remaining N0 sample labeling tasks are then distributed round-robin to N0 of the R0 sample labelers, completing the distribution. This ensures, as far as possible, that everyone is assigned an approximately equal number of sample labeling tasks.
When sample labeling tasks are distributed to sample labelers in this step, they can be allocated according to a pre-established allocation rule, for example random allocation, sequential allocation, or allocation at equal intervals; the rule is chosen according to the application.
Step S203: determine the sample labeled data: after all sample labelers finish labeling, the sample annotation results are submitted to the task manager, who audits them against rule documents such as the labeling guidelines. During the audit, if a sample annotation result has missed labels, too few labels, or wrong labels (including an incorrect labeled region or an incorrect label value), the corresponding sample labeling task is rejected directly back to the person who labeled it, to be labeled again. Once the sample annotation results of all sample labelers satisfy the audit rules, that is, once all sample annotation results pass the audit, the results that passed are determined as the sample labeled data.
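The reject-and-relabel loop of step S203 can be sketched as follows. In the source, labeling and review are done by people; here they are stood in for by two hypothetical callbacks, `label` and `review`, and the `max_rounds` limit is an assumption added so the sketch always terminates.

```python
from typing import Callable, Dict, Iterable


def build_reference(tasks: Iterable[str],
                    label: Callable[[str], object],
                    review: Callable[[str, object], bool],
                    max_rounds: int = 10) -> Dict[str, object]:
    """Relabel rejected sample tasks until every result passes review."""
    reference: Dict[str, object] = {}
    pending = list(tasks)
    for _ in range(max_rounds):
        if not pending:
            break
        rejected = []
        for task in pending:
            result = label(task)
            if review(task, result):
                reference[task] = result   # accepted as sample labeled data
            else:
                rejected.append(task)      # sent back for relabeling
        pending = rejected
    if pending:
        raise RuntimeError(f"tasks still failing review: {pending}")
    return reference
```

Only results that pass review enter the reference set, so the later comparison in the audit step is always made against approved data.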
Step S204: distribute the labeling tasks: once the sample labeled data has been determined, the labeling tasks created in step S201 can be distributed to R labelers for labeling. The labeling tasks include S1 sample labeling tasks and S2 non-sample labeling tasks (that is, ordinary labeling tasks); at this point, the S1 sample labeling tasks, stripped of any labeled content, are distributed to the R labelers together with the S2 non-sample labeling tasks. The R labelers need not have any particular relationship to the sample labelers: they may be R labelers different from the R0 sample labelers, or they may include all or some of the R0 sample labelers, chosen according to the needs of the application. In general, because the volume of labeling tasks is very large and the sample labeling tasks selected are only a small part of it, R is usually greater than R0.
According to the technical solution of the embodiment, to ensure that every labeler is assigned sample labeling tasks, so that the labeler's error rate can later be calculated from the sample annotation results, the S1 sample labeling tasks and the S2 non-sample labeling tasks need to be distributed separately.
Sample labeling tasks are distributed as follows. If the number S1 of sample labeling tasks is less than the number R of labelers, the S1 sample labeling tasks are first distributed to S1 of the R labelers, so that each of those S1 labelers is first assigned one sample labeling task; then, for each of the remaining (R-S1) labelers, one sample labeling task is selected from the S1 sample labeling tasks and assigned. If S1 ≥ R, S1 is divided by R to obtain quotient M1 and remainder N1; each labeler is first assigned M1 sample labeling tasks, and the remaining N1 sample labeling tasks are then distributed to N1 of the R labelers, until all S1 sample labeling tasks have been assigned. It follows from this process that every labeler is assigned at least one sample labeling task.
When sample labeling tasks are distributed to labelers in this step, they can be allocated according to a pre-established allocation rule, for example random allocation, sequential allocation, or allocation at equal intervals; the rule is chosen according to the application.
Non-sample labeling tasks are distributed as follows. If the number S2 of non-sample labeling tasks is less than the number R of labelers, the S2 non-sample labeling tasks only need to be distributed to S2 of the R labelers, one non-sample labeling task each; the remaining (R-S2) labelers are not assigned a non-sample labeling task. If S2 ≥ R, S2 is divided by R to obtain quotient M2 and remainder N2; each labeler is first assigned M2 non-sample labeling tasks, and the remaining N2 non-sample labeling tasks are then distributed round-robin to N2 of the R labelers, until all S2 non-sample labeling tasks have been assigned.
When non-sample labeling tasks are distributed to labelers in this step, they can be allocated according to a pre-established allocation rule, for example random allocation, sequential allocation, or allocation at equal intervals; the rule is chosen according to the application.
Step S205: audit the annotation results: after all labelers finish labeling, the annotation results are submitted to the audit system, which audits them automatically according to predetermined audit rules. The audit rule of the present invention is an automatic audit algorithm based on sample labeled data. To audit a labeler's annotation results, the sample annotation results are first extracted from the annotation results; they are then audited by comparing the preset sample labeled data with the extracted sample annotation results, and the audit result for the sample annotation results is taken as the audit result for that labeler's annotation results. The automatic audit algorithm needs to be set according to the type of labeled data. For example, if the labeled data is text, the text is segmented into words and the annotation results are audited based on the content and position of each segment; if the labeled data is a picture, the annotation results are audited based on the position of each label on the picture, its label value, and so on; if the labeled data is an audio file, the audio file is converted into a text file and the annotation results are audited based on the content of the text file.
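A type-specific comparison of this kind might look like the sketch below. The source only says that text is compared by segment content and position and pictures by label position and value; the tuple layouts, the exact-match rule for text, and the intersection-over-union (IoU) tolerance for boxes are assumptions made for illustration. Audio is omitted because it depends on a speech-to-text step the source does not specify.

```python
from typing import List, Tuple

Segment = Tuple[int, str]                      # (start offset, token)
Box = Tuple[str, float, float, float, float]   # (label, x1, y1, x2, y2)


def text_matches(result: List[Segment], reference: List[Segment]) -> bool:
    """Text passes when every token and its position match the reference."""
    return sorted(result) == sorted(reference)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    ih = max(0.0, min(a[4], b[4]) - max(a[2], b[2]))
    inter = iw * ih
    union = ((a[3] - a[1]) * (a[4] - a[2])
             + (b[3] - b[1]) * (b[4] - b[2]) - inter)
    return inter / union if union > 0 else 0.0


def image_matches(result: List[Box], reference: List[Box],
                  min_iou: float = 0.5) -> bool:
    """Image passes when each reference box has a same-label, overlapping box."""
    if len(result) != len(reference):
        return False  # a label is missing or an extra one was added
    unmatched = list(result)
    for ref in reference:
        hit = next((box for box in unmatched
                    if box[0] == ref[0] and iou(box, ref) >= min_iou), None)
        if hit is None:
            return False  # wrong label value or wrong region
        unmatched.remove(hit)
    return True
```

The greedy matching in `image_matches` is adequate for a sketch; a stricter implementation might use optimal assignment when boxes overlap heavily.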
Step S206: it calculates error rate: according to the auditing result of sample annotation results, determining the sample mark of each labeler Error rate is infused, and using the corresponding error rate of the sample marking error rate annotation results all as the labeler.
In general, the number of labeling tasks is very large while, to save resources, the number of labelers is limited, so each labeler is assigned many labeling tasks. When sample labeling tasks are selected from the labeling tasks, factors such as the number of labelers and sample labelers and the required precision of the error rate may also be taken into account in order to determine a suitable number of sample labeling tasks.
Step S207: determine the data that passes the audit. Based on each labeler's error rate, the annotation results submitted by labelers whose error rate is less than or equal to the fault-tolerance rate are determined to be data that passes the audit; the labeling tasks whose annotation results do not pass are automatically returned to the corresponding labelers to be labeled again.
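The decision in step S207 amounts to a threshold comparison per labeler. The sketch below assumes error rates are already available as a mapping from labeler name to rate; the name `partition_by_tolerance` is hypothetical.

```python
from typing import Dict, List, Tuple


def partition_by_tolerance(
    error_rates: Dict[str, float], tolerance: float
) -> Tuple[List[str], List[str]]:
    """Split labelers into (passed, rejected) by comparing error rate to the tolerance."""
    passed: List[str] = []
    rejected: List[str] = []
    for labeler, rate in error_rates.items():
        # A rate equal to the tolerance still passes (error rate <= tolerance).
        (passed if rate <= tolerance else rejected).append(labeler)
    return passed, rejected
```

The tasks of labelers in the `rejected` list would be the ones sent back for relabeling.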
In special circumstances, labeled data that the system has passed or rejected can be manually forced to pass or be rejected. For example, if a labeler was assigned only one sample labeling task, the error rate computed from it may well exceed the fault-tolerance threshold; if that labeler nevertheless has many non-sample labeling tasks whose error rate is far below the threshold, the labeler's annotation results can be manually forced through the audit.
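One way to picture the manual override is as an optional flag that, when set, takes precedence over the automatic comparison. This is only an illustrative sketch; the function `final_audit_decision` and its `manual_override` parameter are hypothetical and not part of the described system.

```python
from typing import Optional


def final_audit_decision(
    error_rate: float, tolerance: float, manual_override: Optional[bool] = None
) -> bool:
    """Return True if the labeler's results pass; a manual decision wins if given."""
    if manual_override is not None:
        return manual_override
    return error_rate <= tolerance
```

A labeler whose single sample task was wrong has an error rate of 1.0 and fails automatically, but `final_audit_decision(1.0, 0.1, manual_override=True)` lets a reviewer force the results through.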
Finally, once all labeling tasks have been labeled and have passed the audit, the corresponding annotation results can be used to train an algorithm model.
Fig. 3 is a schematic diagram of the main modules of a data audit apparatus according to an embodiment of the present invention. As shown in Fig. 3, the data audit apparatus 300 of the present invention mainly includes a task allocation module 301, a result acquisition module 302, an audit statistics module 303, and a result determination module 304.
The task allocation module 301 is configured to distribute labeling tasks to labelers according to a predetermined rule, where the labeling tasks include sample labeling tasks and every labeler is assigned sample labeling tasks;
The result acquisition module 302 is configured to obtain the annotation results of the labeling tasks after the labelers complete the tasks assigned to them;
The audit statistics module 303 is configured, for each labeler, to obtain the sample annotation results contained in that labeler's annotation results, audit the sample annotation results, and then calculate the labeler's error rate;
The result determination module 304 is configured to determine that a labeler's annotation results pass the audit if the error rate does not exceed a preset threshold.
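The flow through modules 302 to 304 can be sketched end to end as a single function. This is a simplified illustration under several assumptions: annotation results are dictionaries keyed by task id, comparison is exact equality, the error rate is the failed fraction of sample tasks, and the name `audit_labelers` is hypothetical. Task allocation (module 301) is taken as already done.

```python
from typing import Any, Dict, Hashable

Results = Dict[Hashable, Any]  # task id -> submitted annotation


def audit_labelers(
    results_by_labeler: Dict[str, Results],
    reference: Results,
    threshold: float,
) -> Dict[str, bool]:
    """For each labeler, audit the sample results and decide pass/fail."""
    decisions: Dict[str, bool] = {}
    for labeler, results in results_by_labeler.items():
        # Pick out the sample tasks contained in this labeler's results.
        sample_ids = [task_id for task_id in results if task_id in reference]
        if not sample_ids:
            raise ValueError(f"labeler {labeler!r} has no sample tasks")
        # Compare against the preset sample labeled data and compute the error rate.
        errors = sum(results[task_id] != reference[task_id] for task_id in sample_ids)
        # All of the labeler's results pass if the rate does not exceed the threshold.
        decisions[labeler] = errors / len(sample_ids) <= threshold
    return decisions
```

The guard against labelers without sample tasks reflects the requirement that every labeler be assigned sample labeling tasks; without them no error rate can be computed.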
According to an embodiment of the present invention, the labeling tasks further include non-sample labeling tasks, and the task allocation module 301 may be further configured as follows:
For sample labeling tasks: if the number S1 of sample labeling tasks is less than the number R of labelers, the S1 sample labeling tasks are first distributed to S1 of the R labelers, and then, for each of the remaining (R-S1) labelers, one sample labeling task is selected from the S1 sample labeling tasks and assigned to that labeler; if S1 is not less than R, S1 is divided by R to obtain the quotient M1 and the remainder N1, each labeler is first assigned M1 sample labeling tasks, and the remaining N1 sample labeling tasks are then distributed to N1 of the R labelers;
For non-sample labeling tasks: if the number S2 of non-sample labeling tasks is less than the number R of labelers, the S2 non-sample labeling tasks are distributed to S2 of the R labelers; if S2 is not less than R, S2 is divided by R to obtain the quotient M2 and the remainder N2, each labeler is first assigned M2 non-sample labeling tasks, and the remaining N2 non-sample labeling tasks are then distributed to N2 of the R labelers.
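The sample-task rule differs from the non-sample rule in one respect: when there are fewer sample tasks than labelers, tasks are reused so that every labeler still receives one. The sketch below is an assumption-laden illustration: the name `allocate_sample_tasks` is hypothetical, and cycling through the sample tasks in order is just one possible way of "selecting one sample task" for each leftover labeler.

```python
from typing import Dict, Hashable, List, Sequence


def allocate_sample_tasks(
    tasks: Sequence[Hashable], labelers: Sequence[str]
) -> Dict[str, List[Hashable]]:
    """Give every labeler at least one of the S1 sample tasks."""
    if not tasks or not labelers:
        raise ValueError("need at least one sample task and one labeler")
    s1, r = len(tasks), len(labelers)
    assignment: Dict[str, List[Hashable]] = {name: [] for name in labelers}
    if s1 < r:
        # The first S1 labelers get distinct tasks; each of the remaining
        # R - S1 labelers gets one task picked from the same S1 tasks
        # (picked cyclically here, which is an assumption).
        for i, name in enumerate(labelers):
            assignment[name].append(tasks[i % s1])
        return assignment
    m1, n1 = divmod(s1, r)  # quotient M1 and remainder N1
    for i, name in enumerate(labelers):
        assignment[name].extend(tasks[i * m1:(i + 1) * m1])
    # The remaining N1 tasks go to N1 of the labelers.
    for name, task in zip(labelers, tasks[r * m1:]):
        assignment[name].append(task)
    return assignment
```

Reusing sample tasks in the S1 < R case is what guarantees that an error rate can be computed for every labeler.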
According to the technical solution of an embodiment of the present invention, the audit statistics module 303 may be further configured to:
audit the sample annotation results by comparing preset sample labeled data with the sample annotation results;
The data audit apparatus 300 of the present invention may further include a sample data determination module (not shown in the figure) for generating the preset sample labeled data. The process of generating the preset sample labeled data includes:
distributing the sample labeling tasks to sample labelers;
obtaining the sample annotation results of the sample labeling tasks after the sample labelers complete them;
determining the sample annotation results to be sample labeled data when the sample annotation results pass the audit.
The sample data determination module may be further configured as follows:
if the number S1 of sample labeling tasks is less than the number R0 of sample labelers, the S1 sample labeling tasks are distributed to S1 of the R0 sample labelers;
if S1 is not less than R0, S1 is divided by R0 to obtain the quotient M0 and the remainder N0, each sample labeler is first assigned M0 sample labeling tasks, and the remaining N0 sample labeling tasks are then distributed to N0 of the R0 sample labelers.
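The last step of generating the preset sample labeled data, keeping only the sample annotation results that passed their audit, can be sketched as a simple filter. How that audit is carried out is not described in this passage, so the set of approved task ids is taken as a given input here; the name `build_reference_data` is hypothetical.

```python
from typing import Any, Dict, Hashable, Set


def build_reference_data(
    sample_results: Dict[Hashable, Any], approved_task_ids: Set[Hashable]
) -> Dict[Hashable, Any]:
    """Keep the sample labelers' results whose audit passed as sample labeled data."""
    return {
        task_id: result
        for task_id, result in sample_results.items()
        if task_id in approved_task_ids
    }
```

The resulting mapping is what the ordinary labelers' sample annotation results would later be compared against.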
According to the technical solution of the embodiments of the present invention, the sample annotation results contained in each labeler's annotation results are audited using the sample labeled data, and the labeler's error rate is then calculated; when the error rate meets the requirement, all annotation results of that labeler are deemed to pass the audit. This realizes a data audit method based on sample labeled data, in which the annotation results of every labeler are audited by sampling, ensuring the audit quality of the labeled data while greatly reducing the cost of data auditing and improving the return on investment.
Fig. 4 shows an exemplary system architecture 400 to which the data audit method or data audit apparatus of the embodiments of the present invention can be applied.
As shown in Fig. 4, the system architecture 400 may include terminal devices 401, 402, 403, a network 404, and a server 405. The network 404 serves as the medium providing communication links between the terminal devices 401, 402, 403 and the server 405. The network 404 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the terminal devices 401, 402, 403 to interact with the server 405 through the network 404 in order to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 401, 402, 403, such as shopping applications, web browsers, search applications, instant messaging tools, mailbox clients, and social platform software (examples only).
The terminal devices 401, 402, 403 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 405 may be a server providing various services, for example a back-end management server that supports a shopping website browsed by users with the terminal devices 401, 402, 403 (example only). The back-end management server may analyze and otherwise process received data such as information query requests, and feed the processing results (for example, targeted push information or product information; examples only) back to the terminal devices.
It should be noted that the data audit method provided by the embodiments of the present invention is generally executed by the server 405, and accordingly the data audit apparatus is generally located in the server 405.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 4 are merely illustrative. There may be any number of terminal devices, networks, and servers, depending on implementation needs.
Referring now to Fig. 5, it shows a schematic structural diagram of a computer system 500 suitable for implementing a terminal device or server of an embodiment of the present invention. The terminal device or server shown in Fig. 5 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 508 into a random access memory (RAM) 503. The RAM 503 also stores the various programs and data required for the operation of the system 500. The CPU 501, ROM 502, and RAM 503 are connected to one another by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a cathode ray tube (CRT) or liquid crystal display (LCD), a loudspeaker, and the like; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, optical disc, magneto-optical disk, or semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage section 508 as needed.
In particular, according to the disclosed embodiments of the present invention, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the disclosed embodiments include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above-described functions defined in the system of the present invention are performed.
It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media include but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, in the present invention, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the present invention may be implemented in software or in hardware. The described units or modules may also be provided in a processor; for example, a processor may be described as including a task allocation module, a result acquisition module, an audit statistics module, and a result determination module. The names of these units or modules do not, in some cases, limit the units or modules themselves; for example, the task allocation module may also be described as "a module for distributing labeling tasks to labelers according to a predetermined rule, where the labeling tasks include sample labeling tasks and every labeler is assigned the sample labeling tasks".
In another aspect, the present invention also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: distribute labeling tasks to labelers according to a predetermined rule, where the labeling tasks include sample labeling tasks and every labeler is assigned sample labeling tasks; obtain the annotation results of the labeling tasks after the labelers complete the tasks assigned to them; for each labeler, obtain the sample annotation results contained in that labeler's annotation results, audit the sample annotation results, and then calculate the labeler's error rate; and, if the error rate does not exceed a preset threshold, determine that the labeler's annotation results pass the audit.
According to the technical solution of the embodiments of the present invention, the sample annotation results contained in each labeler's annotation results are audited using the sample labeled data, and the labeler's error rate is then calculated; when the error rate meets the requirement, all annotation results of that labeler are deemed to pass the audit. This realizes a data audit method based on sample labeled data, in which the annotation results of every labeler are audited by sampling, ensuring the audit quality of the labeled data while greatly reducing the cost of data auditing and improving the return on investment.
The specific embodiments described above do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may occur depending on design requirements and other factors. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A data audit method, characterized by comprising:
distributing labeling tasks to labelers according to a predetermined rule, wherein the labeling tasks include sample labeling tasks and every labeler is assigned the sample labeling tasks;
obtaining the annotation results of the labeling tasks after the labelers complete the labeling tasks assigned to them;
for each labeler, obtaining the sample annotation results contained in the annotation results of the labeler, auditing the sample annotation results, and then calculating the error rate of the labeler;
if the error rate does not exceed a preset threshold, determining that the annotation results of the labeler pass the audit.
2. The method according to claim 1, characterized in that
the labeling tasks further include non-sample labeling tasks; and
the step of distributing labeling tasks to labelers according to a predetermined rule comprises:
for sample labeling tasks, if the number S1 of the sample labeling tasks is less than the number R of labelers, first distributing the S1 sample labeling tasks to S1 of the R labelers, and then, for each of the remaining (R-S1) labelers, selecting one sample labeling task from the S1 sample labeling tasks and assigning it to that labeler; if the number S1 of the sample labeling tasks is not less than the number R of labelers, dividing S1 by R to obtain the quotient M1 and the remainder N1, first assigning M1 sample labeling tasks to each labeler, and then distributing the remaining N1 sample labeling tasks to N1 of the R labelers;
for non-sample labeling tasks, if the number S2 of the non-sample labeling tasks is less than the number R of labelers, distributing the S2 non-sample labeling tasks to S2 of the R labelers; if the number S2 of the non-sample labeling tasks is not less than the number R of labelers, dividing S2 by R to obtain the quotient M2 and the remainder N2, first assigning M2 non-sample labeling tasks to each labeler, and then distributing the remaining N2 non-sample labeling tasks to N2 of the R labelers.
3. The method according to claim 1, characterized in that the step of auditing the sample annotation results comprises:
auditing the sample annotation results by comparing preset sample labeled data with the sample annotation results;
wherein the process of generating the preset sample labeled data comprises:
distributing the sample labeling tasks to sample labelers;
obtaining the sample annotation results of the sample labeling tasks after the sample labelers complete the sample labeling tasks;
determining the sample annotation results to be sample labeled data when the sample annotation results pass the audit.
4. The method according to claim 3, characterized in that the step of distributing the sample labeling tasks to sample labelers comprises:
if the number S1 of the sample labeling tasks is less than the number R0 of sample labelers, distributing the S1 sample labeling tasks to S1 of the R0 sample labelers;
if the number S1 of the sample labeling tasks is not less than the number R0 of sample labelers, dividing S1 by R0 to obtain the quotient M0 and the remainder N0, first assigning M0 sample labeling tasks to each sample labeler, and then distributing the remaining N0 sample labeling tasks to N0 of the R0 sample labelers.
5. A data audit apparatus, characterized by comprising:
a task allocation module, configured to distribute labeling tasks to labelers according to a predetermined rule, wherein the labeling tasks include sample labeling tasks and every labeler is assigned the sample labeling tasks;
a result acquisition module, configured to obtain the annotation results of the labeling tasks after the labelers complete the labeling tasks assigned to them;
an audit statistics module, configured, for each labeler, to obtain the sample annotation results contained in the annotation results of the labeler, audit the sample annotation results, and then calculate the error rate of the labeler;
a result determination module, configured to determine that the annotation results of the labeler pass the audit if the error rate does not exceed a preset threshold.
6. The apparatus according to claim 5, characterized in that
the labeling tasks further include non-sample labeling tasks; and
the task allocation module is further configured as follows:
for sample labeling tasks, if the number S1 of the sample labeling tasks is less than the number R of labelers, first distributing the S1 sample labeling tasks to S1 of the R labelers, and then, for each of the remaining (R-S1) labelers, selecting one sample labeling task from the S1 sample labeling tasks and assigning it to that labeler; if the number S1 of the sample labeling tasks is not less than the number R of labelers, dividing S1 by R to obtain the quotient M1 and the remainder N1, first assigning M1 sample labeling tasks to each labeler, and then distributing the remaining N1 sample labeling tasks to N1 of the R labelers;
for non-sample labeling tasks, if the number S2 of the non-sample labeling tasks is less than the number R of labelers, distributing the S2 non-sample labeling tasks to S2 of the R labelers; if the number S2 of the non-sample labeling tasks is not less than the number R of labelers, dividing S2 by R to obtain the quotient M2 and the remainder N2, first assigning M2 non-sample labeling tasks to each labeler, and then distributing the remaining N2 non-sample labeling tasks to N2 of the R labelers.
7. The apparatus according to claim 5, characterized in that the audit statistics module is further configured to:
audit the sample annotation results by comparing preset sample labeled data with the sample annotation results;
the apparatus further comprising a sample data determination module, configured to generate the preset sample labeled data, wherein the process of generating the preset sample labeled data comprises:
distributing the sample labeling tasks to sample labelers;
obtaining the sample annotation results of the sample labeling tasks after the sample labelers complete the sample labeling tasks;
determining the sample annotation results to be sample labeled data when the sample annotation results pass the audit.
8. The apparatus according to claim 7, characterized in that the sample data determination module is further configured as follows:
if the number S1 of the sample labeling tasks is less than the number R0 of sample labelers, distributing the S1 sample labeling tasks to S1 of the R0 sample labelers;
if the number S1 of the sample labeling tasks is not less than the number R0 of sample labelers, dividing S1 by R0 to obtain the quotient M0 and the remainder N0, first assigning M0 sample labeling tasks to each sample labeler, and then distributing the remaining N0 sample labeling tasks to N0 of the R0 sample labelers.
9. An electronic device for data auditing, characterized by comprising:
one or more processors;
a storage device for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-4.
10. A computer-readable medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-4.
CN201710985704.2A 2017-10-20 2017-10-20 The method and apparatus of data audit Pending CN109697537A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710985704.2A CN109697537A (en) 2017-10-20 2017-10-20 The method and apparatus of data audit

Publications (1)

Publication Number Publication Date
CN109697537A true CN109697537A (en) 2019-04-30

Family

ID=66226457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710985704.2A Pending CN109697537A (en) 2017-10-20 2017-10-20 The method and apparatus of data audit

Country Status (1)

Country Link
CN (1) CN109697537A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324620B (en) * 2012-03-20 2016-04-27 北京百度网讯科技有限公司 A kind of method and apparatus that annotation results is rectified a deviation
CN105678325A (en) * 2015-12-31 2016-06-15 哈尔滨工业大学深圳研究生院 Textual emotion marking method, device and system
CN105975980A (en) * 2016-04-27 2016-09-28 百度在线网络技术(北京)有限公司 Method of monitoring image mark quality and apparatus thereof
CN106601228A (en) * 2016-12-09 2017-04-26 百度在线网络技术(北京)有限公司 Sample marking method and device based on artificial intelligence prosody prediction
CN107256428A (en) * 2017-05-25 2017-10-17 腾讯科技(深圳)有限公司 Data processing method, data processing equipment, storage device and the network equipment

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110335251A (en) * 2019-05-31 2019-10-15 上海联影智能医疗科技有限公司 Quantization device, method, equipment and the storage medium of image analysis method
CN110335251B (en) * 2019-05-31 2021-09-17 上海联影智能医疗科技有限公司 Quantization apparatus, method, device and storage medium for image analysis method
CN110245716B (en) * 2019-06-20 2021-05-14 杭州睿琪软件有限公司 Sample labeling auditing method and device
CN110245716A (en) * 2019-06-20 2019-09-17 杭州睿琪软件有限公司 Sample labeling auditing method and device
WO2020253740A1 (en) * 2019-06-20 2020-12-24 杭州睿琪软件有限公司 Manual client status check method and device for sample verification
CN110378617A (en) * 2019-07-26 2019-10-25 中国工商银行股份有限公司 A kind of sample mask method, device, storage medium and equipment
CN110516252A (en) * 2019-08-30 2019-11-29 京东方科技集团股份有限公司 Data mask method, device, computer equipment and storage medium
CN110781583A (en) * 2019-10-10 2020-02-11 北京字节跳动网络技术有限公司 Audit mode optimization method and device and electronic equipment
CN110781583B (en) * 2019-10-10 2023-04-18 北京字节跳动网络技术有限公司 Audit mode optimization method and device and electronic equipment
CN112270532A (en) * 2020-11-12 2021-01-26 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and storage medium
CN112270532B (en) * 2020-11-12 2023-07-28 北京百度网讯科技有限公司 Data processing method, device, electronic equipment and storage medium
CN112686009A (en) * 2020-12-23 2021-04-20 中国人民解放军战略支援部队信息工程大学 Voice marking system and method
CN114219501A (en) * 2022-02-22 2022-03-22 杭州衡泰技术股份有限公司 Sample labeling resource allocation method, device and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190430