WO2020253742A1 - Sample labeling checking method and device - Google Patents

Sample labeling checking method and device

Info

Publication number
WO2020253742A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample area
review
labeled
pictures
sample
Prior art date
Application number
PCT/CN2020/096647
Other languages
French (fr)
Chinese (zh)
Inventor
徐青松
李青
Original Assignee
杭州睿琪软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州睿琪软件有限公司
Publication of WO2020253742A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • the present invention relates to the field of artificial intelligence technology, in particular to a sample labeling and reviewing method, device, electronic equipment and computer readable storage medium.
  • the training samples need to be annotated before model training.
  • a manual client or a recognition model can be used to label the training samples, but this cannot guarantee the labeling accuracy of the samples.
  • the purpose of the present invention is to provide a sample labeling review method, device, electronic equipment and computer readable storage medium to improve the accuracy of sample labeling.
  • the specific technical solutions are as follows:
  • the present invention provides a sample labeling review method, which includes:
  • Step 1 Obtain samples to be labeled
  • Step 2 Identify at least one region of the sample to be labeled through a region recognition model, and cut the at least one region to form at least one sample region picture; wherein the region recognition model is a neural network-based model;
  • Step 3 Recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein the preset recognition model is a neural network-based model;
  • Step 4 Send the pre-labeled sample area pictures to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures, and, if a pre-labeled result is found to be wrong, the pre-labeled result is modified;
  • Step 5 Send the sample area pictures reviewed by the review unit to the verification client, so that the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit.
  • step 3 recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • Step 4 sends the pre-labeled sample area pictures to the review unit so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • the pre-labeled sample area picture is sent to a manual client, so that the manual client can review the pre-labeled result of the sample area picture.
  • the verification client in step 5 performs verification processing on the annotation information of the sample area picture reviewed by the review unit, including:
  • the verification client checks whether the labeling information of the sample area picture after the manual client's review is accurate, and if it is not accurate, re-identifies the sample area picture.
  • step 3 recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • Step 4 sends the pre-labeled sample area pictures to the review unit so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • sample area pictures pre-labeled by different preset recognition models are sent to each of a plurality of different manual clients, so that each manual client can review the pre-labeled results of each sample area picture.
  • the verification client in step 5 performs verification processing on the annotation information of the sample area picture reviewed by the review unit, including:
  • the verification client checks whether the label information of the sample area picture after the review by different manual clients is consistent, and if they are inconsistent, the sample area picture is re-identified.
  • step 3 recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • Step 4 Send the pre-labeled sample area pictures to the review unit so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • the one preset recognition model and the another preset recognition model are recognition models established by training based on different training sample sets.
  • the verification client in step 5 performs verification processing on the annotation information of the sample area picture reviewed by the review unit, including:
  • the verification client verifies whether the label information of the sample area picture after the manual client's review is consistent with the recognition result of the other preset recognition model, and if they are inconsistent, the sample area picture is re-identified.
  • after the pre-labeling processing in step 3, the method also includes:
  • in step 4, the pre-labeled sample area pictures are sent to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • after the verification client in step 5 performs verification processing on the label information of the sample area picture reviewed by the review unit, the method also includes:
  • the verification client determines whether the review unit is in an abnormal state according to the review unit's review result on the preset number of pictures whose pre-labeled results have been modified.
  • the review unit reviews the pre-labeled results of the sample area pictures of the unmodified pre-labeled results and the sample area pictures of the modified pre-labeled results, including:
  • the review unit judges whether the marked pre-marking result is correct; if not, the pre-marking result marked on the sample area picture is modified.
  • the verification client checks whether the review unit is in an abnormal state according to the review unit's review result on the preset number of pictures whose pre-labeled results were modified, including:
  • if the first ratio is less than a preset threshold, it is determined that the review unit is in an abnormal state.
  • the verification client checks whether the review unit is in an abnormal state according to the review unit's review result on the preset number of pictures whose pre-labeled results were modified, including:
  • the present invention also provides a sample labeling and reviewing device, which includes:
  • a recognition module configured to recognize at least one region of the sample to be labeled through a region recognition model, and cut the at least one region to form at least one sample region picture; wherein the region recognition model is a neural network-based model;
  • the pre-labeling module is used to identify each sample area picture through a preset recognition model and perform pre-labeling processing; wherein, the preset recognition model is a neural network-based model;
  • the review module is configured to send the pre-labeled sample area pictures to the review unit so that the review unit can review the pre-labeled results of the pre-labeled sample area pictures and, if a pre-labeled result is found to be wrong, modify the pre-labeled result;
  • the verification module is configured to send the sample area pictures reviewed by the review unit to the verification client, so that the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit.
  • the pre-labeling module recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • the review module sends the pre-labeled sample area pictures to the review unit, so that the review unit can review the pre-labeled results of the pre-labeled sample area pictures, including:
  • the verification client checks whether the labeling information of the sample area picture after the manual client's review is accurate, and if it is not accurate, re-identifies the sample area picture.
  • the pre-labeling module recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • the review module sends the pre-labeled sample area pictures to the review unit, so that the review unit can review the pre-labeled results of the pre-labeled sample area pictures, including:
  • the verification client checks whether the label information of the sample area picture after the review by different manual clients is consistent, and if they are inconsistent, the sample area picture is re-identified.
  • the pre-labeling module recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • the review module sends the pre-labeled sample area pictures to the review unit, so that the review unit can review the pre-labeled results of the pre-labeled sample area pictures, including:
  • the one preset recognition model and the another preset recognition model are recognition models established by training based on different training sample sets;
  • the verification client verifies whether the label information of the sample area picture after the manual client's review is consistent with the recognition result of the other preset recognition model, and if they are inconsistent, the sample area picture is re-identified.
  • after the pre-labeling module performs pre-labeling processing, it is also used to:
  • the review unit judges whether the marked pre-marking result is correct; if not, the pre-marking result marked on the sample area picture is modified;
  • after the verification client performs verification processing on the annotation information of the sample area picture that has been reviewed by the review unit, it is also used to:
  • the verification client checks whether the review unit is in an abnormal state according to the review unit's review result on the preset number of pictures whose pre-labeled results were modified, including:
  • if the first ratio is less than a preset threshold, it is determined that the review unit is in an abnormal state.
  • the verification client in the verification module checks whether the review unit is in an abnormal state according to the review unit's review result on the preset number of pictures whose pre-labeled results were modified, including:
  • the present invention also provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory complete mutual communication through the communication bus.
  • the memory is used to store computer programs
  • the processor is configured to implement the sample labeling review method described in the first aspect when executing the computer program stored on the memory.
  • the present invention also provides a computer-readable storage medium in which a computer program is stored, and when the computer program is executed by a processor, the sample labeling review method described in the first aspect is implemented.
  • the sample labeling and reviewing method, device, electronic equipment and computer-readable storage medium provided by the present invention have the following beneficial effects:
  • the present invention first recognizes at least one region of the sample to be labeled through the region recognition model and cuts it out to form at least one sample region picture, then pre-labels each sample region picture through the preset recognition model, and then has the review unit review the pre-labeled results of the sample region pictures; if a pre-labeled result is found to be incorrect, it is modified. Finally, the verification client verifies the label information of the reviewed sample region pictures. It can be seen that the present invention labels samples according to the process of pre-labeling, review, and verification, which can ensure the accuracy of sample labeling, thereby improving the accuracy of model training.
  • FIG. 1 is a schematic flowchart of a sample labeling and reviewing method provided by an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a sample labeling and reviewing device provided by an embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
  • embodiments of the present invention provide a sample labeling and reviewing method, device, electronic equipment, and computer-readable storage medium.
  • the sample labeling and reviewing method of the embodiment of the present invention can be applied to the sample labeling and reviewing device of the embodiment of the present invention, and the sample labeling and reviewing device can be configured on an electronic device.
  • the electronic device may be a personal computer, a mobile terminal, etc.
  • the mobile terminal may be a hardware device with various operating systems such as a mobile phone or a tablet computer.
  • FIG. 1 is a schematic flowchart of a sample labeling and reviewing method provided by an embodiment of the present invention. Please refer to Figure 1.
  • a sample labeling review method can include the following steps:
  • Step S101 Obtain samples to be labeled.
  • Step S102 Recognizing at least one region of the sample to be labeled through a region recognition model, and cutting the at least one region to form at least one sample region picture; wherein the region recognition model is a neural network-based model.
  • the sample to be labeled may include various types of picture samples, such as test papers, photos of animals and plants, scenic spots, vehicles, human faces, human bodies or parts of the human body, objects, bills, etc. Taking a test paper as an example, the region recognition model identifies the area of each question on the test paper and cuts out each area to form a sample area picture. Then, in step S103, a character recognition model is used to recognize the character content of each sample area picture and perform pre-labeling processing.
  • Step S103 Recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein, the preset recognition model is a neural network-based model.
  • the preset recognition model can be selected according to the type of the sample area picture and its annotation type. For example, if the sample area picture is a plant image and the type of plant in the picture needs to be marked, the preset recognition model may be a recognition model for recognizing plant species. After the recognition model recognizes the plant image, the recognition result is used to pre-label the plant image. For example, if the plant image is a peach blossom image and the recognition model recognizes a peach blossom, the plant image is pre-labeled as peach blossom.
  • Step S104 Send the pre-labeled sample area picture to the review unit, so that the review unit can review the pre-labeled result of the sample area picture, and, if the pre-labeled result is found to be wrong, the pre-labeled result is modified.
  • after the review unit receives the pre-labeled sample area picture, it can recognize the sample area picture and judge whether the pre-labeled result is correct according to its own recognition result; if the pre-labeled result is wrong, it is modified to the review unit's own recognition result. For example, if the pre-labeled result of a plant picture is peach blossom but the review unit recognizes the plant picture as pear blossom, the pre-labeled result is wrong and is modified to the review unit's own recognition result, pear blossom.
  • Step S105 Send the sample area pictures reviewed by the review unit to the verification client, so that the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit.
  • the verification client is used to verify the sample area pictures reviewed by the review unit, that is, to check whether the label information reviewed by the review unit is correct, so as to further ensure the labeling accuracy of the sample area pictures. After the sample area pictures are pre-labeled, reviewed, and verified, accurate labeling information can be obtained.
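  • A minimal sketch of the flow in steps S101 to S105, written in Python for illustration only. The names `RegionPicture`, `label_and_review`, and the four callables are hypothetical stand-ins (not from the source) for the region recognition model, the preset recognition model, the review unit, and the verification client; images are represented as plain strings.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RegionPicture:
    image: str                   # stand-in for the cropped image data
    label: Optional[str] = None  # current label information


def label_and_review(
    sample: str,
    detect_regions: Callable[[str], List[str]],  # hypothetical region recognition model
    pre_label: Callable[[str], str],             # hypothetical preset recognition model
    review: Callable[[str, str], str],           # hypothetical review unit
    verify: Callable[[str, str], bool],          # hypothetical verification client
) -> List[RegionPicture]:
    """Run S102-S105 on one sample; return the pictures that pass verification."""
    accepted = []
    for crop in detect_regions(sample):                       # S102: recognize and cut regions
        picture = RegionPicture(image=crop)
        picture.label = pre_label(picture.image)              # S103: pre-labeling
        picture.label = review(picture.image, picture.label)  # S104: reviewer may overwrite
        if verify(picture.image, picture.label):              # S105: verification
            accepted.append(picture)
    return accepted
```

  • Passing the four roles in as callables keeps the sketch independent of any particular model or client; pictures that fail verification are simply left out here, whereas the text sends them back for re-identification.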
  • one or more of the preset recognition models may be used to identify and pre-label the sample area pictures, and the review unit in step S104 may be one or more manual clients, or a combination of a manual client and a preset recognition model.
  • Different preset recognition models are different recognition models established based on different training samples, so the recognition results and accuracy rates of each preset recognition model may be different.
  • pre-labeling processing is performed through one of the preset recognition models, and then a manual client is used as the review unit to perform pre-labeling and review processing on the sample area pictures.
  • step S103 recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • a preset recognition model is used to recognize each sample area image and perform pre-labeling processing.
  • Step S104 sends the pre-labeled sample area pictures to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • in step S105, the verification client performs verification processing on the annotation information of the sample area picture reviewed by the review unit, including:
  • the verification client checks whether the labeling information of the sample area picture after the manual client's review is accurate, and if it is not accurate, re-identifies the sample area picture.
  • the sample area picture is pre-labeled by one of the preset recognition models, and then the pre-labeled result of the preset recognition model is reviewed by a manual client. If the manual client judges that the pre-labeling result is wrong, the pre-labeling result is modified.
  • the verification client performs verification processing on the review result of the manual client. If the verification client determines that the label information after the manual client's review is accurate, the sample area picture completes the labeling review process. If it is inaccurate, the sample area picture is identified, labeled and reviewed again.
  • for example, a plant image is first recognized through a recognition model for recognizing plant species. If the recognition result is A, pre-labeling processing is performed to obtain pre-labeled result A; then, pre-labeled result A is reviewed through a manual client. If the recognition result of the manual client is B, pre-labeled result A is modified to B; finally, the verification client verifies whether the label information after the manual client's review is accurate by recognizing the plant image itself. If its recognition result is B, the current label information of the plant image is accurate. If its recognition result is not B, the current label information of the plant image is inaccurate, and the plant image needs to be re-identified. Through the above labeling review process, the labeling accuracy of plant images can be improved.
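  • The A/B example above can be sketched as follows. This is an illustration under stated assumptions: `label_with_one_reviewer` and its parameters are hypothetical names, and the `max_rounds` bound is an assumption added so the loop terminates (the text only says the picture is re-identified).

```python
from typing import Callable, Optional


def label_with_one_reviewer(
    picture: str,
    recognize: Callable[[str], str],           # preset recognition model
    manual_review: Callable[[str, str], str],  # manual client: returns the kept or corrected label
    verifier_recognize: Callable[[str], str],  # verification client's own recognition
    max_rounds: int = 3,                       # hypothetical bound, not from the source
) -> Optional[str]:
    """Pre-label, review, then verify; repeat while the verifier disagrees."""
    for _ in range(max_rounds):
        label = recognize(picture)             # pre-labeled result, e.g. "A"
        label = manual_review(picture, label)  # reviewer may change it, e.g. to "B"
        if verifier_recognize(picture) == label:
            return label                       # label information is considered accurate
    return None                                # still disputed after max_rounds
```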
  • pre-labeling processing is performed through one of the preset recognition models, and two manual clients are used as review units to perform pre-labeling and review processing on the sample area pictures.
  • the verification client verifies whether the review results of the two manual clients are consistent, and if they are inconsistent, the sample area picture is re-identified.
  • step S103 recognizing each sample area picture through a preset recognition model and performing pre-labeling processing includes:
  • step S104 the pre-labeled sample area pictures are sent to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • in step S105, the verification client performs verification processing on the annotation information of the sample area picture reviewed by the review unit, including:
  • the verification client checks whether the label information of the sample area picture after the two manual clients review is consistent, and if they are inconsistent, the sample area picture is re-identified.
  • the sample area picture is pre-labeled through one of the preset recognition models, and then the pre-labeled results of this preset recognition model are reviewed by two manual clients. If a manual client judges that the pre-labeling result is wrong, it modifies the pre-labeling result.
  • the verification client performs verification processing on the review results of the two manual clients. If the verification client determines that the label information after the review of the two manual clients is consistent, the sample area picture completes the labeling review process; if it is inconsistent, the sample area picture is re-identified, labeled and reviewed.
  • for example, for a plant image, first perform recognition and pre-labeling processing through a recognition model for identifying plant species to obtain the pre-labeled result of the plant image; then, send the pre-labeled plant image to two manual clients, each of which reviews the pre-labeled result of the plant image to determine whether it is correct and, if not, modifies it to its own recognition result; finally, the plant images reviewed by the two manual clients are sent to the verification client, which judges whether the label information after the review of the two manual clients is the same. If it is consistent, the current label information of the plant image is accurate; if it is inconsistent, the current label information of the plant image is not accurate, and the plant image needs to be re-identified.
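  • A short sketch of the consistency check with two manual clients, for illustration only. The function name `review_with_manual_clients` is hypothetical; returning `None` is an assumed way of signalling that the picture must be re-identified.

```python
from typing import Callable, Optional, Sequence


def review_with_manual_clients(
    picture: str,
    recognize: Callable[[str], str],
    manual_clients: Sequence[Callable[[str, str], str]],
) -> Optional[str]:
    """One model pre-labels; every manual client reviews the same pre-label."""
    pre_label = recognize(picture)
    reviewed = [client(picture, pre_label) for client in manual_clients]
    # The verification client only compares the reviewed labels with each other.
    if len(set(reviewed)) == 1:
        return reviewed[0]
    return None  # inconsistent: the picture needs to be re-identified
```

  • Unlike the single-reviewer case, the verification client here does not recognize the picture itself; agreement between the reviewers is the acceptance criterion.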
  • pre-labeling processing is performed through a plurality of the preset recognition models, and a plurality of manual clients are used as review units to perform pre-labeling and review processing on the sample area pictures.
  • step S103 recognizing each sample area picture through a preset recognition model and performing pre-labeling processing includes:
  • step S104 the pre-labeled sample area pictures are sent to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • the sample area pictures pre-labeled through different preset recognition models are sent to different manual clients at the same time, so that the manual client can review the pre-labeled results of the sample area pictures.
  • in step S105, the verification client performs verification processing on the annotation information of the sample area picture reviewed by the review unit, including:
  • the verification client checks whether the label information of the sample area picture after the review by different manual clients is consistent, and if they are inconsistent, the sample area picture is re-identified.
  • the sample area pictures are pre-labeled through different preset recognition models (for example, two), so that one sample area picture yields multiple pre-labeled samples, and then these multiple pre-labeled samples are sent to different manual clients (for example, two) at the same time, and each of the manual clients reviews the multiple pre-labeled samples. If a manual client judges that a pre-labeled result is incorrect, it modifies the pre-labeled result.
  • the verification client performs verification processing on the review results of each of the manual clients. For each sample area picture, if the verification client determines that the label information reviewed by the different manual clients is consistent, the sample area picture completes the labeling review process. If it is inconsistent, the sample area picture is identified, labeled, and reviewed again.
  • for example, for a plant image, first use two recognition models 1 and 2 for identifying plant species to perform recognition and pre-labeling processing respectively, obtaining two pre-labeled plant images: the pre-labeled result of one plant image is the recognition result of recognition model 1, and the pre-labeled result of the other plant image is the recognition result of recognition model 2. Then, the two pre-labeled plant images are sent to two manual clients at the same time, and each manual client reviews the two pre-labeled plant images to determine whether the recognition result of recognition model 1 is correct and whether the recognition result of recognition model 2 is correct. Finally, the plant images reviewed by each manual client are sent to the verification client, and the verification client verifies whether the label information after the review by the two manual clients is consistent. If it is consistent, the current label information of the plant image is accurate; if it is inconsistent, the current label information of the plant image is not accurate, and the plant image needs to be re-identified. Through the above labeling review process, the labeling accuracy of plant images can be improved.
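  • The multi-model variant can be sketched in the same hedged way. Here `review_multi_model` is a hypothetical name, models and manual clients are passed as dictionaries keyed by made-up identifiers, and treating "consistent" as "all reviewed labels across all clients and pre-labeled copies are identical" is an assumption about how the verification client compares them.

```python
from typing import Callable, Dict, Optional


def review_multi_model(
    picture: str,
    models: Dict[str, Callable[[str], str]],
    manual_clients: Dict[str, Callable[[str, str], str]],
) -> Optional[str]:
    """Each model produces a pre-labeled copy; each client reviews every copy."""
    pre_labels = {name: model(picture) for name, model in models.items()}
    reviewed = {
        client_name: {
            model_name: client(picture, label)
            for model_name, label in pre_labels.items()
        }
        for client_name, client in manual_clients.items()
    }
    # Collect every label that survived review, across clients and copies.
    distinct = {
        label for per_model in reviewed.values() for label in per_model.values()
    }
    if len(distinct) == 1:
        return next(iter(distinct))
    return None  # inconsistent: the picture needs to be re-identified
```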
  • pre-labeling processing is performed through one of the preset recognition models, and one manual client together with another preset recognition model is used as the review unit to perform pre-labeling and review processing on the sample area pictures.
  • step S103 recognizing each sample area picture through a preset recognition model and performing pre-labeling processing includes:
  • step S104 the pre-labeled sample area pictures are sent to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • in step S105, the verification client performs verification processing on the annotation information of the sample area picture reviewed by the review unit, including:
  • the verification client verifies whether the label information of the sample area picture after the manual client's review is consistent with the recognition result of the other preset recognition model, and if they are inconsistent, the sample area picture is re-identified.
  • the sample area picture is pre-labeled through one preset recognition model, and then the pre-labeled result of this preset recognition model is reviewed through a manual client and another preset recognition model. If the manual client determines that the pre-labeling result is incorrect, the pre-labeling result is modified, while the other preset recognition model identifies and labels the sample area picture itself.
  • the verification client performs verification processing on the review results of the manual client and the other preset recognition model. For each sample area picture, if the verification client determines that the label information after the manual client's review is consistent with the recognition result of the other preset recognition model, the sample area picture completes the labeling review process; if it is inconsistent, the sample area picture is re-identified, labeled and reviewed.
  • the manual client reviews the pre-labeled result. If the manual client's recognition result of the plant image is B, the pre-labeled result of the plant image is modified to B, and another recognition model recognizes and labels the plant image to get the labeling result C; finally, the verification client verifies whether the label information B after the manual client's review is consistent with the labeling result C of the other recognition model.
  • if they are consistent, the current label information of the plant image is accurate; if they are inconsistent, the current label information of the plant image is not accurate, and the plant image needs to be re-identified. Through the above labeling review process, the labeling accuracy of plant images can be improved.
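  • A hedged sketch of the variant in which a manual client and a second recognition model together act as the review unit. The name `review_with_second_model` and its parameters are hypothetical; `None` is again an assumed signal for re-identification.

```python
from typing import Callable, Optional


def review_with_second_model(
    picture: str,
    primary_model: Callable[[str], str],       # produces the pre-labeled result
    manual_review: Callable[[str, str], str],  # manual client: kept or corrected label (B)
    second_model: Callable[[str], str],        # independently trained model (result C)
) -> Optional[str]:
    """Accept the reviewed label only if the second model agrees with it."""
    pre_label = primary_model(picture)
    human_label = manual_review(picture, pre_label)  # B in the example above
    model_label = second_model(picture)              # C in the example above
    return human_label if human_label == model_label else None
```

  • Because the two models are trained on different sample sets, the second model serves as an independent check on the human reviewer rather than a repeat of the pre-labeling step.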
  • This embodiment introduces the pre-labeling, reviewing, and verification procedures of samples through the above three implementation methods, but the technical solution of the present invention is not limited to this.
  • the method further includes:
  • a preset number of pictures are selected from the sample area pictures recognized by the preset recognition model, and the pre-labeled results of the selected preset number of pictures are modified into recognition results different from the original pre-labeled results.
  • step S104 the pre-labeled sample area pictures are sent to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • after the verification client in step 5 performs verification processing on the label information of the sample area picture reviewed by the review unit, the method also includes:
  • the verification client checks whether the review unit is in an abnormal state according to the review unit's review result on the preset number of pictures whose pre-labeled results have been modified.
  • a preset number of pictures can be randomly selected from all sample area pictures, and the pre-labeled results of the selected pictures can be modified to different recognition results. This embodiment uses the review unit's handling of the deliberately wrong pre-labeled results on the selected pictures to infer its review status on all sample area pictures, and thereby determines whether the review unit is in an abnormal state. Therefore, to ensure the accuracy of the subsequent statistics, the following requirement may be imposed on the number of pictures to be selected:
  • the preset number is greater than or equal to the minimum sampling number N for sampling statistics
  • N Z 2 ⁇ (P ⁇ (1-P))/E 2 ;
  • Z represents the statistics related to the confidence level, which is equal to the recognition accuracy rate of the current preset recognition model;
  • E represents the preset sampling error Value;
  • P represents the labeling accuracy of the sample area image after being labelled by the current preset recognition model.
  • the confidence interval of this embodiment is 90%-99.99%, that is, the recognition accuracy of the current preset recognition model is expected to fall within the range of 90%-99.99%; a 95% confidence level can be used in this embodiment.
  • the sampling error value E can be set within ±5%, and P is a probability value that can be set to 90%, meaning that the labeling accuracy of the sample area pictures after labeling by the current preset recognition model needs to reach 90%. If, for example, the above formula gives a minimum sampling number N of 100, the preset number can be set to any value greater than or equal to 100.
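The minimum sampling number can be computed directly from the formula above. The following is a minimal sketch rather than the patent's implementation: the function name `min_sample_size` is hypothetical, and it assumes that Z is the standard normal quantile for the chosen confidence level (1.96 for a two-sided 95% level), which is the usual reading of this sample-size formula. The input values are illustrative only.

```python
import math


def min_sample_size(z: float, p: float, e: float) -> int:
    """Smallest integer N with N >= Z^2 * P * (1 - P) / E^2."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if e <= 0.0:
        raise ValueError("e must be positive")
    # Round up, because the preset number must be at least N.
    return math.ceil(z * z * p * (1.0 - p) / (e * e))


# Illustrative inputs (assumptions): z = 1.96 for a 95% confidence level,
# P = 90% labeling accuracy, E = 5% sampling error.
n = min_sample_size(z=1.96, p=0.90, e=0.05)
```

Any preset number greater than or equal to the returned value would satisfy the requirement stated above.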
  • the review unit reviews the pre-labeled results of the sample area pictures that have not modified the pre-labeled results and the sample area pictures that have modified the pre-labeled results.
  • the review process includes: for each sample area picture, the review unit judges whether the pre-labeled result is correct; if it is incorrect, the review unit can modify the pre-labeled result of that picture. For example, if the pre-labeled result of a picture is female, and the review unit determines on review that this result is wrong and that the recognition result of the picture should be male, it can modify the pre-labeled result to the recognition result it determined itself.
  • the review unit may also fail to recognize that the pre-labeled result of a picture is wrong, and therefore judge the recognition result of the picture to be correct.
  • the review unit's handling of the sample area pictures whose pre-labeled results were deliberately modified reflects how it recognizes (labels) all sample area pictures. By checking the review unit's review results for these pictures, the labeling accuracy or review accuracy of the review unit can be inferred, and it can be determined whether the review unit is abnormal.
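The seeding step described above, in which the pre-labeled results of a preset number of pictures are deliberately changed before review, might be sketched as follows. All names here (`Picture`, `seed_wrong_labels`, `label_set`) are hypothetical, and the sketch assumes a closed set of at least two possible labels; the source does not specify how the replacement label is chosen.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Picture:
    picture_id: str
    pre_label: str                        # label shown to the review unit
    original_label: Optional[str] = None  # set only for seeded pictures


def seed_wrong_labels(
    pictures: List[Picture],
    label_set: Sequence[str],
    count: int,
    rng: random.Random,
) -> List[Picture]:
    """Pick `count` pictures at random and replace each pre-label
    with a different label. Returns the seeded pictures."""
    if count > len(pictures):
        raise ValueError("count exceeds the number of pictures")
    if len(set(label_set)) < 2:
        raise ValueError("need at least two distinct labels")
    chosen = rng.sample(pictures, count)
    for pic in chosen:
        alternatives = [lab for lab in label_set if lab != pic.pre_label]
        pic.original_label = pic.pre_label   # remember what was replaced
        pic.pre_label = rng.choice(alternatives)
    return chosen
```

Keeping the original pre-label alongside the seeded one lets the verification client later tell seeded pictures from untouched ones without exposing that information to the review unit.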
  • the verification client checking whether the review unit is in an abnormal state according to the review unit's review results for the preset number of pictures whose pre-labeled results have been modified includes:
  • if the first ratio is less than a preset threshold, it is determined that the review unit is in an abnormal state.
  • if the review unit modifies the pre-labeled result of a picture whose pre-labeled result was deliberately modified, the review unit can be considered able to label that picture correctly. If the proportion of such pictures whose pre-labeled results the review unit modifies is greater than or equal to the preset threshold, the review unit can be considered free of abnormality; otherwise, the review unit is abnormal.
  • the verification client checking whether the review unit is in an abnormal state according to the review unit's review results for the preset number of pictures whose pre-labeled results have been modified includes:
  • if, among the preset number of pictures whose pre-labeled results have been modified, the proportion that the review unit modifies to the correct recognition result is greater than or equal to the preset threshold, the review unit can be considered free of abnormality; otherwise, the review unit is abnormal.
  • using the proportion of the preset number of pictures that the review unit modifies to the correct recognition result to determine whether the review unit is abnormal characterizes the labeling accuracy of the review unit more accurately than the previous implementation.
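The two abnormality checks described above differ only in what counts as a successful review of a seeded picture. The sketch below shows both ratios side by side. The names `SeededReview`, `modified_ratio`, `corrected_ratio`, and `is_abnormal` are hypothetical, and `true_label` assumes that some trusted reference label is available for each seeded picture, which the source does not specify.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SeededReview:
    seeded_label: str    # deliberately wrong label sent to the review unit
    reviewed_label: str  # label after the review unit's review
    true_label: str      # assumed trusted reference label


def modified_ratio(reviews: Sequence[SeededReview]) -> float:
    """Share of seeded pictures whose label the review unit changed at all."""
    if not reviews:
        raise ValueError("no seeded reviews to evaluate")
    changed = sum(r.reviewed_label != r.seeded_label for r in reviews)
    return changed / len(reviews)


def corrected_ratio(reviews: Sequence[SeededReview]) -> float:
    """Share of seeded pictures changed to the correct label (stricter check)."""
    if not reviews:
        raise ValueError("no seeded reviews to evaluate")
    fixed = sum(r.reviewed_label == r.true_label for r in reviews)
    return fixed / len(reviews)


def is_abnormal(ratio: float, threshold: float) -> bool:
    """The review unit is flagged when the ratio falls below the threshold."""
    return ratio < threshold
```

A review unit that changes a seeded label to another wrong label passes the first check but not the second, which is why the second ratio is the more faithful measure of labeling accuracy.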
  • the review unit makes corrections to make the marking accuracy rate meet the requirements.
  • the preset threshold can be set to any value equal to or greater than X, which is not limited in this embodiment.
  • this embodiment first recognizes at least one region of the sample to be labeled through the region recognition model and cuts it to form at least one sample area picture, then recognizes and pre-labels each sample area picture through the preset recognition model, then has the review unit review the pre-labeled results of the sample area pictures and modify any that are wrong, and finally has the verification client verify the label information of the sample area pictures reviewed by the review unit. Because the samples are labeled through this pre-labeling, review, and verification process, the accuracy of sample labeling can be ensured, thereby improving the accuracy of model training.
  • in addition, the pre-labeled results of some of the sample area pictures are deliberately modified to wrong recognition results, and the labeling accuracy of the review unit can be inferred by checking how the review unit reviews these deliberately mislabeled pictures. This makes it possible to determine quickly whether the review unit is in an abnormal state, which shortens the statistical time and reduces cost.
  • FIG. 2 is a schematic structural diagram of a sample labeling and reviewing device provided by an embodiment of the present invention. Please refer to Figure 2.
  • a sample labeling review device may include:
  • the obtaining module 201 is used to obtain samples to be labeled;
  • the recognition module 202 is configured to recognize at least one region of the sample to be labeled through a region recognition model, and cut the at least one region to form at least one sample area picture; wherein the region recognition model is a neural network-based model;
  • the pre-labeling module 203 is configured to recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein, the preset recognition model is a neural network-based model;
  • the review module 204 is configured to send the pre-labeled sample area pictures to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures and, if a pre-labeled result is found to be wrong, modify that pre-labeled result;
  • the verification module 205 is configured to send the sample area pictures reviewed by the review unit to the verification client, so that the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit.
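The way modules 201 to 205 hand work to one another can be sketched as a small pipeline. This is an illustrative composition only: the class `LabelReviewDevice`, its field names, and the use of plain callables in place of neural network models, the review unit, and the verification client are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LabeledPicture:
    picture: str  # stand-in for image data
    label: str


@dataclass
class LabelReviewDevice:
    obtain: Callable[[], List[str]]                     # obtaining module 201
    recognize_regions: Callable[[str], List[str]]       # recognition module 202
    pre_label: Callable[[str], str]                     # pre-labeling module 203
    review: Callable[[LabeledPicture], LabeledPicture]  # review module 204
    verify: Callable[[LabeledPicture], bool]            # verification module 205

    def run(self) -> Tuple[List[LabeledPicture], List[LabeledPicture]]:
        """Return (accepted pictures, pictures that failed verification)."""
        accepted: List[LabeledPicture] = []
        to_recheck: List[LabeledPicture] = []
        for sample in self.obtain():
            for picture in self.recognize_regions(sample):
                candidate = LabeledPicture(picture, self.pre_label(picture))
                reviewed = self.review(candidate)
                if self.verify(reviewed):
                    accepted.append(reviewed)
                else:
                    to_recheck.append(reviewed)
        return accepted, to_recheck
```

Pictures that fail verification are collected separately so that a caller can send them back for recognition, as the embodiments describe.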
  • this embodiment first recognizes at least one region of the sample to be labeled through the region recognition model and cuts it to form at least one sample area picture, then recognizes and pre-labels each sample area picture through the preset recognition model, then has the review unit review the pre-labeled results of the sample area pictures and modify any that are wrong, and finally has the verification client verify the label information of the sample area pictures reviewed by the review unit. Because the samples are labeled through this pre-labeling, review, and verification process, the accuracy of sample labeling can be ensured, thereby improving the accuracy of model training.
  • in addition, the pre-labeled results of some of the sample area pictures are deliberately modified to wrong recognition results, and the labeling accuracy of the review unit can be inferred by checking how the review unit reviews these deliberately mislabeled pictures. This makes it possible to determine quickly whether the review unit is in an abnormal state, which shortens the statistical time and reduces cost.
  • the pre-labeling module 203 recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • the review module 204 sends the pre-labeled sample area pictures to the review unit so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • the pre-labeled sample area picture is sent to a manual client, so that the manual client can review the pre-labeled result of the sample area picture.
  • the verification processing performed by the verification client in the verification module 205 on the label information of the sample area pictures reviewed by the review unit includes:
  • the verification client checks whether the labeling information of the sample area picture after the manual client's review is accurate, and if it is not accurate, re-identifies the sample area picture.
  • the pre-labeling module 203 recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • the review module 204 sends the pre-labeled sample area pictures to the review unit so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • the sample area pictures pre-labeled through different preset recognition models are sent to different manual clients at the same time, so that the manual client can review the pre-labeled results of the sample area pictures.
  • the verification processing performed by the verification client in the verification module 205 on the label information of the sample area pictures reviewed by the review unit includes:
  • the verification client checks whether the label information of the sample area picture after the review by different manual clients is consistent, and if they are inconsistent, the sample area picture is re-identified.
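The consistency check across manual clients can be sketched as a comparison of the reviewed labels collected per picture. The function name `inconsistent_pictures` and the nested-mapping input shape (client id to picture id to reviewed label) are assumptions made for this example.

```python
from typing import Dict, List, Mapping, Set


def inconsistent_pictures(
    reviews_by_client: Mapping[str, Mapping[str, str]]
) -> List[str]:
    """Return ids of pictures whose reviewed labels differ between clients."""
    labels_per_picture: Dict[str, Set[str]] = {}
    for labels in reviews_by_client.values():
        for picture_id, label in labels.items():
            labels_per_picture.setdefault(picture_id, set()).add(label)
    # More than one distinct label means the clients disagree on that picture.
    return sorted(
        pid for pid, labels in labels_per_picture.items() if len(labels) > 1
    )


reviews = {
    "client_a": {"p1": "male", "p2": "female"},
    "client_b": {"p1": "male", "p2": "male"},
}
needs_rerecognition = inconsistent_pictures(reviews)
```

The pictures returned are the ones that would be sent back for re-recognition.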
  • the pre-labeling module 203 recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
  • the review module 204 sends the pre-labeled sample area pictures to the review unit so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • the one preset recognition model and the another preset recognition model are recognition models established by training based on different training sample sets.
  • the verification processing performed by the verification client in the verification module 205 on the label information of the sample area pictures reviewed by the review unit includes:
  • for each sample area picture, the verification client checks whether the label information of that picture after review by the manual client is consistent with its label information after recognition by the other preset recognition model, and if not, the sample area picture is re-recognized.
  • after performing pre-labeling processing, the pre-labeling module 203 is further used to:
  • the review module 204 sends the pre-labeled sample area pictures to the review unit so that the review unit can review the pre-labeled results of the sample area pictures, including:
  • after the verification client in the verification module 205 performs verification processing on the label information of the sample area pictures reviewed by the review unit, it is further used to:
  • check whether the review unit is in an abnormal state according to the review unit's review results for the preset number of pictures whose pre-labeled results have been modified.
  • the review unit in the review module 204 reviews the pre-labeled results of the sample area pictures of the unmodified pre-labeled results and the sample area pictures of the modified pre-labeled results, including:
  • the review unit judges whether the marked pre-marking result is correct; if not, the pre-marking result marked on the sample area picture is modified.
  • the verification client in the verification module 205 checking whether the review unit is in an abnormal state according to the review unit's review results for the preset number of pictures whose pre-labeled results have been modified includes:
  • if the first ratio is less than a preset threshold, it is determined that the review unit is in an abnormal state.
  • the verification client in the verification module 205 checking whether the review unit is in an abnormal state according to the review unit's review results for the preset number of pictures whose pre-labeled results have been modified includes:
  • FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • an electronic device includes a processor 301, a communication interface 302, a memory 303, and a communication bus 304.
  • the processor 301, the communication interface 302, and the memory 303 communicate with each other through the communication bus 304.
  • the memory 303 is used to store computer programs
  • the processor 301 is configured to implement the following steps when executing the program stored in the memory 303:
  • Step 1 Obtain samples to be labeled
  • Step 2 Identify at least one region of the sample to be labeled through a region recognition model, and cut the at least one region to form at least one sample region picture; wherein the region recognition model is a neural network-based model;
  • Step 3 Recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein the preset recognition model is a neural network-based model;
  • Step 4 Send the pre-labeled sample area pictures to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures, and if the pre-labeled result is verified as an error, the pre-labeled The results are modified;
  • Step 5 Send the sample area pictures reviewed by the review unit to the verification client, so that the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit.
  • this embodiment first recognizes at least one region of the sample to be labeled through the region recognition model and cuts it to form at least one sample area picture, then recognizes and pre-labels each sample area picture through the preset recognition model, then has the review unit review the pre-labeled results of the sample area pictures and modify any that are wrong, and finally has the verification client verify the label information of the sample area pictures reviewed by the review unit. Because the samples are labeled through this pre-labeling, review, and verification process, the accuracy of sample labeling can be ensured, thereby improving the accuracy of model training.
  • in addition, the pre-labeled results of some of the sample area pictures are deliberately modified to wrong recognition results, and the labeling accuracy of the review unit can be inferred by checking how the review unit reviews these deliberately mislabeled pictures. This makes it possible to determine quickly whether the review unit is in an abnormal state, which shortens the statistical time and reduces cost.
  • the electronic device may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the communication bus mentioned in the above electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus.
  • the communication bus can be divided into address bus, data bus, control bus and so on. For ease of representation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.
  • the communication interface is used for communication between the aforementioned electronic device and other devices.
  • the memory may include random access memory (Random Access Memory, RAM), and may also include non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk storage.
  • the memory may also be at least one storage device located far away from the foregoing processor.
  • the above-mentioned processor can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor, etc.
  • the processor is the control center of the electronic device, and various interfaces and lines are used to connect various parts of the entire electronic device.
  • An embodiment of the present invention also provides a computer-readable storage medium in which a computer program is stored, and when the computer program is executed by a processor, the following steps can be implemented:
  • Step 1 Obtain samples to be labeled
  • Step 2 Identify at least one region of the sample to be labeled through a region recognition model, and cut the at least one region to form at least one sample region picture; wherein, the region recognition model is a neural network-based model;
  • Step 3 Recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein the preset recognition model is a neural network-based model;
  • Step 4 Send the pre-labeled sample area pictures to the review unit, so that the review unit can review the pre-labeled results of the sample area pictures, and if the pre-labeled result is verified as an error, the pre-labeled The results are modified;
  • Step 5 Send the sample area pictures reviewed by the review unit to the verification client, so that the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit.
  • the sample labeling and reviewing method implemented when the computer program is executed by the processor is the same as the sample labeling and reviewing method described in the foregoing method section, and is not repeated here.
  • this embodiment first recognizes at least one region of the sample to be labeled through the region recognition model and cuts it to form at least one sample area picture, then recognizes and pre-labels each sample area picture through the preset recognition model, then has the review unit review the pre-labeled results of the sample area pictures and modify any that are wrong, and finally has the verification client verify the label information of the sample area pictures reviewed by the review unit. Because the samples are labeled through this pre-labeling, review, and verification process, the accuracy of sample labeling can be ensured, thereby improving the accuracy of model training.
  • in addition, the pre-labeled results of some of the sample area pictures are deliberately modified to wrong recognition results, and the labeling accuracy of the review unit can be inferred by checking how the review unit reviews these deliberately mislabeled pictures. This makes it possible to determine quickly whether the review unit is in an abnormal state, which shortens the statistical time and reduces cost.
  • the computer-readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device, such as, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above.
  • Computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disk (DVD), memory stick, floppy disk, mechanical encoding device, such as a printer with instructions stored thereon
  • the computer program described here can be downloaded from a computer-readable storage medium to each computing/processing device, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • the network adapter card or network interface in each computing/processing device receives the computer program from the network, and forwards the computer program for storage in a computer-readable storage medium in each computing/processing device.
  • the computer program used to perform the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • the computer program may be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • an electronic circuit such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), can be customized by using the status information of the computer program.
  • each block of the flowchart and/or block diagram and the combination of each block in the flowchart and/or block diagram can be implemented by a computer program.
  • These computer programs can be provided to the processor of a general-purpose computer, special-purpose computer, or other programmable data processing device to produce a machine, such that the programs, when executed by the processor of the computer or other programmable data processing device, produce a device that implements the functions/actions specified in one or more blocks of the flowchart and/or block diagram. These computer programs can also be stored in a readable storage medium.

Abstract

A sample labeling checking method and device. The method comprises: acquiring a sample to be labeled (S101); identifying, by means of an area identification model, at least one area of the sample to be labeled, and cutting the at least one area to form at least one sample area picture (S102); identifying each sample area picture by means of a pre-set identification model and performing pre-labeling processing (S103); sending the pre-labeled sample area picture to a checking unit, such that the checking unit checks a pre-labeling result of the sample area picture, and if it is checked and found that the pre-labeling result is wrong, modifying the pre-labeling result (S104); and sending the sample area picture checked by the checking unit to a verification client, such that the verification client performs verification processing on labeling information of the sample area picture checked by the checking unit (S105). By means of the method and device, the accuracy of sample labeling can be improved.

Description

Sample labeling review method and device
Technical field
The present invention relates to the field of artificial intelligence technology, and in particular to a sample labeling review method and device, an electronic device, and a computer-readable storage medium.
Background
In the field of artificial intelligence, training samples need to be labeled before model training. Training samples are usually labeled by a manual client or by a recognition model, but this cannot guarantee the labeling accuracy of the samples.
Summary of the invention
The purpose of the present invention is to provide a sample labeling review method and device, an electronic device, and a computer-readable storage medium, so as to improve the accuracy of sample labeling. The specific technical solutions are as follows:
In a first aspect, the present invention provides a sample labeling review method, the method including:
Step 1: Obtain samples to be labeled;
Step 2: Recognize at least one region of the sample to be labeled through a region recognition model, and cut the at least one region to form at least one sample area picture; wherein the region recognition model is a neural network-based model;
Step 3: Recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein the preset recognition model is a neural network-based model;
Step 4: Send the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeled results of the sample area pictures and, if a pre-labeled result is found to be wrong, modifies that pre-labeled result;
Step 5: Send the sample area pictures reviewed by the review unit to the verification client, so that the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit.
Optionally, recognizing each sample area picture through a preset recognition model and performing pre-labeling processing in step 3 includes:
recognizing each sample area picture through one preset recognition model and performing pre-labeling processing;
sending the pre-labeled sample area pictures to the review unit in step 4, so that the review unit reviews the pre-labeled results of the sample area pictures, includes:
sending the pre-labeled sample area pictures to one manual client, so that the manual client reviews the pre-labeled results of the sample area pictures.
Optionally, the verification client performing verification processing in step 5 on the label information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the label information of that picture after review by the manual client is accurate, and if not, the sample area picture is re-recognized.
Optionally, recognizing each sample area picture through a preset recognition model and performing pre-labeling processing in step 3 includes:
recognizing each sample area picture separately through at least two preset recognition models and performing pre-labeling processing;
sending the pre-labeled sample area pictures to the review unit in step 4, so that the review unit reviews the pre-labeled results of the sample area pictures, includes:
sending the sample area pictures pre-labeled through the different preset recognition models to each of a plurality of different manual clients, so that each manual client reviews the pre-labeled results of each sample area picture.
Optionally, the verification client performing verification processing in step 5 on the label information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the label information of that picture after review by the different manual clients is consistent, and if not, the sample area picture is re-recognized.
可选的,步骤3通过预设识别模型识别每个样本区域图片并进行预标注处理,包括:Optionally, step 3 recognizes each sample area picture through a preset recognition model and performs pre-labeling processing, including:
通过一个预设识别模型识别每个样本区域图片并进行预标注处理;Recognize each sample area picture through a preset recognition model and perform pre-labeling processing;
步骤4将经过预标注处理的样本区域图片发送给审核单元,以使所述审 核单元对样本区域图片的预标注结果进行审核,包括:Step 4 Send the pre-labeled sample area pictures to the review unit so that the review unit can review the pre-labeled results of the sample area pictures, including:
将经过预标注处理的样本区域图片发送给一个人工客户端,以使该人工客户端对样本区域图片的预标注结果进行审核;以及Send the pre-labeled sample area picture to a manual client, so that the manual client can review the pre-labeled result of the sample area picture; and
通过另一个预设识别模型对经过预标注处理的样本区域图片进行识别并进行标注;Recognize and label the pre-labeled sample area pictures through another preset recognition model;
其中,所述一个预设识别模型与所述另一个预设识别模型是根据不同的训练样本集训练建立的识别模型。Wherein, the one preset recognition model and the another preset recognition model are recognition models established by training based on different training sample sets.
Optionally, in step 5, the verification client performing verification processing on the label information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the label information of that picture after review by the manual client is consistent with the label produced by the other preset recognition model; if it is not, the sample area picture is re-identified.
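A minimal sketch of this comparison between the manual client and the second recognition model is given below. The dictionary inputs and the name `pictures_to_reidentify` are hypothetical. Treating a picture with no second-model label as inconsistent is also an assumption, since the specification does not cover that case.

```python
def pictures_to_reidentify(human_labels, second_model_labels):
    """Return ids whose human-reviewed label differs from the second model's label.

    Both arguments map picture id -> label (an assumed layout).
    A picture missing from `second_model_labels` is flagged as well.
    """
    flagged = []
    for picture_id, human_label in human_labels.items():
        model_label = second_model_labels.get(picture_id)
        if model_label != human_label:
            flagged.append(picture_id)
    return flagged


human_labels = {"img-1": "peach blossom", "img-2": "pear blossom"}
second_model_labels = {"img-1": "peach blossom", "img-2": "peach blossom"}
print(pictures_to_reidentify(human_labels, second_model_labels))  # ['img-2']
```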
Optionally, after the pre-labeling processing in step 3, the method further includes:
selecting a preset number of pictures from the sample area pictures recognized by the preset recognition model, and modifying the pre-labeling result of each selected picture to a recognition result different from its original pre-labeling result;
in step 4, sending the pre-labeled sample area pictures to the review unit so that the review unit reviews the pre-labeling results of the sample area pictures includes:
sending both the sample area pictures whose pre-labeling results are unmodified and the sample area pictures whose pre-labeling results have been modified to the review unit, so that the review unit reviews the pre-labeling results of both;
in step 5, after the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit, the method further includes:
the verification client determines, according to the review unit's review results for the preset number of pictures whose pre-labeling results were modified, whether the review unit is in an abnormal state.
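The seeding step described above (deliberately altering a preset number of pre-labeling results before review) might look like the sketch below. The function name, the `label_space` argument and the use of random selection are assumptions; the specification says only that a preset number of pictures are selected and given a result different from the original one.

```python
import random


def seed_modified_prelabels(prelabels, label_space, preset_count, rng=None):
    """Return (labels_sent_to_review, seeded_ids).

    `prelabels` maps picture id -> original pre-labeling result.
    `label_space` must contain at least two labels so that a different
    label can always be chosen. Random selection is an assumption here.
    """
    if rng is None:
        rng = random.Random()
    seeded_ids = rng.sample(sorted(prelabels), preset_count)
    sent = dict(prelabels)
    for picture_id in seeded_ids:
        alternatives = [lab for lab in label_space if lab != prelabels[picture_id]]
        sent[picture_id] = rng.choice(alternatives)
    return sent, set(seeded_ids)


prelabels = {
    "img-1": "peach blossom",
    "img-2": "pear blossom",
    "img-3": "peach blossom",
    "img-4": "plum blossom",
}
label_space = ["peach blossom", "pear blossom", "plum blossom"]
sent, seeded_ids = seed_modified_prelabels(
    prelabels, label_space, preset_count=2, rng=random.Random(0)
)
```

Keeping `seeded_ids` on the verification side lets the verification client later inspect how the review unit handled exactly those pictures.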
Optionally, the review unit reviewing the pre-labeling results of the sample area pictures with unmodified pre-labeling results and of the sample area pictures with modified pre-labeling results includes:
for each sample area picture, the review unit judges whether the pre-labeling result is correct; if it is not, the review unit modifies the pre-labeling result of that sample area picture.
Optionally, the verification client checking whether the review unit is in an abnormal state, according to the review unit's review results for the preset number of pictures whose pre-labeling results were modified, includes:
for each of the preset number of pictures whose pre-labeling results were modified, judging whether the review unit modified the pre-labeling result of that picture;
obtaining, as a first ratio, the proportion of the preset number of pictures whose pre-labeling results were modified by the review unit;
if the first ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
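The first-ratio test can be illustrated with a short sketch. The data layout (dictionaries keyed by picture id) and the threshold value 0.8 are assumptions chosen for the example; the specification refers only to "a preset threshold".

```python
def first_ratio(seeded_labels, reviewed_labels):
    """Share of seeded pictures whose label the review unit changed at all.

    `seeded_labels` maps picture id -> the deliberately modified pre-label;
    `reviewed_labels` maps picture id -> the label after review.
    """
    changed = sum(
        1
        for picture_id, seeded in seeded_labels.items()
        if reviewed_labels[picture_id] != seeded
    )
    return changed / len(seeded_labels)


def is_abnormal(ratio, threshold):
    """The review unit is flagged when the ratio falls below the threshold."""
    return ratio < threshold


seeded_labels = {
    "img-1": "pear blossom",
    "img-2": "peach blossom",
    "img-3": "plum blossom",
    "img-4": "pear blossom",
}
# The reviewer changed only img-1 and let the other seeded errors through.
reviewed_labels = {
    "img-1": "peach blossom",
    "img-2": "peach blossom",
    "img-3": "plum blossom",
    "img-4": "pear blossom",
}
ratio = first_ratio(seeded_labels, reviewed_labels)
print(ratio, is_abnormal(ratio, threshold=0.8))  # 0.25 True
```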
Optionally, the verification client checking whether the review unit is in an abnormal state, according to the review unit's review results for the preset number of pictures whose pre-labeling results were modified, includes:
for each of the preset number of pictures whose pre-labeling results were modified, judging whether the review unit changed the pre-labeling result of that picture to the correct recognition result;
obtaining, as a second ratio, the proportion of the preset number of pictures whose pre-labeling results were changed by the review unit to the correct recognition result;
if the second ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
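The second ratio is stricter than a simple "was it changed" test: a seeded picture counts only if the review unit restored a correct label. The sketch below assumes that a correct label is known for every seeded picture (`correct_labels`); how that reference is obtained is not specified in the text, so it is a hypothetical input here, as is the threshold 0.8.

```python
def second_ratio(seeded_ids, reviewed_labels, correct_labels):
    """Share of seeded pictures that the review unit changed to the correct label."""
    restored = sum(
        1
        for picture_id in seeded_ids
        if reviewed_labels[picture_id] == correct_labels[picture_id]
    )
    return restored / len(seeded_ids)


seeded_ids = ["img-1", "img-2", "img-3", "img-4"]
correct_labels = {
    "img-1": "peach blossom",
    "img-2": "pear blossom",
    "img-3": "peach blossom",
    "img-4": "plum blossom",
}
# img-3 and img-4 were left with, or changed to, a wrong label.
reviewed_labels = {
    "img-1": "peach blossom",
    "img-2": "pear blossom",
    "img-3": "plum blossom",
    "img-4": "pear blossom",
}
ratio = second_ratio(seeded_ids, reviewed_labels, correct_labels)
abnormal = ratio < 0.8  # 0.8 is an assumed threshold
print(ratio, abnormal)  # 0.5 True
```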
In a second aspect, the present invention further provides a sample labeling review device, the device including:
an acquisition module, configured to obtain a sample to be labeled;
a recognition module, configured to recognize at least one region of the sample to be labeled through a region recognition model, and to cut the at least one region to form at least one sample area picture, wherein the region recognition model is a neural network-based model;
a pre-labeling module, configured to recognize each sample area picture through a preset recognition model and perform pre-labeling processing, wherein the preset recognition model is a neural network-based model;
a review module, configured to send the pre-labeled sample area pictures to a review unit, so that the review unit reviews the pre-labeling results of the pre-labeled sample area pictures and modifies a pre-labeling result if the review finds it to be wrong;
a verification module, configured to send the sample area pictures reviewed by the review unit to a verification client, so that the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit.
Optionally, the pre-labeling module recognizing each sample area picture through a preset recognition model and performing pre-labeling processing includes:
recognizing each sample area picture through one preset recognition model and performing pre-labeling processing;
the review module sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeling results of the pre-labeled sample area pictures, includes:
sending the pre-labeled sample area pictures to one manual client, so that the manual client reviews the pre-labeling results of the sample area pictures;
the verification client performing verification processing on the label information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the label information of that picture after review by the manual client is accurate; if it is not, the sample area picture is re-identified.
Optionally, the pre-labeling module recognizing each sample area picture through a preset recognition model and performing pre-labeling processing includes:
recognizing each sample area picture separately through at least two preset recognition models and performing pre-labeling processing;
the review module sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeling results of the pre-labeled sample area pictures, includes:
sending the sample area pictures pre-labeled by the different preset recognition models to each of a plurality of different manual clients, so that each manual client reviews the pre-labeling result of each sample area picture;
the verification client performing verification processing on the label information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the label information of that picture is consistent after review by the different manual clients; if it is not, the sample area picture is re-identified.
Optionally, the pre-labeling module recognizing each sample area picture through a preset recognition model and performing pre-labeling processing includes:
recognizing each sample area picture through one preset recognition model and performing pre-labeling processing;
the review module sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeling results of the pre-labeled sample area pictures, includes:
sending the pre-labeled sample area pictures to one manual client, so that the manual client reviews the pre-labeling results of the sample area pictures; and
recognizing and labeling the pre-labeled sample area pictures through another preset recognition model;
wherein the one preset recognition model and the other preset recognition model are recognition models trained on different training sample sets;
the verification client performing verification processing on the label information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the label information of that picture after review by the manual client is consistent with the label produced by the other preset recognition model; if it is not, the sample area picture is re-identified.
Optionally, after performing the pre-labeling processing, the pre-labeling module is further configured to:
select a preset number of pictures from the sample area pictures recognized by the preset recognition model, and modify the pre-labeling result of each selected picture to a recognition result different from its original pre-labeling result;
sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeling results of the sample area pictures, includes:
sending both the sample area pictures whose pre-labeling results are unmodified and the sample area pictures whose pre-labeling results have been modified to the review unit, so that the review unit reviews the pre-labeling results of both, including:
for each sample area picture, the review unit judges whether the pre-labeling result is correct; if it is not, the review unit modifies the pre-labeling result of that sample area picture;
after performing verification processing on the label information of the sample area pictures reviewed by the review unit, the verification client is further configured to:
check, according to the review unit's review results for the preset number of pictures whose pre-labeling results were modified, whether the review unit is in an abnormal state.
Optionally, the verification client checking whether the review unit is in an abnormal state, according to the review unit's review results for the preset number of pictures whose pre-labeling results were modified, includes:
for each of the preset number of pictures whose pre-labeling results were modified, judging whether the review unit modified the pre-labeling result of that picture;
obtaining, as a first ratio, the proportion of the preset number of pictures whose pre-labeling results were modified by the review unit;
if the first ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
Optionally, the verification client in the verification module checking whether the review unit is in an abnormal state, according to the review unit's review results for the preset number of pictures whose pre-labeling results were modified, includes:
for each of the preset number of pictures whose pre-labeling results were modified, judging whether the review unit changed the pre-labeling result of that picture to the correct recognition result;
obtaining, as a second ratio, the proportion of the preset number of pictures whose pre-labeling results were changed by the review unit to the correct recognition result;
if the second ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
In a third aspect, the present invention further provides an electronic device including a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the sample labeling review method of the first aspect when executing the computer program stored in the memory.
In a fourth aspect, the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the sample labeling review method of the first aspect.
Compared with the prior art, the sample labeling review method, device, electronic equipment and computer-readable storage medium provided by the present invention have the following beneficial effects:
The present invention first recognizes at least one region of the sample to be labeled through the region recognition model and cuts it to form at least one sample area picture, then pre-labels each sample area picture through the preset recognition model, and then has the review unit review the pre-labeling results of the sample area pictures and modify any pre-labeling result found to be wrong. Finally, the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit. The invention thus labels samples through a pre-labeling, review and verification process, which helps ensure the accuracy of sample labeling and in turn improves the accuracy of model training.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed to describe them are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a sample labeling review method provided by an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a sample labeling review device provided by an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed description of the embodiments
The sample labeling review method, device, electronic equipment and computer-readable storage medium proposed by the present invention are described in further detail below with reference to the drawings and specific embodiments. The advantages and features of the present invention will become clearer from the claims and the following description. Note that the drawings are greatly simplified and not drawn to precise scale; they serve only to help explain the embodiments of the present invention conveniently and clearly. In addition, each block in the block diagrams and/or flowcharts herein, and any combination of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer program instructions. As is well known to those skilled in the art, implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.
To solve the problems of the prior art, embodiments of the present invention provide a sample labeling review method, device, electronic equipment and computer-readable storage medium.
Note that the sample labeling review method of the embodiments of the present invention can be applied to the sample labeling review device of the embodiments of the present invention, and that the device can be deployed on an electronic device. The electronic device may be a personal computer, a mobile terminal or the like, and the mobile terminal may be a hardware device running any of various operating systems, such as a mobile phone or a tablet computer.
FIG. 1 is a schematic flowchart of a sample labeling review method provided by an embodiment of the present invention. Referring to FIG. 1, a sample labeling review method may include the following steps:
Step S101: obtain a sample to be labeled.
Step S102: recognize at least one region of the sample to be labeled through a region recognition model, and cut the at least one region to form at least one sample area picture, wherein the region recognition model is a neural network-based model.
The sample to be labeled may be any of various types of picture samples, such as test papers, photos of animals and plants, scenic spots, vehicles, human faces, the human body or its constituent parts, objects, bills and so on. Taking a test paper as an example, the region recognition model recognizes the region of each question on the test paper and cuts out each region to form sample area pictures. Then, in step S103, a character recognition model recognizes the character content of each sample area picture and performs pre-labeling processing.
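The cutting part of step S102 can be sketched independently of the neural network. In the example below the region recognition model is not implemented; its output is stood in for by a hypothetical list of boxes `(top, left, bottom, right)` with exclusive end coordinates, and the image is a plain list of pixel rows. Both conventions are assumptions for illustration only.

```python
def crop_regions(image, boxes):
    """Cut each detected box out of the image.

    `image` is a list of pixel rows; each box is (top, left, bottom, right)
    with exclusive bottom/right coordinates (an assumed convention).
    """
    return [
        [row[left:right] for row in image[top:bottom]]
        for top, left, bottom, right in boxes
    ]


# A 4 x 6 toy "test paper"; pixel value encodes row and column.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
# Hypothetical output of the region recognition model: two question regions.
boxes = [(0, 0, 2, 6), (2, 0, 4, 3)]
sample_area_pictures = crop_regions(image, boxes)
print(sample_area_pictures[1])  # [[20, 21, 22], [30, 31, 32]]
```

Each cropped picture would then be passed to the preset recognition model in step S103.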
Step S103: recognize each sample area picture through a preset recognition model and perform pre-labeling processing, wherein the preset recognition model is a neural network-based model.
In this embodiment, the preset recognition model can be chosen according to the type of the sample area picture and the kind of label required. For example, if the sample area picture is a plant image and the plant species in the picture needs to be labeled, the preset recognition model may be a recognition model for identifying plant species. After the recognition model recognizes the plant image, the plant image is pre-labeled with the recognition result. For example, if the plant image shows a peach blossom and the recognition model recognizes it as a peach blossom, the plant image is pre-labeled as peach blossom.
Step S104: send the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeling results of the sample area pictures and modifies a pre-labeling result if the review finds it to be wrong.
After receiving a pre-labeled sample area picture, the review unit can recognize the picture itself and use its own recognition result to judge whether the pre-labeling result is correct; if the pre-labeling result is wrong, the review unit replaces it with its own recognition result. For example, if the pre-labeling result of a plant picture is peach blossom and the review unit recognizes the picture as pear blossom, the pre-labeling result is considered wrong and is changed to the review unit's own result, pear blossom.
Step S105: send the sample area pictures reviewed by the review unit to the verification client, so that the verification client performs verification processing on the label information of the sample area pictures reviewed by the review unit.
In this embodiment, after the review unit has finished, the verification client checks the sample area pictures reviewed by the review unit to confirm that the label information after review is correct, which further safeguards the labeling accuracy of the sample area pictures. A sample area picture that has passed through pre-labeling, review and verification can thus carry accurate label information.
In practice, step S103 may use one or more preset recognition models to recognize and pre-label the sample area pictures, and the review unit in step S104 may be one or more manual clients, or a combination of a manual client and a preset recognition model. Different preset recognition models are built from different training samples, so their recognition results and accuracy may differ.
In one implementation, the sample area pictures are pre-labeled by one preset recognition model and then reviewed by one manual client acting as the review unit.
Specifically, step S103, recognizing each sample area picture through a preset recognition model and performing pre-labeling processing, includes:
recognizing each sample area picture through one preset recognition model and performing pre-labeling processing.
Step S104, sending the pre-labeled sample area pictures to the review unit so that the review unit reviews the pre-labeling results of the sample area pictures, includes:
sending the pre-labeled sample area pictures to one manual client, so that the manual client reviews the pre-labeling results of the sample area pictures.
In step S105, the verification client performing verification processing on the label information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the label information of that picture after review by the manual client is accurate; if it is not, the sample area picture is re-identified.
In this implementation, the sample area picture is first pre-labeled by one preset recognition model, and one manual client then reviews the pre-labeling result of that model; if the manual client judges the pre-labeling result to be wrong, it modifies the result.
The verification client then verifies the manual client's review result. If the verification client judges the label information after the manual client's review to be accurate, the labeling review process for that sample area picture is complete; if not, the sample area picture goes through recognition, labeling and review again.
For example, a plant image is first recognized by a recognition model for identifying plant species; if the recognition result is A, pre-labeling yields the pre-labeling result A. A manual client then reviews the pre-labeling result; if the manual client's own recognition result is B, it changes the pre-labeling result from A to B. Finally, the verification client checks whether the label information after the manual client's review is accurate: the verification client recognizes the plant image itself, and if its result is B the current label information is accurate, whereas if its result is not B the current label information is inaccurate and the plant image must be re-identified. This labeling review process can improve the labeling accuracy of plant images.
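The A/B example above can be traced with a small sketch in which the recognition model, the manual client and the verification client are represented by plain callables. The function name `label_and_review`, the callable signatures and the returned dictionary are all hypothetical; real components would be a neural network, a human-facing client and a verification service.

```python
def label_and_review(picture, prelabel_model, manual_review, verifier):
    """Run pre-labeling (S103), review (S104) and verification (S105) for one picture."""
    prelabel = prelabel_model(picture)
    # The manual client keeps the pre-label or replaces it with its own result.
    reviewed = manual_review(picture, prelabel)
    # The verification client recognizes the picture itself and compares.
    accurate = verifier(picture) == reviewed
    return {
        "prelabel": prelabel,
        "label": reviewed,
        "needs_reidentification": not accurate,
    }


# Stand-ins for the three parties in the example: the model says A,
# the manual client says B, and the verification client also says B.
result = label_and_review(
    "plant.jpg",
    prelabel_model=lambda picture: "A",
    manual_review=lambda picture, prelabel: "B",
    verifier=lambda picture: "B",
)
print(result)
```

With a verifier that returned anything other than B, `needs_reidentification` would be true and the picture would go back through the process.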
In another implementation, the sample area pictures are pre-labeled by one preset recognition model and then reviewed by two manual clients acting as the review unit. Finally, the verification client checks whether the review results of the two manual clients are consistent; if they are not, recognition is performed again.
Specifically, in step S103, recognizing each sample area picture through a preset recognition model and performing pre-labeling processing includes:
recognizing each sample area picture through one preset recognition model and performing pre-labeling processing;
in step S104, sending the pre-labeled sample area pictures to the review unit so that the review unit reviews the pre-labeling results of the sample area pictures includes:
sending the pre-labeled sample area pictures to two manual clients simultaneously, so that both manual clients review the pre-labeling results of the sample area pictures.
In step S105, the verification client performing verification processing on the label information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the label information of that picture is consistent after review by the two manual clients; if it is not, the sample area picture is re-identified.
In this implementation, the sample area picture is first pre-labeled by one preset recognition model, and two manual clients then each review the pre-labeling result of that model; if a manual client judges the pre-labeling result to be wrong, it modifies the result.
The verification client then verifies the review results of the two manual clients. If the verification client judges the label information after the two manual clients' reviews to be consistent, the labeling review process for that sample area picture is complete; if not, the sample area picture goes through recognition, labeling and review again.
For example, a plant image is first recognized and pre-labeled by a recognition model for identifying plant species, yielding the pre-labeling result of the plant image. The pre-labeled plant image is then sent to two manual clients simultaneously; each manual client reviews the pre-labeling result, judges whether it is correct and, if it is not, replaces it with its own recognition result. Finally, the plant images reviewed by the two manual clients are sent to the verification client, which judges whether the label information after the two reviews is consistent. If it is consistent, the current label information of the plant image is accurate; if it is not, the current label information is inaccurate and the plant image must be re-identified. This labeling review process can improve the labeling accuracy of plant images.
In yet another implementation, pre-labeling is performed by multiple preset recognition models, and multiple manual clients serve as the review unit, to pre-label and review the sample area pictures.
Specifically, recognizing each sample area picture through the preset recognition model and performing pre-labeling processing in step S103 includes:
recognizing each sample area picture with at least two preset recognition models separately and performing pre-labeling processing;
Sending the pre-labeled sample area pictures to the review unit in step S104, so that the review unit reviews the pre-labeling results of the sample area pictures, includes:
sending the sample area pictures pre-labeled by the different preset recognition models to different manual clients at the same time, so that the manual clients review the pre-labeling results of the sample area pictures.
The verification processing performed in step S105 by the verification client on the labeling information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the labeling information of that sample area picture is consistent after review by the different manual clients, and if it is inconsistent, the sample area picture is recognized again.
In this implementation, the sample area picture is first pre-labeled by different preset recognition models (for example, two), so that one sample area picture yields multiple pre-labeled samples. These pre-labeled samples are then sent at the same time to different manual clients (for example, two), and each manual client reviews all of them. If a manual client judges a pre-labeling result to be wrong, it modifies that pre-labeling result.
Further, the verification client verifies the review results of each manual client. For each sample area picture, if the verification client determines that the labeling information reviewed by the different manual clients is consistent, the sample area picture completes the labeling review process; if it is inconsistent, the recognition, labeling and review process is performed again for that sample area picture.
For example, a plant image is first recognized and pre-labeled separately by two recognition models used to identify plant species, recognition model 1 and recognition model 2, giving two pre-labeled plant images: the pre-labeling result of one is the recognition result of recognition model 1, and that of the other is the recognition result of recognition model 2. The two pre-labeled plant images are then sent to two manual clients at the same time. Each manual client reviews both pre-labeled plant images, judging whether the recognition result of recognition model 1 is correct and whether that of recognition model 2 is correct, and replaces any incorrect result with its own recognition result. Finally, the plant images reviewed by each manual client are sent to the verification client, which checks whether the labeling information reviewed by the two manual clients is consistent. If it is consistent, the current labeling information of the plant image is accurate; if it is inconsistent, the current labeling information is inaccurate and the plant image needs to be recognized again. This labeling review process can improve the labeling accuracy of plant images.
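A minimal sketch of the multi-model, multi-reviewer variant follows. The name `review_with_multiple_models` and the stand-in callables are assumptions made for illustration; the consistency rule used here (every reviewed copy must carry the same label) is one plausible reading of the verification step, not a definitive one.

```python
from typing import Callable, Optional, Sequence

Labeler = Callable[[str], str]        # picture id -> pre-label (hypothetical type)
Reviewer = Callable[[str, str], str]  # (picture id, pre-label) -> reviewed label


def review_with_multiple_models(
    image_id: str, models: Sequence[Labeler], reviewers: Sequence[Reviewer]
) -> Optional[str]:
    """Return the single agreed label, or None if the reviewed labels disagree."""
    # One pre-labeled copy of the picture per recognition model.
    pre_labels = [model(image_id) for model in models]
    # Every manual client reviews every pre-labeled copy.
    reviewed = {
        reviewer(image_id, pre_label)
        for reviewer in reviewers
        for pre_label in pre_labels
    }
    # Verification: consistent only when exactly one distinct label remains.
    return reviewed.pop() if len(reviewed) == 1 else None


# Toy usage: the two models disagree on the same picture.
model_1 = lambda image_id: "peach blossom"
model_2 = lambda image_id: "pear blossom"
expert = lambda image_id, pre_label: "peach blossom"  # always corrects to its own result
rubber_stamp = lambda image_id, pre_label: pre_label  # accepts whatever it is shown

consistent = review_with_multiple_models("img-1", [model_1, model_2], [expert, expert])
inconsistent = review_with_multiple_models("img-1", [model_1, model_2], [expert, rubber_stamp])
```

The second call shows why disagreement between models is useful: a reviewer who merely accepts pre-labels ends up holding two different labels for one picture, which the verification step detects.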
In still another implementation, pre-labeling is performed by one preset recognition model, and one manual client together with another preset recognition model serve as the review unit, to pre-label and review the sample area pictures.
Specifically, recognizing each sample area picture through the preset recognition model and performing pre-labeling processing in step S103 includes:
recognizing each sample area picture with one preset recognition model and performing pre-labeling processing;
Sending the pre-labeled sample area pictures to the review unit in step S104, so that the review unit reviews the pre-labeling results of the sample area pictures, includes:
sending the pre-labeled sample area picture to one manual client, so that the manual client reviews the pre-labeling result of the sample area picture; and
recognizing and labeling the pre-labeled sample area picture with another preset recognition model; wherein the one preset recognition model and the other preset recognition model are recognition models trained on different training sample sets.
The verification processing performed in step S105 by the verification client on the labeling information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the labeling information of that sample area picture after review by the manual client is consistent with the labeling information after recognition by the other preset recognition model, and if it is inconsistent, the sample area picture is recognized again.
In this implementation, the sample area picture is first pre-labeled by one preset recognition model, and the pre-labeling result of that model is then reviewed by one manual client and another preset recognition model: the manual client modifies the pre-labeling result if it judges it to be wrong, while the other preset recognition model recognizes and labels the sample area picture itself.
Further, the verification client verifies the review results of the manual client and the other preset recognition model. For each sample area picture, if the verification client determines that the labeling information reviewed by the manual client is consistent with the recognition result of the other preset recognition model, the sample area picture completes the labeling review process; if it is inconsistent, the recognition, labeling and review process is performed again for that sample area picture.
For example, a plant image is first recognized and pre-labeled by a recognition model used to identify plant species, giving pre-labeling result A. The pre-labeled plant image is then sent to one manual client, which reviews the pre-labeling result; if the manual client's own recognition result for the plant image is B, it changes the pre-labeling result of the plant image to B. The plant image is also recognized and labeled by another recognition model, giving labeling result C. Finally, the verification client checks whether labeling information B, as reviewed by the manual client, is consistent with labeling result C of the other recognition model. If it is consistent, the current labeling information of the plant image is accurate; if it is inconsistent, the current labeling information is inaccurate and the plant image needs to be recognized again. This labeling review process can improve the labeling accuracy of plant images.
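The A/B/C comparison in the example above can be sketched as follows. The function `cross_check` and the stand-in callables are hypothetical names introduced for this illustration only.

```python
from typing import Callable, Optional

Labeler = Callable[[str], str]        # picture id -> label (hypothetical type)
Reviewer = Callable[[str, str], str]  # (picture id, pre-label) -> reviewed label


def cross_check(
    image_id: str, first_model: Labeler, reviewer: Reviewer, second_model: Labeler
) -> Optional[str]:
    """Compare the manually reviewed label B with the second model's label C."""
    label_a = first_model(image_id)        # pre-labeling result A
    label_b = reviewer(image_id, label_a)  # result B after manual review
    label_c = second_model(image_id)       # result C from the second model
    # Verification: B and C must be consistent, otherwise recognize again.
    return label_b if label_b == label_c else None


# Toy usage: the second model stands in for one trained on a different sample set.
first_model = lambda image_id: "pear blossom"
second_model = lambda image_id: "peach blossom"
corrects = lambda image_id, pre_label: "peach blossom"
accepts = lambda image_id, pre_label: pre_label

passed = cross_check("img-1", first_model, corrects, second_model)
failed = cross_check("img-1", first_model, accepts, second_model)
```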
This embodiment describes the pre-labeling, review and verification process for samples through the above three implementations, but the technical solution of the present invention is not limited thereto.
Further, during the above pre-labeling, review and verification process, it is also possible to check whether the review unit is in an abnormal state and, if an abnormality has occurred, to correct the review unit, so as to further ensure the labeling accuracy of the samples.
Specifically, after the pre-labeling processing in step S103, the method further includes:
selecting a preset number of pictures from the sample area pictures recognized by the preset recognition model, and modifying the pre-labeling results of the selected pictures into recognition results that differ from the original pre-labeling results.
Sending the pre-labeled sample area pictures to the review unit in step S104, so that the review unit reviews the pre-labeling results of the sample area pictures, includes:
sending both the sample area pictures whose pre-labeling results have not been modified and the sample area pictures whose pre-labeling results have been modified to the review unit, so that the review unit reviews the pre-labeling results of both kinds of sample area pictures;
After the verification client performs, in step S105, verification processing on the labeling information of the sample area pictures reviewed by the review unit, the method further includes:
the verification client checks whether the review unit is in an abnormal state according to the review unit's review results for the preset number of pictures whose pre-labeling results have been modified.
Modifying the pre-labeling result of a picture means replacing it with a recognition result that is inconsistent with, or different from, the original pre-labeling result. For example, if the pre-labeling result of a portrait picture is female, it is changed to a different recognition result, such as male. Likewise, if the pre-labeling result of a plant picture is peach blossom, it is changed to a different recognition result, such as pear blossom.
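The deliberate modification of pre-labeling results can be sketched as below. This is an illustrative assumption of how the step might be coded: the function `seed_wrong_labels`, the `label_set` argument (the list of possible classes, which must contain at least two labels) and the random selection are not specified by the text in this form.

```python
import random
from typing import Dict, List, Set, Tuple


def seed_wrong_labels(
    pre_labels: Dict[str, str], label_set: List[str], count: int, rng: random.Random
) -> Tuple[Dict[str, str], Set[str]]:
    """Return a copy of pre_labels with `count` entries changed, plus the changed ids."""
    chosen = rng.sample(sorted(pre_labels), count)  # pick pictures without replacement
    seeded = dict(pre_labels)
    for image_id in chosen:
        # Any label other than the original pre-labeling result qualifies.
        alternatives = [label for label in label_set if label != pre_labels[image_id]]
        seeded[image_id] = rng.choice(alternatives)
    return seeded, set(chosen)


# Toy usage with hypothetical picture ids and classes.
pre_labels = {"img-1": "peach blossom", "img-2": "pear blossom", "img-3": "peach blossom"}
classes = ["peach blossom", "pear blossom", "cherry blossom"]
seeded, changed_ids = seed_wrong_labels(pre_labels, classes, 2, random.Random(0))
```

The set of changed ids is kept on the verification side so that the review unit cannot tell seeded pictures from ordinary ones.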
In this embodiment, a preset number of pictures may be drawn at random from all sample area pictures, and the pre-labeling results of the drawn pictures modified into different recognition results. Because this embodiment infers how the review unit reviews all sample area pictures from statistics on how it reviews the erroneous pre-labeling results of the drawn pictures, and from that judges whether the review unit is in an abnormal state, the number of drawn pictures may be subject to the following requirement in order to ensure the accuracy of the subsequent statistics:
the preset number is greater than or equal to the minimum sample size N for sampling statistics;
where $N = Z^2 \times P \times (1 - P) / E^2$; Z is the statistic associated with the confidence level, the confidence level being equal to the recognition accuracy of the current preset recognition model; E is the preset sampling error; and P is the labeling accuracy of the sample area pictures after labeling by the current preset recognition model.
Z corresponds to the confidence level as follows: Z = 1.64 at a confidence level of 90%; Z = 1.96 at 95%; Z = 2 at 95.45%; Z = 2.58 at 99%; and Z = 3 at 99.73%. These values can be looked up in a statistical table. The confidence level in this embodiment ranges from 90% to 99.99%, that is, the recognition accuracy of the current preset recognition model is taken to fall within the range of P with a probability of 90% to 99.99%; a confidence level of 95% may be used in this embodiment. In this embodiment, the sampling error E may be set within ±5%, and P is a probability value that may be set to 90%, meaning that the labeling accuracy of the sample area pictures after labeling by the current preset recognition model needs to reach 90%. If the above formula yields a minimum sample size N equal to 100, the preset number may be set to any value greater than or equal to 100.
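As a worked illustration of the sample size formula, the sketch below evaluates it for the values the text suggests (95% confidence, P = 90%, E = 5%). The function name `min_sample_size` is an assumption, and rounding up to the next integer is a choice made here because a sample count must be a whole number.

```python
import math


def min_sample_size(z: float, p: float, e: float) -> int:
    """Minimum sample size N = Z^2 * P * (1 - P) / E^2, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


# Z = 1.96 (95% confidence), P = 0.90, E = 0.05
n = min_sample_size(1.96, 0.90, 0.05)  # 1.96^2 * 0.09 / 0.0025 = 138.2976, so N = 139
```

With these inputs at least 139 pictures would need their pre-labeling results modified; a tighter error E or a P closer to 50% increases the count.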
In this embodiment, the review unit reviews the pre-labeling results of both the sample area pictures whose pre-labeling results have not been modified and those whose pre-labeling results have been modified. The review includes: for each sample area picture, the review unit judges whether the pre-labeling result is correct, and if it is not, may modify the pre-labeling result of that sample area picture. For example, if the pre-labeling result of a picture is female, and the review unit determines on review that this pre-labeling result is wrong and, after its own recognition, that the recognition result of the picture should be male, it may change the pre-labeling result of the picture to the recognition result it has determined.
In practice, for a sample area picture whose pre-labeling result has been modified, the review unit may fail to notice that the pre-labeling result is wrong and therefore judge it to be correct. How the review unit reviews the sample area pictures whose pre-labeling results were deliberately modified reflects how it recognizes (labels) all sample area pictures. Checking its review of these pictures therefore makes it possible to infer the labeling accuracy or review accuracy of the review unit and to judge whether the review unit is abnormal.
In one implementation, the verification client checking whether the review unit is in an abnormal state according to the review unit's review results for the preset number of pictures whose pre-labeling results have been modified includes:
for each of the preset number of pictures whose pre-labeling results have been modified, judging whether the review unit modified the pre-labeling result of that picture;
obtaining, as a first ratio, the proportion of the preset number of pictures whose pre-labeling results were modified by the review unit;
if the first ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
It can be understood that, in general, if the review unit modifies the pre-labeling result of a picture whose pre-labeling result was deliberately modified, the review unit can be considered able to label that picture correctly. If the proportion of such pictures whose pre-labeling results were modified by the review unit is greater than or equal to the preset threshold, the review unit can be considered free of abnormality; otherwise, the review unit is abnormal.
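The first-ratio check can be sketched as follows, assuming the verification client holds the seeded (deliberately wrong) label and the label returned by the review unit for each seeded picture. The names `first_ratio` and `is_abnormal` are hypothetical.

```python
from typing import Dict


def first_ratio(seeded_labels: Dict[str, str], reviewed_labels: Dict[str, str]) -> float:
    """Share of seeded pictures whose wrong pre-label the review unit changed at all."""
    changed = sum(
        1
        for image_id, seeded in seeded_labels.items()
        if reviewed_labels[image_id] != seeded
    )
    return changed / len(seeded_labels)


def is_abnormal(ratio: float, threshold: float) -> bool:
    """The review unit is flagged when the ratio falls below the preset threshold."""
    return ratio < threshold


# Toy usage: the reviewer missed one of four seeded errors.
seeded = {"a": "male", "b": "pear blossom", "c": "male", "d": "pear blossom"}
reviewed = {"a": "female", "b": "peach blossom", "c": "male", "d": "peach blossom"}
ratio_1 = first_ratio(seeded, reviewed)
flagged = is_abnormal(ratio_1, 0.9)
```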
In another implementation, the verification client checking whether the review unit is in an abnormal state according to the review unit's review results for the preset number of pictures whose pre-labeling results have been modified includes:
for each of the preset number of pictures whose pre-labeling results have been modified, judging whether the review unit modified the pre-labeling result of that picture into the correct recognition result;
obtaining, as a second ratio, the proportion of the preset number of pictures that the review unit modified into the correct recognition result;
if the second ratio is less than the preset threshold, determining that the review unit is in an abnormal state.
In this implementation, if the proportion of the preset number of pictures that the review unit modified into the correct recognition result is greater than or equal to the preset threshold, the review unit can be considered free of abnormality; otherwise, the review unit is abnormal. Using this proportion to judge whether the review unit is abnormal, and to characterize its labeling accuracy, is more accurate than the previous implementation.
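A sketch of the second-ratio check follows. The text does not say where the correct recognition result comes from, so the `reference_labels` argument is an assumption made here (for instance, trusted labels held by the verification client); the function name `second_ratio` is likewise hypothetical.

```python
from typing import Dict


def second_ratio(
    seeded_ids: set, reviewed_labels: Dict[str, str], reference_labels: Dict[str, str]
) -> float:
    """Share of seeded pictures that the review unit changed to the correct label."""
    corrected = sum(
        1 for image_id in seeded_ids
        if reviewed_labels[image_id] == reference_labels[image_id]
    )
    return corrected / len(seeded_ids)


# Toy usage: every seeded label was changed, but only half were changed correctly.
seeded_ids = {"a", "b", "c", "d"}
reference = {"a": "female", "b": "peach blossom", "c": "female", "d": "peach blossom"}
reviewed = {"a": "female", "b": "peach blossom", "c": "child", "d": "cherry blossom"}
ratio_2 = second_ratio(seeded_ids, reviewed, reference)
```

In this toy case a count of pictures that were merely changed would be 4 out of 4, while the second ratio is 0.5, which illustrates why the text describes this measure as the more accurate of the two.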
When the first ratio or the second ratio is found to be less than the preset threshold, the review unit can be determined to be in an abnormal state. This also indicates that the labeling accuracy of the review unit falls short of the expected value, so the review unit can be corrected to bring its labeling accuracy up to the requirement.
The minimum value X of the preset threshold can be determined from the formula $1 - (1 - X)^2 = Q$, where Q is the preset target accuracy of sample labeling after labeling by the preset recognition model and review by the review unit. The preset threshold may be set to any value greater than or equal to X, which is not limited in this embodiment.
When Q = 99%, the above formula gives X = 90%, that is, the labeling accuracy of the review unit needs to reach at least 90%. In this embodiment, if the accuracy of sample labeling after labeling by the preset recognition model and review by the review unit is to reach at least 99%, the labeling accuracy of the review unit on the preset number of pictures whose pre-labeling results have been modified needs to reach at least 90%.
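Solving the threshold formula for X gives X = 1 - sqrt(1 - Q), which the sketch below evaluates. The function name `min_threshold` is an assumption. One possible reading of the formula, offered here only as an interpretation, is that a label ends up wrong only when two stages each err with probability 1 - X.

```python
import math


def min_threshold(q: float) -> float:
    """Smallest X satisfying 1 - (1 - X)^2 = Q, i.e. X = 1 - sqrt(1 - Q)."""
    return 1 - math.sqrt(1 - q)


x = min_threshold(0.99)  # approximately 0.90, matching the Q = 99% example
```

Because of floating-point rounding the result should be compared with a tolerance rather than tested for exact equality with 0.9.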
In summary, this embodiment first recognizes at least one region of the sample to be labeled through the region recognition model and cuts it to form at least one sample area picture, then pre-labels each sample area picture through the preset recognition model, then has the review unit review the pre-labeling results of the sample area pictures and modify any pre-labeling result found to be wrong, and finally has the verification client verify the labeling information of the sample area pictures reviewed by the review unit. Labeling samples through this pre-labeling, review and verification process ensures the accuracy of sample labeling and in turn improves the accuracy of model training. Further, the pre-labeling results of some sample area pictures are deliberately modified into wrong recognition results. Checking the review unit's review results for these deliberately mislabeled pictures makes it possible to infer the labeling accuracy of the review unit and to judge whether it is in an abnormal state. This allows the state of the review unit to be determined quickly, shortens the time spent on statistics and reduces cost.
Corresponding to the above embodiment of the sample labeling review method, an embodiment of the present invention further provides a sample labeling review device. FIG. 2 is a schematic structural diagram of a sample labeling review device provided by an embodiment of the present invention. Referring to FIG. 2, a sample labeling review device may include:
an acquisition module 201, configured to acquire a sample to be labeled;
a recognition module 202, configured to recognize at least one region of the sample to be labeled through a region recognition model and to cut the at least one region to form at least one sample area picture; wherein the region recognition model is a neural network-based model;
a pre-labeling module 203, configured to recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein the preset recognition model is a neural network-based model;
a review module 204, configured to send the pre-labeled sample area pictures to a review unit, so that the review unit reviews the pre-labeling results of the sample area pictures and modifies any pre-labeling result found to be wrong;
a verification module 205, configured to send the sample area pictures reviewed by the review unit to a verification client, so that the verification client performs verification processing on the labeling information of the sample area pictures reviewed by the review unit.
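The relationship between modules 201 to 205 can be sketched as a simple composition of callables. This is an architectural illustration only: the class name `SampleLabelingReviewDevice`, its field names and the stand-in lambdas are assumptions, and the real modules would wrap neural network models, manual clients and a verification client.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SampleLabelingReviewDevice:
    acquire: Callable[[], str]                 # acquisition module 201
    split_regions: Callable[[str], List[str]]  # recognition module 202
    pre_label: Callable[[str], str]            # pre-labeling module 203
    review: Callable[[str, str], str]          # review module 204
    verify: Callable[[str, str], bool]         # verification module 205

    def run(self) -> List[Tuple[str, str, bool]]:
        """Return (region, reviewed label, verification passed) for each region."""
        sample = self.acquire()
        results = []
        for region in self.split_regions(sample):
            label = self.review(region, self.pre_label(region))
            results.append((region, label, self.verify(region, label)))
        return results


# Toy usage with stand-in callables.
device = SampleLabelingReviewDevice(
    acquire=lambda: "sample-1",
    split_regions=lambda sample: [sample + "/region-0", sample + "/region-1"],
    pre_label=lambda region: "peach blossom",
    review=lambda region, pre: pre,
    verify=lambda region, label: label == "peach blossom",
)
results = device.run()
```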
In summary, this embodiment first recognizes at least one region of the sample to be labeled through the region recognition model and cuts it to form at least one sample area picture, then pre-labels each sample area picture through the preset recognition model, then has the review unit review the pre-labeling results of the sample area pictures and modify any pre-labeling result found to be wrong, and finally has the verification client verify the labeling information of the sample area pictures reviewed by the review unit. Labeling samples through this pre-labeling, review and verification process ensures the accuracy of sample labeling and in turn improves the accuracy of model training. Further, the pre-labeling results of some sample area pictures are deliberately modified into wrong recognition results. Checking the review unit's review results for these deliberately mislabeled pictures makes it possible to infer the labeling accuracy of the review unit and to judge whether it is in an abnormal state. This allows the state of the review unit to be determined quickly, shortens the time spent on statistics and reduces cost.
Optionally, the pre-labeling module 203 recognizing each sample area picture through the preset recognition model and performing pre-labeling processing includes:
recognizing each sample area picture with one preset recognition model and performing pre-labeling processing;
The review module 204 sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeling results of the sample area pictures, includes:
sending the pre-labeled sample area picture to one manual client, so that the manual client reviews the pre-labeling result of the sample area picture.
Optionally, the verification processing performed by the verification client in the verification module 205 on the labeling information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the labeling information of that sample area picture after review by the manual client is accurate, and if it is not accurate, the sample area picture is recognized again.
Optionally, the pre-labeling module 203 recognizing each sample area picture through the preset recognition model and performing pre-labeling processing includes:
recognizing each sample area picture with at least two preset recognition models separately and performing pre-labeling processing;
The review module 204 sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeling results of the sample area pictures, includes:
sending the sample area pictures pre-labeled by the different preset recognition models to different manual clients at the same time, so that the manual clients review the pre-labeling results of the sample area pictures.
Optionally, the verification processing performed by the verification client in the verification module 205 on the labeling information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the labeling information of that sample area picture is consistent after review by the different manual clients, and if it is inconsistent, the sample area picture is recognized again.
Optionally, the pre-labeling module 203 recognizing each sample area picture through the preset recognition model and performing pre-labeling processing includes:
recognizing each sample area picture with one preset recognition model and performing pre-labeling processing;
The review module 204 sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeling results of the sample area pictures, includes:
sending the pre-labeled sample area picture to one manual client, so that the manual client reviews the pre-labeling result of the sample area picture; and
recognizing and labeling the pre-labeled sample area picture with another preset recognition model;
wherein the one preset recognition model and the other preset recognition model are recognition models trained on different training sample sets.
Optionally, the verification processing performed by the verification client in the verification module 205 on the labeling information of the sample area pictures reviewed by the review unit includes:
for each sample area picture, the verification client checks whether the labeling information of that sample area picture after review by the manual client is consistent with the labeling information after recognition by the other preset recognition model, and if it is inconsistent, the sample area picture is recognized again.
Optionally, after performing the pre-labeling processing, the pre-labeling module 203 is further configured to:
Select a preset number of pictures from the sample area pictures recognized by the preset recognition model, and modify the pre-labeling result of each selected picture to a recognition result different from its original pre-labeling result;
The review module 204 sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeling results of the sample area pictures, includes:
Sending both the sample area pictures whose pre-labeling results are unmodified and the sample area pictures whose pre-labeling results have been modified to the review unit, so that the review unit reviews the pre-labeling results of both groups of pictures;
After the verification client in the verification module 205 performs verification processing on the labeling information of the sample area pictures reviewed by the review unit, the verification module 205 is further configured such that:
The verification client checks whether the review unit is in an abnormal state according to the review unit's review results for the preset number of pictures whose pre-labeling results were modified.
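The selection-and-modification step above can be sketched as follows. This is a hedged illustration only: the function `plant_modified_labels`, the `label_space` argument, and the use of a seeded random choice are assumptions for the example; the source states only that a preset number of pictures is selected and that each selected pre-labeling result is changed to a different recognition result.

```python
import random
from typing import Dict, List, Set, Tuple


def plant_modified_labels(
    pre_labels: Dict[str, str],
    label_space: List[str],
    count: int,
    seed: int = 0,
) -> Tuple[Dict[str, str], Set[str]]:
    """Change the pre-label of `count` pictures to a different label.

    Returns the labels to send to the review unit and the set of picture
    IDs that were deliberately modified. Assumes `count` does not exceed
    the number of pictures and `label_space` holds at least two labels.
    """
    rng = random.Random(seed)
    planted = set(rng.sample(sorted(pre_labels), count))
    sent = dict(pre_labels)
    for pid in sorted(planted):
        # Any label other than the original pre-label counts as "different".
        alternatives = [lbl for lbl in label_space if lbl != pre_labels[pid]]
        sent[pid] = rng.choice(alternatives)
    return sent, planted
```

The returned set of planted IDs stays with the verification side, so the review unit sees modified and unmodified pictures mixed together without knowing which is which.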
Optionally, the review unit in the review module 204 reviewing the pre-labeling results of the sample area pictures whose pre-labeling results are unmodified and of those whose pre-labeling results have been modified includes:
For each sample area picture, the review unit judges whether the pre-labeling result is correct; if not, it modifies the pre-labeling result of that sample area picture.
Optionally, the verification client in the verification module 205 checking whether the review unit is in an abnormal state, according to the review unit's review results for the preset number of pictures whose pre-labeling results were modified, includes:
For each of the preset number of pictures whose pre-labeling results were modified, determining whether the review unit modified the pre-labeling result of that picture;
Obtaining, as a first ratio, the proportion of the preset number of modified pictures whose pre-labeling results were modified by the review unit;
If the first ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
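The first-ratio check above reduces to a small computation. The sketch below assumes hypothetical names (`first_ratio`, `is_abnormal`) and a dictionary layout of picture ID to label; the threshold is left to the caller because the source does not fix one.

```python
from typing import Dict, Iterable


def first_ratio(
    sent_labels: Dict[str, str],
    reviewed_labels: Dict[str, str],
    planted_ids: Iterable[str],
) -> float:
    """Fraction of deliberately modified pictures that the review unit changed."""
    ids = list(planted_ids)
    if not ids:
        raise ValueError("at least one modified picture is required")
    changed = sum(1 for pid in ids if reviewed_labels[pid] != sent_labels[pid])
    return changed / len(ids)


def is_abnormal(ratio: float, threshold: float) -> bool:
    """The review unit is judged abnormal when the ratio is below the threshold."""
    return ratio < threshold
```

A review unit that leaves most deliberately wrong labels untouched yields a low first ratio, which is what this criterion flags.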
Optionally, the verification client in the verification module 205 checking whether the review unit is in an abnormal state, according to the review unit's review results for the preset number of pictures whose pre-labeling results were modified, includes:
For each of the preset number of pictures whose pre-labeling results were modified, determining whether the review unit modified the pre-labeling result of that picture to the correct recognition result;
Obtaining, as a second ratio, the proportion of the preset number of modified pictures that the review unit modified to the correct recognition result;
If the second ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
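The second-ratio variant differs from the first in that a change only counts when it restores the correct recognition result. The sketch below assumes a reference mapping `correct_labels` is available for the planted pictures; where that reference comes from is not stated in this passage, and the function name is hypothetical.

```python
from typing import Dict, Iterable


def second_ratio(
    reviewed_labels: Dict[str, str],
    correct_labels: Dict[str, str],
    planted_ids: Iterable[str],
) -> float:
    """Fraction of deliberately modified pictures that the review unit
    changed to the correct recognition result.

    `correct_labels` is an assumed reference mapping for the planted pictures.
    """
    ids = list(planted_ids)
    if not ids:
        raise ValueError("at least one modified picture is required")
    restored = sum(1 for pid in ids if reviewed_labels[pid] == correct_labels[pid])
    return restored / len(ids)
```

Compared with the first ratio, this criterion also penalizes a review unit that notices a wrong label but replaces it with another wrong one.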
An embodiment of the present invention further provides an electronic device. FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention. Referring to FIG. 3, the electronic device includes a processor 301, a communication interface 302, a memory 303, and a communication bus 304, where the processor 301, the communication interface 302, and the memory 303 communicate with one another through the communication bus 304,
The memory 303 is configured to store a computer program;
The processor 301 is configured to implement the following steps when executing the program stored in the memory 303:
Step 1: Obtain a sample to be labeled;
Step 2: Identify at least one region of the sample to be labeled through a region recognition model, and cut out the at least one region to form at least one sample area picture; wherein the region recognition model is a neural network-based model;
Step 3: Recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein the preset recognition model is a neural network-based model;
Step 4: Send the pre-labeled sample area pictures to a review unit, so that the review unit reviews the pre-labeling results of the sample area pictures and modifies any pre-labeling result that the review finds to be wrong;
Step 5: Send the sample area pictures reviewed by the review unit to a verification client, so that the verification client performs verification processing on the labeling information of the sample area pictures reviewed by the review unit.
For the specific implementation of each step of the method and the related explanations, refer to the method embodiment shown in FIG. 1 above; they are not repeated here.
In addition, the other implementations of the sample labeling review method that the processor 301 realizes by executing the program stored in the memory 303 are the same as those described in the foregoing method embodiment, and are likewise not repeated here.
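The five steps executed by the processor can be sketched as a simple pipeline. This is a structural illustration under stated assumptions, not the patented implementation: the four callables (`detect_regions`, `pre_label`, `review`, `verify`) are hypothetical stand-ins for the region recognition model, the preset recognition model, the review unit, and the verification client, and pictures are represented as plain strings.

```python
from typing import Callable, List, Tuple

Picture = str  # stand-in type; a real system would use image data


def run_pipeline(
    sample: str,
    detect_regions: Callable[[str], List[Picture]],
    pre_label: Callable[[Picture], str],
    review: Callable[[Picture, str], str],
    verify: Callable[[Picture, str], bool],
) -> List[Tuple[Picture, str, bool]]:
    """Run steps 2 to 5 on one sample obtained in step 1."""
    results = []
    for picture in detect_regions(sample):      # step 2: identify and cut regions
        pre = pre_label(picture)                # step 3: pre-label each picture
        reviewed = review(picture, pre)         # step 4: review, correct if wrong
        passed = verify(picture, reviewed)      # step 5: verify the reviewed label
        results.append((picture, reviewed, passed))
    return results
```

Passing the stages in as callables keeps the sketch independent of any particular model or client; pictures that fail verification would be recognized again, as the optional embodiments describe.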
In summary, this embodiment first identifies at least one region of the sample to be labeled through the region recognition model and cuts it out to form at least one sample area picture, then pre-labels each sample area picture through the preset recognition model, then has the review unit review the pre-labeling results of the sample area pictures and modify any pre-labeling result found to be wrong, and finally has the verification client perform verification processing on the labeling information of the sample area pictures reviewed by the review unit. This embodiment thus labels samples through a pre-labeling, review, and verification flow, which helps ensure the accuracy of sample labeling and in turn improves the accuracy of model training. Further, the pre-labeling results of some of the sample area pictures are deliberately modified to wrong recognition results; by checking the review unit's review results for these deliberately mislabeled pictures, the labeling accuracy of the review unit can be inferred and used to judge whether the review unit is in an abnormal state. This allows an abnormal review unit to be identified quickly, shortens the time spent on statistics, and reduces cost.
The electronic device may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The communication bus of the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices. The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk storage. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the electronic device and connects the various parts of the whole electronic device through various interfaces and lines.
An embodiment of the present invention further provides a computer-readable storage medium in which a computer program is stored; when executed by a processor, the computer program implements the following steps:
Step 1: Obtain a sample to be labeled;
Step 2: Identify at least one region of the sample to be labeled through a region recognition model, and cut out the at least one region to form at least one sample area picture; wherein the region recognition model is a neural network-based model;
Step 3: Recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein the preset recognition model is a neural network-based model;
Step 4: Send the pre-labeled sample area pictures to a review unit, so that the review unit reviews the pre-labeling results of the sample area pictures and modifies any pre-labeling result that the review finds to be wrong;
Step 5: Send the sample area pictures reviewed by the review unit to a verification client, so that the verification client performs verification processing on the labeling information of the sample area pictures reviewed by the review unit.
It should be noted that the other embodiments of the sample labeling review method implemented when the computer program is executed by the processor are the same as the embodiments of the sample labeling review method described in the foregoing method section, and are not repeated here.
In summary, this embodiment first identifies at least one region of the sample to be labeled through the region recognition model and cuts it out to form at least one sample area picture, then pre-labels each sample area picture through the preset recognition model, then has the review unit review the pre-labeling results of the sample area pictures and modify any pre-labeling result found to be wrong, and finally has the verification client perform verification processing on the labeling information of the sample area pictures reviewed by the review unit. This embodiment thus labels samples through a pre-labeling, review, and verification flow, which helps ensure the accuracy of sample labeling and in turn improves the accuracy of model training. Further, the pre-labeling results of some of the sample area pictures are deliberately modified to wrong recognition results; by checking the review unit's review results for these deliberately mislabeled pictures, the labeling accuracy of the review unit can be inferred and used to judge whether the review unit is in an abnormal state. This allows an abnormal review unit to be identified quickly, shortens the time spent on statistics, and reduces cost.
The computer-readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove with instructions stored thereon, and any suitable combination of the above. The computer program described here can be downloaded from the computer-readable storage medium to each computing/processing device, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer program from the network and forwards it for storage in a computer-readable storage medium in that computing/processing device. The computer program used to perform the operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer program may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by using state information of the computer program, and the electronic circuit can execute the computer-readable program instructions to implement various aspects of the present invention.
Aspects of the present invention are described here with reference to flowcharts and/or block diagrams of methods, systems, and computer program products according to embodiments of the present invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by a computer program. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the programs, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer programs may also be stored in a readable storage medium and cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, so that the readable storage medium storing the computer program comprises an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. The computer program may also be loaded onto a computer, another programmable data processing apparatus, or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the computer program executed on the computer, other programmable data processing apparatus, or other device implements the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
It should be noted that the embodiments in this specification are described in a related manner; for parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus, electronic device, and computer-readable storage medium embodiments are substantially similar to the method embodiment, they are described relatively briefly; for related details, refer to the corresponding parts of the method embodiment.
In this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the present invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. Expressions such as "at least one of", when following a list of elements, modify the entire list of elements and do not modify the individual elements of the list. As used herein, the terms "substantially", "about", and similar terms are used as terms of approximation and not as terms of degree, and are intended to account for the inherent deviations in measured or calculated values that would be recognized by those of ordinary skill in the art. In addition, the use of "may" when describing embodiments of the present invention refers to "one or more embodiments of the present invention". As used herein, the terms "use", "using", and "used" may be considered synonymous with the terms "utilize", "utilizing", and "utilized", respectively. Likewise, the term "exemplary" is intended to refer to an example or illustration.
The foregoing is only a description of preferred embodiments of the present invention and does not limit the scope of the present invention in any way. Any changes or modifications made by persons of ordinary skill in the art based on the foregoing disclosure fall within the protection scope of the claims.

Claims (20)

  1. A sample labeling review method, characterized in that the method comprises:
    Step 1: obtaining a sample to be labeled;
    Step 2: identifying at least one region of the sample to be labeled through a region recognition model, and cutting out the at least one region to form at least one sample area picture; wherein the region recognition model is a neural network-based model;
    Step 3: recognizing each sample area picture through a preset recognition model and performing pre-labeling processing; wherein the preset recognition model is a neural network-based model;
    Step 4: sending the pre-labeled sample area pictures to a review unit, so that the review unit reviews the pre-labeling results of the sample area pictures and modifies any pre-labeling result that the review finds to be wrong;
    Step 5: sending the sample area pictures reviewed by the review unit to a verification client, so that the verification client performs verification processing on the labeling information of the sample area pictures reviewed by the review unit.
  2. The sample labeling review method according to claim 1, characterized in that recognizing each sample area picture through a preset recognition model and performing pre-labeling processing in step 3 comprises:
    recognizing each sample area picture through one preset recognition model and performing pre-labeling processing;
    and sending the pre-labeled sample area pictures to the review unit in step 4, so that the review unit reviews the pre-labeling results of the sample area pictures, comprises:
    sending the pre-labeled sample area pictures to one manual client, so that the manual client reviews the pre-labeling results of the sample area pictures.
  3. The sample labeling review method according to claim 2, characterized in that the verification client performing verification processing on the labeling information of the sample area pictures reviewed by the review unit in step 5 comprises:
    for each sample area picture, the verification client checking whether the labeling information of the sample area picture after review by the manual client is accurate, and if it is not accurate, recognizing the sample area picture again.
  4. The sample labeling review method according to claim 1, characterized in that recognizing each sample area picture through a preset recognition model and performing pre-labeling processing in step 3 comprises:
    recognizing each sample area picture separately through at least two preset recognition models and performing pre-labeling processing;
    and sending the pre-labeled sample area pictures to the review unit in step 4, so that the review unit reviews the pre-labeling results of the sample area pictures, comprises:
    sending the sample area pictures pre-labeled by the different preset recognition models to each of a plurality of different manual clients, so that each manual client reviews the pre-labeling results of each sample area picture.
  5. The sample labeling review method according to claim 4, characterized in that the verification client performing verification processing on the labeling information of the sample area pictures reviewed by the review unit in step 5 comprises:
    for each sample area picture, the verification client checking whether the labeling information of the sample area picture is consistent across the different manual clients that reviewed it, and if it is inconsistent, recognizing the sample area picture again.
  6. The sample labeling review method according to claim 1, characterized in that recognizing each sample area picture through a preset recognition model and performing pre-labeling processing in step 3 comprises:
    recognizing each sample area picture through one preset recognition model and performing pre-labeling processing;
    and sending the pre-labeled sample area pictures to the review unit in step 4, so that the review unit reviews the pre-labeling results of the sample area pictures, comprises:
    sending the pre-labeled sample area pictures to one manual client, so that the manual client reviews the pre-labeling results of the sample area pictures; and
    recognizing and labeling the pre-labeled sample area pictures through another preset recognition model;
    wherein the one preset recognition model and the other preset recognition model are recognition models trained on different training sample sets.
  7. The sample labeling review method according to claim 6, characterized in that the verification client performing verification processing on the labeling information of the sample area pictures reviewed by the review unit in step 5 comprises:
    for each sample area picture, the verification client checking whether the labeling information produced by the manual client's review is consistent with the labeling information produced by the other preset recognition model, and if it is inconsistent, recognizing the sample area picture again.
  8. The sample labeling review method according to claim 1, wherein after the pre-labeling processing in step 3, the method further comprises:
    selecting a preset number of pictures from the sample area pictures recognized by the preset recognition model, and modifying the pre-labeled result of each selected picture to a recognition result different from its original pre-labeled result;
    sending, in step 4, the pre-labeled sample area pictures to the review unit so that the review unit reviews the pre-labeled results of the sample area pictures comprises:
    sending both the sample area pictures with unmodified pre-labeled results and the sample area pictures with modified pre-labeled results to the review unit, so that the review unit reviews the pre-labeled results of both;
    after the verification client performs, in step 5, verification processing on the labeling information of the sample area pictures reviewed by the review unit, the method further comprises:
    the verification client determines, according to the review unit's review results for the preset number of pictures with modified pre-labeled results, whether the review unit is in an abnormal state.
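The first step of claim 8, deliberately modifying a preset number of pre-labeled results before review, might look like the sketch below. The function name `seed_modified_labels`, the dictionary representation of labels, and the random choice of pictures and replacement labels are assumptions; the claim only requires that the new result differ from the original one.

```python
import random
from typing import Dict, List, Set, Tuple


def seed_modified_labels(
    pre_labels: Dict[str, str],
    label_set: List[str],
    count: int,
    rng: random.Random,
) -> Tuple[Dict[str, str], Set[str]]:
    """Return the labels to send for review and the ids that were modified."""
    # Pick `count` pictures; sorting keeps the choice reproducible for a given seed.
    chosen = rng.sample(sorted(pre_labels), count)
    sent = dict(pre_labels)
    for pid in chosen:
        # Any label other than the original pre-labeled result is acceptable.
        alternatives = [label for label in label_set if label != pre_labels[pid]]
        sent[pid] = rng.choice(alternatives)
    return sent, set(chosen)


pre = {"p1": "rose", "p2": "tulip", "p3": "lily", "p4": "rose"}
sent, seeded = seed_modified_labels(pre, ["rose", "tulip", "lily"], 2, random.Random(0))
```

Both the modified and the unmodified pictures in `sent` would then go to the review unit, while `seeded` stays with the verification client for the later abnormal-state check.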
  9. The sample labeling review method according to claim 8, wherein the review unit reviewing the pre-labeled results of the sample area pictures with unmodified pre-labeled results and the sample area pictures with modified pre-labeled results comprises:
    for each sample area picture, the review unit judges whether the pre-labeled result is correct; if not, the review unit modifies the pre-labeled result of the sample area picture.
  10. The sample labeling review method according to claim 9, wherein the verification client checking, according to the review unit's review results for the preset number of pictures with modified pre-labeled results, whether the review unit is in an abnormal state comprises:
    for each of the preset number of pictures with modified pre-labeled results, determining whether the review unit modified the pre-labeled result of the picture;
    obtaining, as a first ratio, the proportion of the preset number of pictures with modified pre-labeled results whose pre-labeled results were modified by the review unit;
    if the first ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
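The first-ratio test of claim 10 can be illustrated with a small sketch. The function names, the dictionary layout, and the threshold value of 0.8 are hypothetical; the claim specifies only that the ratio is compared with a preset threshold.

```python
from typing import Dict, Set


def first_ratio(
    sent_labels: Dict[str, str],
    reviewed_labels: Dict[str, str],
    seeded_ids: Set[str],
) -> float:
    """Share of deliberately modified pictures whose label the reviewer changed."""
    changed = sum(
        1 for pid in seeded_ids if reviewed_labels[pid] != sent_labels[pid]
    )
    return changed / len(seeded_ids)


def is_abnormal(ratio: float, threshold: float) -> bool:
    # A reviewer who leaves most planted errors untouched is flagged.
    return ratio < threshold


sent_labels = {"p1": "lily", "p2": "rose", "p3": "tulip", "p4": "lily"}
reviewed_labels = {"p1": "rose", "p2": "rose", "p3": "tulip", "p4": "lily"}
ratio = first_ratio(sent_labels, reviewed_labels, {"p1", "p2", "p3", "p4"})
print(ratio, is_abnormal(ratio, threshold=0.8))
```

Here the reviewer changed only one of four planted errors, so the ratio is 0.25 and the review unit would be judged abnormal under the assumed threshold.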
  11. The sample labeling review method according to claim 9, wherein the verification client checking, according to the review unit's review results for the preset number of pictures with modified pre-labeled results, whether the review unit is in an abnormal state comprises:
    for each of the preset number of pictures with modified pre-labeled results, determining whether the review unit modified the pre-labeled result of the picture to the correct recognition result;
    obtaining, as a second ratio, the proportion of the preset number of pictures with modified pre-labeled results whose pre-labeled results were modified by the review unit to the correct recognition result;
    if the second ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
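Claim 11 is stricter than claim 10: a change counts only if it lands on the correct recognition result. The sketch below assumes a reference label is available for each planted picture (the claims do not say where the correct result comes from), and the names `second_ratio` and `correct_labels` are hypothetical.

```python
from typing import Dict, Set


def second_ratio(
    reviewed_labels: Dict[str, str],
    correct_labels: Dict[str, str],
    seeded_ids: Set[str],
) -> float:
    """Share of deliberately modified pictures the reviewer corrected properly."""
    fixed = sum(
        1 for pid in seeded_ids if reviewed_labels[pid] == correct_labels[pid]
    )
    return fixed / len(seeded_ids)


# Assumed reference labels for the two planted pictures.
correct_labels = {"p1": "rose", "p2": "tulip"}
# The reviewer changed both labels, but only p1 ended up correct.
reviewed_labels = {"p1": "rose", "p2": "daisy"}
ratio2 = second_ratio(reviewed_labels, correct_labels, {"p1", "p2"})
print(ratio2)
```

A reviewer who changes every planted label but often picks a wrong replacement would pass the first-ratio test yet score poorly here.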
  12. A sample labeling review device, characterized in that the device comprises:
    an acquisition module, configured to acquire a sample to be labeled;
    a recognition module, configured to recognize at least one region of the sample to be labeled through a region recognition model, and to cut the at least one region to form at least one sample area picture; wherein the region recognition model is a neural network-based model;
    a pre-labeling module, configured to recognize each sample area picture through a preset recognition model and perform pre-labeling processing; wherein the preset recognition model is a neural network-based model;
    a review module, configured to send the pre-labeled sample area pictures to a review unit, so that the review unit reviews the pre-labeled results of the pre-labeled sample area pictures and, if a pre-labeled result is found to be wrong, modifies the pre-labeled result;
    a verification module, configured to send the sample area pictures reviewed by the review unit to a verification client, so that the verification client performs verification processing on the labeling information of the sample area pictures reviewed by the review unit.
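The five modules of claim 12 form a linear pipeline, which the sketch below wires together with plain callables. Everything here is a toy stand-in: the sample is a string instead of an image, and `find_regions`, `crop`, `pre_label`, `review`, and `verify` are hypothetical placeholders for the neural network models, the review unit, and the verification client.

```python
from typing import Callable, List, Tuple

Region = Tuple[int, int]


def run_pipeline(
    sample: str,
    find_regions: Callable[[str], List[Region]],
    crop: Callable[[str, Region], str],
    pre_label: Callable[[str], str],
    review: Callable[[str, str], str],
    verify: Callable[[str, str], bool],
) -> List[Tuple[str, str, bool]]:
    # Recognition module: locate regions and cut them into sample area pictures.
    pictures = [crop(sample, region) for region in find_regions(sample)]
    # Pre-labeling module: attach a pre-labeled result to every picture.
    pre_labeled = [(pic, pre_label(pic)) for pic in pictures]
    # Review module: the review unit keeps or corrects each pre-labeled result.
    reviewed = [(pic, review(pic, label)) for pic, label in pre_labeled]
    # Verification module: the verification client checks the final labels.
    return [(pic, label, verify(pic, label)) for pic, label in reviewed]


corrections = {"dgo": "dog"}  # toy review unit: fixes one known mistake
result = run_pipeline(
    "cat|dgo",
    find_regions=lambda s: [(0, 3), (4, 7)],
    crop=lambda s, r: s[r[0]:r[1]],
    pre_label=lambda pic: pic,
    review=lambda pic, label: corrections.get(label, label),
    verify=lambda pic, label: label in {"cat", "dog"},
)
print(result)
```

Passing the stages in as callables keeps the sketch independent of any particular model or client implementation.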
  13. The sample labeling review device according to claim 12, wherein the pre-labeling module recognizing each sample area picture through a preset recognition model and performing pre-labeling processing comprises:
    recognizing each sample area picture through one preset recognition model and performing pre-labeling processing;
    the review module sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeled results of the pre-labeled sample area pictures, comprises:
    sending the pre-labeled sample area pictures to one manual client, so that the manual client reviews the pre-labeled results of the sample area pictures;
    the verification client performing verification processing on the labeling information of the sample area pictures reviewed by the review unit comprises:
    for each sample area picture, the verification client checks whether the labeling information of the sample area picture after review by the manual client is accurate, and if it is not accurate, the sample area picture is recognized again.
  14. The sample labeling review device according to claim 12, wherein the pre-labeling module recognizing each sample area picture through a preset recognition model and performing pre-labeling processing comprises:
    recognizing each sample area picture separately through at least two preset recognition models and performing pre-labeling processing;
    the review module sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeled results of the pre-labeled sample area pictures, comprises:
    sending the sample area pictures pre-labeled by the different preset recognition models to each of a plurality of different manual clients, so that each manual client reviews the pre-labeled result of each sample area picture;
    the verification client performing verification processing on the labeling information of the sample area pictures reviewed by the review unit comprises:
    for each sample area picture, the verification client checks whether the labeling information of the sample area picture after review by the different manual clients is consistent, and if it is inconsistent, the sample area picture is recognized again.
  15. The sample labeling review device according to claim 12, wherein the pre-labeling module recognizing each sample area picture through a preset recognition model and performing pre-labeling processing comprises:
    recognizing each sample area picture through one preset recognition model and performing pre-labeling processing;
    the review module sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeled results of the pre-labeled sample area pictures, comprises:
    sending the pre-labeled sample area pictures to one manual client, so that the manual client reviews the pre-labeled results of the sample area pictures; and
    recognizing and labeling the pre-labeled sample area pictures through another preset recognition model;
    wherein the one preset recognition model and the other preset recognition model are recognition models trained on different training sample sets;
    the verification client performing verification processing on the labeling information of the sample area pictures reviewed by the review unit comprises:
    for each sample area picture, the verification client checks whether the labeling information of the sample area picture after review by the manual client is consistent with the labeling information after recognition by the other preset recognition model, and if they are inconsistent, the sample area picture is recognized again.
  16. The sample labeling review device according to claim 12, wherein after performing the pre-labeling processing, the pre-labeling module is further configured to:
    select a preset number of pictures from the sample area pictures recognized by the preset recognition model, and modify the pre-labeled result of each selected picture to a recognition result different from its original pre-labeled result;
    sending the pre-labeled sample area pictures to the review unit, so that the review unit reviews the pre-labeled results of the sample area pictures, comprises:
    sending both the sample area pictures with unmodified pre-labeled results and the sample area pictures with modified pre-labeled results to the review unit, so that the review unit reviews the pre-labeled results of both, comprising:
    for each sample area picture, the review unit judges whether the pre-labeled result is correct; if not, the review unit modifies the pre-labeled result of the sample area picture;
    after performing verification processing on the labeling information of the sample area pictures reviewed by the review unit, the verification client is further configured to:
    check, according to the review unit's review results for the preset number of pictures with modified pre-labeled results, whether the review unit is in an abnormal state.
  17. The sample labeling review device according to claim 16, wherein the verification client checking, according to the review unit's review results for the preset number of pictures with modified pre-labeled results, whether the review unit is in an abnormal state comprises:
    for each of the preset number of pictures with modified pre-labeled results, determining whether the review unit modified the pre-labeled result of the picture;
    obtaining, as a first ratio, the proportion of the preset number of pictures with modified pre-labeled results whose pre-labeled results were modified by the review unit;
    if the first ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
  18. The sample labeling review device according to claim 16, wherein the verification client in the verification module checking, according to the review unit's review results for the preset number of pictures with modified pre-labeled results, whether the review unit is in an abnormal state comprises:
    for each of the preset number of pictures with modified pre-labeled results, determining whether the review unit modified the pre-labeled result of the picture to the correct recognition result;
    obtaining, as a second ratio, the proportion of the preset number of pictures with modified pre-labeled results whose pre-labeled results were modified by the review unit to the correct recognition result;
    if the second ratio is less than a preset threshold, determining that the review unit is in an abnormal state.
  19. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
    the memory is configured to store a computer program;
    the processor is configured to implement the method according to any one of claims 1-11 when executing the computer program stored in the memory.
  20. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the method according to any one of claims 1-11 is implemented.
PCT/CN2020/096647 2019-06-20 2020-06-17 Sample labeling checking method and device WO2020253742A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910538182.0 2019-06-20
CN201910538182.0A CN110245716B (en) 2019-06-20 2019-06-20 Sample labeling auditing method and device

Publications (1)

Publication Number Publication Date
WO2020253742A1 true WO2020253742A1 (en) 2020-12-24

Family

ID=67888381

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/096647 WO2020253742A1 (en) 2019-06-20 2020-06-17 Sample labeling checking method and device

Country Status (2)

Country Link
CN (1) CN110245716B (en)
WO (1) WO2020253742A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705691A (en) * 2021-08-30 2021-11-26 平安国际智慧城市科技股份有限公司 Image annotation checking method, device, equipment and medium based on artificial intelligence

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110245716B (en) * 2019-06-20 2021-05-14 杭州睿琪软件有限公司 Sample labeling auditing method and device
CN111325260B (en) * 2020-02-14 2023-10-27 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and computer readable medium
CN111428749A (en) * 2020-02-21 2020-07-17 平安科技(深圳)有限公司 Image annotation task pre-verification method, device, equipment and storage medium
CN111339353B (en) * 2020-02-21 2023-06-09 北京容联易通信息技术有限公司 Image processing self-optimization method and device, electronic equipment and storage medium
CN111833296B (en) * 2020-05-25 2023-03-10 中国人民解放军陆军军医大学第二附属医院 Automatic detection and verification system and method for bone marrow cell morphology
CN111832449A (en) * 2020-06-30 2020-10-27 万翼科技有限公司 Engineering drawing display method and related device
CN111881657A (en) * 2020-08-04 2020-11-03 厦门渊亭信息科技有限公司 Intelligent marking method, terminal equipment and storage medium
CN112070224B (en) * 2020-08-26 2024-02-23 成都品果科技有限公司 Revision system and method of samples for neural network training
CN112508092A (en) * 2020-12-03 2021-03-16 上海云从企业发展有限公司 Sample screening method, system, equipment and medium
CN112836732B (en) * 2021-01-25 2024-04-19 深圳市声扬科技有限公司 Verification method and device for data annotation, electronic equipment and storage medium
CN113111716B (en) * 2021-03-15 2023-06-23 中国科学院计算机网络信息中心 Remote sensing image semiautomatic labeling method and device based on deep learning
CN112906349A (en) * 2021-03-30 2021-06-04 苏州大学 Data annotation method, system, equipment and readable storage medium
CN113221999B (en) * 2021-05-06 2024-01-12 北京百度网讯科技有限公司 Picture annotation accuracy obtaining method and device and electronic equipment
CN113487706A (en) * 2021-07-26 2021-10-08 上海中通吉网络技术有限公司 Data annotation method and platform applied to intelligent logistics field
CN114120052B (en) * 2021-12-02 2023-06-27 成都智元汇信息技术股份有限公司 Self-learning multi-scheduling cloud labeling platform, working method, electronic equipment and medium
CN115223166A (en) * 2022-09-20 2022-10-21 整数智能信息技术(杭州)有限责任公司 Picture pre-labeling method, picture labeling method and device, and electronic equipment
CN116934246A (en) * 2023-06-20 2023-10-24 联城科技(河北)股份有限公司 Method, device, equipment and readable storage medium for auditing declaration project data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106649610A (en) * 2016-11-29 2017-05-10 北京智能管家科技有限公司 Image labeling method and apparatus
WO2018119684A1 (en) * 2016-12-27 2018-07-05 深圳前海达闼云端智能科技有限公司 Image recognition system and image recognition method
CN109635838A (en) * 2018-11-12 2019-04-16 平安科技(深圳)有限公司 Face samples pictures mask method, device, computer equipment and storage medium
CN110245716A (en) * 2019-06-20 2019-09-17 杭州睿琪软件有限公司 Sample labeling auditing method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10558029B2 (en) * 2016-10-27 2020-02-11 Scopio Labs Ltd. System for image reconstruction using a known pattern
CN107451615A (en) * 2017-08-01 2017-12-08 广东工业大学 Thyroid papillary carcinoma Ultrasound Image Recognition Method and system based on Faster RCNN
CN109697537A (en) * 2017-10-20 2019-04-30 北京京东尚科信息技术有限公司 The method and apparatus of data audit
CN109214382A (en) * 2018-07-16 2019-01-15 顺丰科技有限公司 A kind of billing information recognizer, equipment and storage medium based on CRNN
CN109492549A (en) * 2018-10-24 2019-03-19 杭州睿琪软件有限公司 A kind of processing of training sample set, model training method and system
CN109635110A (en) * 2018-11-30 2019-04-16 北京百度网讯科技有限公司 Data processing method, device, equipment and computer readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705691A (en) * 2021-08-30 2021-11-26 平安国际智慧城市科技股份有限公司 Image annotation checking method, device, equipment and medium based on artificial intelligence
CN113705691B (en) * 2021-08-30 2024-04-09 深圳平安智慧医健科技有限公司 Image annotation verification method, device, equipment and medium based on artificial intelligence

Also Published As

Publication number Publication date
CN110245716B (en) 2021-05-14
CN110245716A (en) 2019-09-17

Similar Documents

Publication Publication Date Title
WO2020253742A1 (en) Sample labeling checking method and device
WO2021003819A1 (en) Man-machine dialog method and man-machine dialog apparatus based on knowledge graph
US10783331B2 (en) Method and apparatus for building text classification model, and text classification method and apparatus
US20190095758A1 (en) Method and system for obtaining picture annotation data
WO2021043085A1 (en) Method and apparatus for recognizing named entity, computer device, and storage medium
WO2020119075A1 (en) General text information extraction method and apparatus, computer device and storage medium
WO2020253740A1 (en) Manual client status check method and device for sample verification
WO2018166114A1 (en) Picture identification method and system, electronic device, and medium
WO2020087713A1 (en) Video quality inspection method and apparatus, computer device and storage medium
WO2020253741A1 (en) Method and device for checking status of manual client by using error samples
WO2018157840A1 (en) Speech recognition testing method, testing terminal, computer device, and storage medium
CN109473093B (en) Speech recognition method, device, computer equipment and storage medium
TW201837788A (en) Character recognition method and server for claim documents
WO2019196205A1 (en) Foreign language teaching evaluation information generating method and apparatus
CN110795482B (en) Data benchmarking method, device and storage device
WO2020019591A1 (en) Method and device used for generating information
WO2018153316A1 (en) Method and apparatus for obtaining text extraction model
CN106485261B (en) Image recognition method and device
CN112380981A (en) Face key point detection method and device, storage medium and electronic equipment
US20170185913A1 (en) System and method for comparing training data with test data
CN110555096A (en) User intention identification method, system, terminal and medium
US20150199962A1 (en) Classifying spoken content in a teleconference
CN109657675B (en) Image annotation method and device, computer equipment and readable storage medium
CN112330214A (en) Contract review method and device and readable storage medium
US20140122069A1 (en) Automatic Speech Recognition Accuracy Improvement Through Utilization of Context Analysis

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20827130

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 20827130

Country of ref document: EP

Kind code of ref document: A1
