CN115620039B - Image labeling method, device, equipment and medium - Google Patents

Image labeling method, device, equipment and medium

Info

Publication number
CN115620039B
CN115620039B (application CN202211223608.1A)
Authority
CN
China
Prior art keywords
labeling
image
confidence coefficient
preset
labeled
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211223608.1A
Other languages
Chinese (zh)
Other versions
CN115620039A (en)
Inventor
刘峰
刘洋
刘渊
周进洋
张科
杨明
段焱丰
汪晗韬
黄宇
孙佩豪
符颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongdian Jinxin Software Co Ltd
Original Assignee
Zhongdian Jinxin Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongdian Jinxin Software Co Ltd
Priority to CN202211223608.1A
Publication of CN115620039A
Application granted
Publication of CN115620039B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application provides an image labeling method, apparatus, device, and medium, relating to the field of data labeling. The method comprises the following steps: pre-labeling the image to be labeled through a pre-labeling engine, and determining the comprehensive confidence of the pre-labeling result; if the comprehensive confidence is greater than a preset threshold, outputting the pre-labeling result; and if the comprehensive confidence is not greater than the preset threshold, manually labeling the image to be labeled based on input labeling information and outputting that labeling information. The scheme provided by the embodiment of the application organically combines the respective advantages of machine labeling and manual labeling, and can effectively improve labeling efficiency and accuracy.

Description

Image labeling method, device, equipment and medium
Technical Field
The present application relates to the field of data labeling, and in particular, to an image labeling method, apparatus, device, and medium.
Background
With the deepening of the image era, more and more key information and high-quality content are transmitted and stored with images as the carrier, which brings great convenience to content dissemination. Labeling images that contain text, so as to obtain and utilize their text content, has become a trend; many AI projects include, as key steps, the process of labeling images and utilizing the labeled content.
At present, the related art mostly adopts single-machine or single-person labeling, which makes labeling work difficult to organize. Moreover, labeling results are confirmed only by human eyes, so labeling efficiency is low and labeling cost is high. The efficiency and quality of data labeling thus severely hamper the implementation of many AI projects.
Disclosure of Invention
Embodiments of the present application aim to provide an image labeling method, device, equipment, and medium, so as to solve at least one of the above technical problems. To this end, the embodiments of the present application provide the following solutions.
In one aspect, an embodiment of the present application provides an image labeling method, where the method includes:
performing pre-labeling treatment on the images to be labeled through a plurality of pre-labeling engines, and determining the comprehensive confidence coefficient of the pre-labeling result; if the comprehensive confidence coefficient is larger than a preset threshold value, outputting a pre-labeling result; and if the comprehensive confidence coefficient is not greater than a preset threshold value, carrying out manual annotation processing on the image to be annotated based on the input annotation information, and outputting the annotation information.
Optionally, before the image to be annotated is pre-annotated by a plurality of preset pre-annotation engines, the method further includes:
processing the first image based on a preset image correction mode to obtain a second image conforming to the labeling condition; the image correction mode comprises at least one of the following: a noise reduction mode, an angle correction mode and a distortion correction mode; and processing the second image based on the preset target range and a plurality of preset channels to obtain an image to be marked.
Optionally, processing the second image based on a preset target range and a plurality of preset channels to obtain an image to be annotated, including:
selecting a target value from a preset target range and selecting a target interpolation mode; scaling the second image based on the target value and the target interpolation mode respectively to obtain an image to be converted including the second image; performing corresponding channel conversion processing on each image to be converted based on at least one channel of the color channel, the gray channel and the binarization channel to obtain an image to be marked; the image to be annotated comprises at least one of a color image, a gray image and a binarized image.
Optionally, pre-labeling the image to be labeled through a plurality of pre-labeling engines, including:
and respectively carrying out pre-labeling treatment on each image to be labeled through a preset first pre-labeling engine, a second pre-labeling engine and a third pre-labeling engine to obtain a plurality of pre-labeling results corresponding to each image to be labeled.
Optionally, determining the comprehensive confidence of the pre-labeling result includes:
if all the pre-labeling results are characterized as consistency results, checking the consistency results based on a preset checking mode; the verification method comprises the following steps: a field style NLP check and/or a field validity regular check; the consistency result is that all the pre-labeling results are the same text data; and if the verification is successful, acquiring the comprehensive confidence coefficient.
Optionally, the consistency result is a unified character string; acquiring the comprehensive confidence includes:
performing shift segmentation on the unified character string in units of three characters to obtain a plurality of sub-character strings; calculating the confidence of each sub-character string; and determining the smallest of these confidences as the comprehensive confidence.
Optionally, the manually labeling the image to be labeled based on the input labeling information includes:
clustering all the images to be marked according to a preset mode to obtain a plurality of image sets consisting of the same or similar images; assigning a corresponding image set for each labeling operator according to the operation information of the labeling operation object; wherein each image set is assigned for processing by at least two different labeling operators; the operation information comprises historical labeling operation and/or real-time labeling operation of labeling operation objects; and if the labeling information input by each labeling operator is consistent, outputting the labeling information.
Optionally, the method further comprises:
if all the pre-labeling results are characterized as non-consistency results, or if the comprehensive confidence coefficient is not greater than a preset threshold value, taking the image to be labeled as a sample image for training the pre-labeling engine, and setting the priority of the image to be labeled to be higher than that of other sample images so as to train the pre-labeling engine.
In another aspect, an embodiment of the present application provides an image labeling apparatus, including:
the pre-marking module is used for pre-marking the image to be marked through a pre-marking engine,
and the determining module is used for determining the comprehensive confidence coefficient of the pre-labeling result.
And the first output module is used for outputting a pre-labeling result if the comprehensive confidence coefficient is larger than a preset threshold value.
And the second output module is used for carrying out manual labeling processing on the image to be labeled based on the input labeling information and outputting the labeling information if the comprehensive confidence coefficient is not greater than a preset threshold value.
In still another aspect, an embodiment of the present application provides an electronic device, including:
the image labeling method comprises the steps of a memory, a processor and a computer program stored on the memory, wherein the processor executes the computer program to realize the steps of the image labeling method provided by the embodiment of the application.
The embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of an image labeling method provided by the embodiment of the application.
The beneficial effects that technical scheme that this application embodiment provided brought are:
the embodiment of the application provides an image labeling method, which is used for carrying out pre-labeling treatment on an image to be labeled through a plurality of pre-labeling engines and determining the comprehensive confidence of a pre-labeling result. Outputting a pre-labeling result when the comprehensive confidence coefficient is larger than a preset threshold value, labeling the image to be labeled based on manually input labeling information when the comprehensive confidence coefficient is not larger than the preset threshold value, and outputting the labeling information as a labeling result. The labeling mode based on the pre-labeling engine is a machine labeling mode, and the comprehensive confidence is a specific numerical value reflecting the accuracy of the machine labeling mode. The scheme provided by the embodiment of the application firstly adopts a machine labeling mode to label the image, so that the labeling efficiency can be effectively improved. In the process of implementing the machine labeling mode, the accuracy of the labeling result obtained in the mode is evaluated, and the labeling process can be further performed in a manual mode under the condition that the accuracy does not reach the standard, so that the accuracy of the image labeling result is ensured. In general, the scheme provided by the embodiment of the application organically combines the respective advantages of the machine labeling mode and the manual labeling mode, and can effectively improve the labeling efficiency and the accuracy.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that are required to be used in the description of the embodiments of the present application will be briefly described below.
FIG. 1a is an image of the related art;
FIG. 1b is an image block segmented from the image of FIG. 1a;
fig. 2 is a schematic flow chart of an image labeling method according to an embodiment of the present application;
fig. 3a is a schematic structural diagram of an image labeling device according to an embodiment of the present application;
fig. 3b is a schematic structural diagram of another image labeling device according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below with reference to the drawings in the present application. It should be understood that the embodiments described below with reference to the drawings are exemplary descriptions for explaining the technical solutions of the embodiments of the present application, and the technical solutions of the embodiments of the present application are not limited.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and "comprising," when used in this application, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof, all of which may be included in the present application. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items it defines, e.g., "A and/or B" may be implemented as "A", as "B", or as "A and B".
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The embodiment of the application provides an image labeling method which is realized based on pre-labeling engines. Alternatively, a pre-labeling engine may be a neural network model, which can be obtained by training on a plurality of sample images. The text content contained in a sample image may be of a specific text type, such as technical terms, or of a specific language (Chinese, English). Alternatively, the sample image may be an image block containing a single line of text. A plurality of such image blocks are shown in FIG. 1a, for example an image block containing "advertiser" and an image block containing "4000 10 4008".
Optionally, the embodiment of the application provides an image labeling method which can be applied to any electronic device. Optionally, the electronic device includes a plurality of pre-annotation engines. When the electronic equipment is in an operating state, performing pre-labeling treatment on the images to be labeled through a plurality of pre-labeling engines to obtain labeling results, namely performing pre-labeling treatment in a machine labeling mode; and determining the comprehensive confidence coefficient of the pre-labeling result through the confidence coefficient comprehensive calculation node, namely performing accuracy evaluation on the labeling result of the machine labeling mode. And outputting a pre-labeling result under the condition that the comprehensive confidence coefficient is larger than a preset threshold value, labeling the image to be labeled based on manually input labeling information under the condition that the comprehensive confidence coefficient is not larger than the preset threshold value, and outputting the labeling information as the labeling result. The scheme provided by the embodiment of the application firstly adopts a machine labeling mode to label the image, so that the labeling efficiency can be effectively improved. In the process of implementing the machine labeling mode, the accuracy of the labeling result obtained in the mode is evaluated, and the labeling process can be further performed in a manual mode under the condition that the accuracy does not reach the standard, so that the accuracy of the image labeling result is ensured. In general, the method provided by the embodiment of the application organically combines the respective advantages of the machine labeling mode and the manual labeling mode, and can effectively improve the labeling efficiency and the accuracy.
Alternatively, the method provided in the embodiments of the present application may be implemented as a stand-alone application or as a functional module/plug-in of an application. For example, the application program can be a special image annotation or other application program with an image annotation function, and through the application program, efficient and accurate image annotation can be realized.
The technical solutions of the embodiments of the present application and technical effects produced by the technical solutions of the present application are described below by describing several exemplary embodiments. It should be noted that the following embodiments may be referred to, or combined with each other, and the description will not be repeated for the same terms, similar features, similar implementation steps, and the like in different embodiments.
Fig. 2 shows a flow diagram of an image annotation method. Specifically, the method includes the following steps S210 to S240.
S210, performing pre-labeling processing on the image to be labeled through a plurality of pre-labeling engines.
Optionally, before the image to be annotated is pre-annotated by a plurality of preset pre-annotation engines, the method further includes:
processing the first image based on a preset image correction mode to obtain a second image conforming to the labeling condition; the image correction mode comprises at least one of the following: a noise reduction mode, an angle correction mode and a distortion correction mode; and processing the second image based on the preset target range and a plurality of preset channels to obtain an image to be marked.
Optionally, if the first image has stains or is yellowish, the image may be processed in the noise reduction mode to remove the stains or the yellowish tint; if the characters on the first image are inclined relative to the horizontal line, they are corrected through the angle correction mode; if the text content on the first image is distorted, the first image can be processed in the distortion correction mode so that the text content lies on the same horizontal line.
Optionally, the first image is an original image of acceptable quality. Specifically, after any original image is received, it can be detected through a preset image quality detection assembly; if the detection result is unqualified, the labeling process is abandoned, and if the detection result is qualified, the original image is taken as the first image and processing continues. The criteria for judging an image unqualified are various, and include: key information in the image is missing (for example, when the image should contain an identification card number but only 8 digits appear in the original image block, it can be determined to be unqualified); the image is blurry; the image contains a plurality of stains. It should be noted that the image quality detection process may also take other forms, which are not described here for simplicity.
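For illustration only (this sketch is not part of the patent disclosure), the blur criterion could take the following form in Python, assuming OpenCV; the threshold value and the function name passes_quality_check are illustrative assumptions:

    import cv2

    def passes_quality_check(gray_image, blur_threshold=100.0):
        # The variance of the Laplacian is a common sharpness measure:
        # a low variance suggests a blurry, and thus unqualified, image.
        sharpness = cv2.Laplacian(gray_image, cv2.CV_64F).var()
        return sharpness >= blur_threshold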
Referring to fig. 1a as an example, fig. 1a includes 4 lines of text, from which 4 image blocks may be segmented, each usable as an original image for the labeling process. As shown in fig. 1b, the content of one such image block is "advertiser".
In an optional embodiment, the technical means for performing pre-labeling processing on the image to be labeled in S210 through a plurality of preset pre-labeling engines may specifically include:
and respectively carrying out pre-labeling treatment on each image to be labeled through a preset first pre-labeling engine, a second pre-labeling engine and a third pre-labeling engine to obtain a plurality of pre-labeling results corresponding to each image to be labeled.
Optionally, each pre-labeling engine is a neural network model trained on a plurality of pre-created sample images. The first, second, and third pre-labeling engines may be based on the same untrained neural network model: when this model is trained, different parameters can be set for its different convolution layers so as to reach different training results, thereby obtaining different neural network models. It should be noted that, for the training process of a neural network model, reference may be made to the related art; for simplicity of description, details are not repeated here.
In an alternative embodiment, the second image is processed based on the preset target range and the preset channels to obtain the image to be annotated, which may specifically include the following steps Sa1 to Sa3.
Sa1, selecting a target value from a preset target range, and selecting a target interpolation mode.
The preset target range is the size range for enlarging or reducing the second image. Alternatively, the target range may be [0.75, 1.35], i.e., the minimum reduction factor is 0.75 and the maximum enlargement factor is 1.35. If the reduction factor is lower than 0.75, the text in the second image becomes too small; if the enlargement factor is higher than 1.35, the second image is distorted. Either case can cause the pre-labeling engine to fail to recognize the image and therefore to label it incorrectly.
Alternatively, there may be one or more target values, and one or more target interpolation modes. A target value may be selected randomly or be a specified value; likewise, a target interpolation mode may be selected randomly or be a specified interpolation mode.
Sa2, scaling the second image based on each target value and target interpolation mode respectively, to obtain the images to be converted, which also include the second image itself.
Alternatively, enlargement or reduction may be performed by calling the resize() function in OpenCV (a cross-platform computer vision and machine learning software library that includes various image processing functions). The resize() function supports several interpolation methods, such as nearest neighbor interpolation, bilinear interpolation, and area interpolation; one or more of them may be selected for processing the second image. OpenCV is a commonly used image processing library in the related art and includes various algorithms for image processing and computer vision.
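For illustration only, the scaling step may be sketched as follows in Python with OpenCV; the random sampling of target values from the example range above, the set of interpolation modes, and the function name scale_variants are illustrative assumptions:

    import random
    import cv2

    INTERPOLATIONS = [cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_AREA]

    def scale_variants(second_image, factors=None):
        # Each (factor, interpolation) pair yields one image to be converted;
        # the unprocessed second image itself is also kept (see below).
        factors = factors if factors is not None else [random.uniform(0.75, 1.35)]
        h, w = second_image.shape[:2]
        variants = [second_image]
        for f in factors:
            for interp in INTERPOLATIONS:
                variants.append(cv2.resize(second_image, (int(w * f), int(h * f)),
                                           interpolation=interp))
        return variants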
That is, besides the images obtained by processing the second image, the images to be converted also include the unprocessed second image.
Sa3, carrying out corresponding channel conversion processing on each image to be converted based on at least one channel of a color channel, a gray channel and a binarization channel to obtain an image to be marked; the image to be annotated comprises at least one of a color image, a gray image and a binarized image.
By combining different interpolation modes and different scaling factors when enlarging or reducing the second image, the obtained images to be converted differ from one another as much as possible, so that features of certain parts of the image are amplified; and by processing the images to be converted through the plurality of preset channels, images to be labeled with certain amplified features can be obtained. In this way the pre-labeling engines can recognize different images, raising the recognition success rate as much as possible.
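For illustration only, the channel conversion of step Sa3 may be sketched as follows in Python with OpenCV; a 3-channel BGR input and the binarization threshold of 127 are illustrative assumptions:

    import cv2

    def channel_variants(image_to_convert):
        # Produce one color, one grayscale, and one binarized image per input.
        gray = cv2.cvtColor(image_to_convert, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
        return [image_to_convert, gray, binary]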
S220, determining the comprehensive confidence of the pre-labeling result.
Optionally, if all the pre-labeling results are characterized as consistency results, checking is performed on the consistency results based on a preset checking mode. The verification mode comprises the following steps: a field style NLP check and/or a field validity regular check. The consistency result is that all the pre-labeling results are the same text data. And if the verification is successful, acquiring the comprehensive confidence coefficient.
After the images to be labeled are pre-labeled by the plurality of pre-labeling engines, each image to be labeled carries labeling information and a confidence corresponding to that labeling information, which together can be understood as a pre-labeling result. Further, the character strings labeled for the images to be labeled are matched against one another: if all the character strings are the same, the pre-labeling results are characterized as a consistency result; if at least one character string differs from the others, the pre-labeling results are determined to be a non-consistency result.
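For illustration only, the consistency judgment may be sketched as follows; the structure of a pre-labeling result as a dictionary with a "text" field is an illustrative assumption:

    def is_consistency_result(pre_label_results):
        # Consistent iff all engines produced the same character string.
        strings = [result["text"] for result in pre_label_results]
        return len(set(strings)) == 1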
Optionally, if the non-consistency result is obtained, labeling the image to be labeled based on manually input labeling information, and outputting the labeling information.
Referring to the original image shown in fig. 1b: after the original image is processed, the labeling result of a certain labeled image is the character string "advertiser", which can be expressed as the sequence of its individual characters: [ "advertisement", "solicitation", "quotient" ] (literal renderings of the original Chinese characters).
After the consistency result, i.e. the unified character string, is obtained, the unified character string can be checked with the field style NLP check to determine whether it conforms to a natural language style. Alternatively, a field validity regular check may be performed to determine whether the unified character string is valid. It should be noted that the verification of the unified character string may also refer to the related art.
Optionally, if the verification fails, labeling the image to be labeled based on the input labeling information, and outputting the labeling information.
S230, if the comprehensive confidence coefficient is larger than a preset threshold value, outputting a pre-labeling result.
S240, if the comprehensive confidence coefficient is not greater than a preset threshold value, carrying out manual labeling processing on the image to be labeled based on the input labeling information.
Manual labeling here means labeling the image to be labeled based on manually input labeling information, i.e., processing the image to be labeled manually.
The embodiment of the application provides an image labeling method, which is used for carrying out pre-labeling treatment on an image to be labeled through a plurality of pre-labeling engines and determining the comprehensive confidence of a pre-labeling result. Outputting a pre-labeling result when the comprehensive confidence coefficient is larger than a preset threshold value, labeling the image to be labeled based on manually input labeling information when the comprehensive confidence coefficient is not larger than the preset threshold value, and outputting the labeling information as a labeling result. The labeling mode based on the pre-labeling engine is a machine labeling mode, and the comprehensive confidence is a specific numerical value reflecting the accuracy of the machine labeling mode. The scheme provided by the embodiment of the application firstly adopts a machine labeling mode to label the image, so that the labeling efficiency can be effectively improved. In the process of implementing the machine labeling mode, the accuracy of the labeling result obtained in the mode is evaluated, and under the condition that the accuracy does not reach the standard, the manual labeling process can be further performed in a manual mode, so that the accuracy of the image labeling result is ensured. In general, the method provided by the embodiment of the application organically combines the respective advantages of the machine labeling mode and the manual labeling mode, and can effectively improve the labeling efficiency and the accuracy.
Next, how the comprehensive confidence is acquired is described in detail.
In an alternative embodiment, the process of obtaining the comprehensive confidence includes the following steps Sb1 to Sb3.
Sb1, performing shift segmentation on the unified character string in units of three characters to obtain a plurality of sub-character strings.
Illustratively, after shift segmentation of the four-character unified string [ "advertisement", "notice", "recruitment", "quotient" ], two substrings may be obtained, respectively: [ "advertisement", "notice", "recruitment" ] and [ "notice", "recruitment", "quotient" ].
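For illustration only, shift segmentation with a three-character window may be sketched as follows; the function name shift_segments is an illustrative assumption:

    def shift_segments(characters, window=3):
        # Slide a window of `window` characters one character at a time.
        if len(characters) <= window:
            return [characters]
        return [characters[i:i + window]
                for i in range(len(characters) - window + 1)]

    # e.g. shift_segments(["c1", "c2", "c3", "c4"])
    #      -> [["c1", "c2", "c3"], ["c2", "c3", "c4"]]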
Sb2, calculating the confidence of each sub-character string.
First, the error variance is calculated. Taking one substring as an example, its historical data is queried, from which the following can be obtained: the accumulated number of times the substring has been recognized, i.e. a first number; and the number of those recognitions in which the substring entered the manual confirmation link, i.e. a second number. The manual confirmation link may involve correcting erroneous characters: after each correction of the substring, the characters that differ between the substring and the corrected substring are recorded as error characters, and the proportion of the error characters in the substring, i.e. the error proportion, is counted. Further, the mean of all error proportions is computed, and the error variance of the error characters is calculated from this error mean.
Secondly, the confidence corresponding to each pre-labeling result is acquired from all the pre-labeling results, and the minimum confidence among them is screened out.
Finally, the confidence of the substring is determined according to the minimum confidence and the error variance. Optionally, a first confidence threshold, a second confidence threshold, a third confidence threshold, and a variance check value are preset. If the minimum confidence is greater than the first confidence threshold and the error variance is smaller than the variance check value, the confidence of the substring is calculated according to Equation 1 below; if the minimum confidence is greater than the second confidence threshold but not greater than the first confidence threshold, and the error variance is smaller than the variance check value, the confidence of the substring is calculated according to Equation 2 below; if the minimum confidence is greater than the third confidence threshold but not greater than the second confidence threshold, and the error variance is smaller than the variance check value, the confidence of the substring is calculated according to Equation 3 below; and if the minimum confidence is less than the third confidence threshold, the confidence of the substring is calculated according to Equation 4 below.
Confidence = 1-second number/first number; equation 1
Confidence = 0.9-second number/first number; equation 2
Confidence = 0.8-second number/first number; equation 3
Confidence = 0.7-second number/first number; equation 4
In one implementation scenario, the first, second, and third confidence thresholds may be 0.9, 0.85, and 0.7, respectively, and the variance check value may be 0.01.
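For illustration only, Equations 1 to 4 with the example thresholds above may be sketched as follows; all names are illustrative assumptions, and the final branch follows the source's Equation 4 case (minimum confidence below the third threshold):

    def substring_confidence(min_conf, error_variance, first_number, second_number,
                             t1=0.9, t2=0.85, t3=0.7, variance_check=0.01):
        # second_number / first_number is the historical manual-confirmation ratio.
        manual_ratio = second_number / first_number
        if min_conf > t1 and error_variance < variance_check:
            return 1.0 - manual_ratio  # Equation 1
        if t2 < min_conf <= t1 and error_variance < variance_check:
            return 0.9 - manual_ratio  # Equation 2
        if t3 < min_conf <= t2 and error_variance < variance_check:
            return 0.8 - manual_ratio  # Equation 3
        return 0.7 - manual_ratio      # Equation 4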
Sb3, determining the smallest among the confidences of the sub-character strings as the comprehensive confidence.
Specifically, after obtaining the confidence coefficient of each sub-string of the string, the minimum confidence coefficient is selected from all confidence coefficients, and is determined as the integrated confidence coefficient.
When the comprehensive confidence is not greater than the preset threshold, the labeling accuracy of the pre-labeling engines is low, and the pre-labeling result is not suitable to be output as the final result. Accordingly, the embodiments of the present application also provide an implementation to address this problem.
In an alternative embodiment, the labeling process is performed on the image to be labeled based on the input labeling information, and specifically includes the following steps Sc1 to Sc3.
Sc1, clustering all the images to be labeled in a preset manner to obtain a plurality of image sets consisting of the same or similar images.
Optionally, more than one original image may be undergoing labeling at the same time. When a plurality of original images are being recognized, the images to be labeled that correspond to the same original image can be clustered into one group of images to be confirmed; alternatively, images to be labeled whose labeling information is close can be clustered into one group of images to be confirmed.
Sc2, assigning corresponding image sets for each labeling operator according to the operation information of the labeling operation object; wherein each image set is assigned for processing by at least two different labeling operators; the operation information includes a history labeling operation and/or a real-time labeling operation for labeling the operation object.
Optionally, the historical labeling items or real-time labeling items of each labeling operation object are obtained. The matching degree between the images to be confirmed and each labeling operator is determined according to the similarity between the labeling information in the historical and/or real-time labeling items and the labeling information of each group of images to be confirmed. Each group of images to be confirmed is then assigned to the several labeling operators with the highest matching degree, where at least two labeling operators must be assigned.
Sc3, outputting the labeling information if the labeling information input by the labeling operators is consistent.
Specifically, if the labeling information input by all the labeling operators is consistent, the consistent labeling information is output. If at least one labeling operator inputs different labeling information, a labeling operator with higher authority can be asked to make the final confirmation.
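For illustration only, the consensus step Sc3 may be sketched as follows; the escalation callback and both function names are illustrative assumptions:

    def resolve_labels(labels_by_operator, escalate):
        # Output the labeling information if all operators agree;
        # otherwise hand over to a higher-authority operator.
        unique_labels = set(labels_by_operator.values())
        if len(unique_labels) == 1:
            return unique_labels.pop()
        return escalate(labels_by_operator)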
In the process of labeling original images, a number of images to be labeled with labeling errors may appear, and these need to be confirmed manually. Such images therefore consume considerable effort during labeling, and whether they can be reused becomes a question. To this end, embodiments of the present application also provide an implementation that makes further use of such images.
In an alternative embodiment, the method may further comprise:
and if all the pre-labeling results are characterized as non-consistency results, taking the images to be labeled as sample images for training the pre-labeling engine, and setting the priority of the images to be labeled higher than that of other sample images.
Specifically, when the pre-labeling result of an image to be labeled is characterized as a non-consistency result, the labeling information output by a certain pre-labeling engine deviates from the labeling information output by the other pre-labeling engines, and the image to be labeled can be used as a sample image for training the pre-labeling engines. During training, the priority of such images can be set so as to distinguish them from common sample images.
Optionally, when the pre-labeling engine is trained, the proportion of such sample images among all sample images can be determined based on the sample distribution of the training strategy, so that the quality of the training data is effectively improved.
Fig. 3a illustrates an image labeling apparatus according to an embodiment of the present application. As shown in fig. 3a, the apparatus 300 includes a pre-labeling module 310, a determining module 320, a first output module 330, and a second output module 340.
The pre-labeling module 310 is configured to perform pre-labeling processing on the image to be labeled through a plurality of pre-labeling engines.
A determining module 320, configured to determine a comprehensive confidence level of the pre-labeling result.
The first output module 330 is configured to output a pre-labeling result if the integrated confidence coefficient is greater than a preset threshold.
And the second output module 340 is configured to perform manual labeling processing on the image to be labeled based on the input labeling information and output the labeling information if the integrated confidence is not greater than the preset threshold.
Optionally, as shown in fig. 3b, the apparatus 300 further includes a preprocessing module 350, specifically configured to, before pre-labeling the image to be labeled by a plurality of preset pre-labeling engines:
processing the first image based on a preset image correction mode to obtain a second image conforming to the labeling condition; the image correction mode comprises at least one of the following: a noise reduction mode, an angle correction mode and a distortion correction mode; and processing the second image based on the preset target range and a plurality of preset channels to obtain an image to be marked.
Optionally, the preprocessing module 350 processes the second image based on the preset target range and the preset channels to obtain an image to be annotated, which is specifically configured to:
selecting a target value from a preset target range and selecting a target interpolation mode; scaling the second image based on the target value and the target interpolation mode respectively to obtain an image to be converted including the second image; performing corresponding channel conversion processing on each image to be converted based on at least one channel of the color channel, the gray channel and the binarization channel to obtain an image to be marked; the image to be annotated comprises at least one of a color image, a gray image and a binarized image.
Optionally, the pre-labeling module 310 is specifically configured to, in pre-labeling the image to be labeled through a plurality of preset pre-labeling engines:
and respectively carrying out pre-labeling treatment on each image to be labeled through a preset first pre-labeling engine, a second pre-labeling engine and a third pre-labeling engine to obtain a plurality of pre-labeling results corresponding to each image to be labeled.
Optionally, the determining module 320 is specifically configured to, in determining the comprehensive confidence level of the pre-labeling result:
if all the pre-labeling results are characterized as consistency results, checking the consistency results based on a preset checking mode; the verification method comprises the following steps: a field style NLP check and/or a field validity regular check; the consistency result is that all the pre-labeling results are the same text data; and if the verification is successful, acquiring the comprehensive confidence coefficient.
Optionally, the consistency result is a unified character string; the determining module 320 is specifically configured to, in acquiring the comprehensive confidence:
performing shift segmentation on the unified character string in units of three characters to obtain a plurality of sub-character strings; calculating the confidence of each sub-character string; and determining the smallest of these confidences as the comprehensive confidence.
Optionally, the second output module 340 is specifically configured to, in performing manual labeling processing on the image to be labeled based on the input labeling information:
clustering all the images to be marked according to a preset mode to obtain a plurality of image sets consisting of the same or similar images; assigning a corresponding image set for each labeling operator according to the operation information of the labeling operation object; wherein each image set is assigned for processing by at least two different labeling operators; the operation information includes a history labeling operation and/or a real-time labeling operation for labeling the operation object. And if the labeling information input by each labeling operator is consistent, outputting the labeling information.
Optionally, as shown in fig. 3b, the apparatus 300 further comprises a sample marking module 360, specifically configured to:
if all the pre-labeling results are characterized as non-consistency results, or if the comprehensive confidence coefficient is not greater than a preset threshold value, taking the image to be labeled as a sample image for training the pre-labeling engine, and setting the priority of the image to be labeled to be higher than that of other sample images so as to train the pre-labeling engine.
The apparatus of the embodiments of the present application may perform the method provided by the embodiments of the present application, and implementation principles of the method are similar, and actions performed by each module in the apparatus of each embodiment of the present application correspond to steps in the method of each embodiment of the present application, and detailed functional descriptions of each module of the apparatus may be referred to in the corresponding method shown in the foregoing, which is not repeated herein.
The embodiment of the application provides an electronic device comprising a memory, a processor, and a computer program stored on the memory; the processor executes the computer program to implement the steps of the image labeling method. Compared with the related art, this improves the efficiency and accuracy of image labeling.
In an alternative embodiment, an electronic device is provided, as shown in fig. 4, the electronic device 4000 shown in fig. 4 includes: a processor 4001 and a memory 4003. Wherein the processor 4001 is coupled to the memory 4003, such as via a bus 4002. Optionally, the electronic device 4000 may further comprise a transceiver 4004, the transceiver 4004 may be used for data interaction between the electronic device and other electronic devices, such as transmission of data and/or reception of data, etc. It should be noted that, in practical applications, the transceiver 4004 is not limited to one, and the structure of the electronic device 4000 is not limited to the embodiment of the present application.
The processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor 4001 may also be a combination that implements computing functionality, e.g., a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 4002 may include a path for transferring information between the aforementioned components. Bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 4002 can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 4, but this does not mean there is only one bus or one type of bus.
Memory 4003 may be, but is not limited to, a ROM (Read-Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read-Only Memory), a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store a computer program and that can be read by a computer.
The memory 4003 is used for storing a computer program that executes an embodiment of the present application, and is controlled to be executed by the processor 4001. The processor 4001 is configured to execute a computer program stored in the memory 4003 to realize the steps shown in the foregoing method embodiment.
Such electronic devices include, but are not limited to, computers.
Embodiments of the present application provide a computer readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, may implement the steps and corresponding content of the foregoing method embodiments.
The terms "first," "second," "third," "fourth," "1," "2," and the like in the description and in the claims of this application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the present application described herein may be implemented in other sequences than those illustrated or otherwise described.
It should be understood that, although the flowcharts of the embodiments of the present application indicate the respective operation steps by arrows, the order of implementation of these steps is not limited to the order indicated by the arrows. In some implementations of embodiments of the present application, the implementation steps in the flowcharts may be performed in other orders as desired, unless explicitly stated herein. Furthermore, some or all of the steps in the flowcharts may include multiple sub-steps or multiple stages based on the actual implementation scenario. Some or all of these sub-steps or phases may be performed at the same time, or each of these sub-steps or phases may be performed at different times, respectively. In the case of different execution time, the execution sequence of the sub-steps or stages may be flexibly configured according to the requirement, which is not limited in the embodiment of the present application.
The foregoing is merely an optional implementation of the application. It should be noted that other similar implementations adopted by those skilled in the art based on the technical ideas of the application, without departing from those ideas, also belong to the protection scope of the embodiments of the application.

Claims (9)

1. A method of image annotation, the method comprising:
performing pre-labeling treatment on the image to be labeled through a plurality of pre-labeling engines to obtain a plurality of pre-labeling results;
if the plurality of pre-labeling results are characterized as a unified character string, checking the unified character string based on a preset checking mode; the checking mode comprises the following steps: a field style NLP check and/or a field validity regular check;
if the verification is successful, carrying out shift segmentation on the unified character string in units of three characters to obtain a plurality of sub-character strings;
for each sub-character string, counting the error variance of the sub-character string according to the historical data of the sub-character string; screening the minimum confidence coefficient from the confidence coefficients of the plurality of pre-labeling results, determining a first comparison result of the screened minimum confidence coefficient against a first confidence coefficient threshold value, a second confidence coefficient threshold value, and a third confidence coefficient threshold value, and a second comparison result of the error variance against a variance check value, and processing the historical data of the sub-character string according to the first comparison result and the second comparison result to obtain the confidence coefficient of the sub-character string;
screening the minimum confidence coefficient from the confidence coefficient of each sub-character string, and taking the minimum confidence coefficient as the comprehensive confidence coefficient of the plurality of pre-labeling results;
outputting the unified character string if the comprehensive confidence coefficient is greater than a preset threshold value;
and if the comprehensive confidence coefficient is not greater than the preset threshold value, manually labeling the image to be labeled based on the input labeling information, and outputting the labeling information.
2. The method of claim 1, wherein prior to pre-labeling the image to be labeled by a plurality of pre-labeling engines, the method further comprises:
processing the first image based on a preset image correction mode to obtain a second image conforming to the labeling condition; the image correction mode comprises at least one of the following steps: a noise reduction mode, an angle correction mode and a distortion correction mode;
and processing the second image based on a preset target range and a plurality of preset channels to obtain the image to be annotated.
3. The method according to claim 2, wherein the processing the second image based on the preset target range and the plurality of preset channels to obtain the image to be annotated comprises:
selecting a target value from a preset target range and selecting a target interpolation mode;
scaling the second image based on the target value and the target interpolation mode to obtain an image to be converted including the second image;
performing corresponding channel conversion processing on each image to be converted based on at least one channel of a color channel, a gray channel and a binarization channel to obtain the image to be marked; the image to be annotated comprises at least one image of a color image, a gray level image and a binarized image.
4. The method according to claim 1, wherein the pre-labeling the image to be labeled by a plurality of pre-labeling engines comprises:
and respectively carrying out pre-labeling treatment on each image to be labeled through a preset first pre-labeling engine, a second pre-labeling engine and a third pre-labeling engine to obtain a plurality of pre-labeling results corresponding to each image to be labeled.
5. A method according to claim 3, wherein the manually labeling the image to be labeled based on the input labeling information, and outputting the labeling information, comprises:
clustering all the images to be marked according to a preset mode to obtain a plurality of image sets consisting of the same or similar images;
assigning a corresponding image set for each labeling operator according to the operation information of the labeling operation object; wherein each image set is assigned for processing by at least two different labeling operators; the operation information comprises historical labeling operation and/or real-time labeling operation of the labeling operation object;
and if the labeling information input by each labeling operator is consistent, outputting the labeling information.
6. The method according to claim 1, wherein the method further comprises:
and if the plurality of pre-labeling results are characterized as a non-consistency result, or the comprehensive confidence coefficient is not greater than the preset threshold value, taking the image to be labeled as a sample image for training the pre-labeling engines, and setting the priority of the image to be labeled higher than the priority of other sample images, so as to train the pre-labeling engines.
7. An image annotation device, the device comprising:
the pre-labeling module is used for pre-labeling the image to be labeled through a plurality of preset pre-labeling engines to obtain a plurality of pre-labeling results;
the determining module is used for checking the unified character string based on a preset checking mode if the plurality of pre-labeling results are characterized as the unified character string; the verification mode comprises the following steps: a field style NLP check and/or a field validity regular check; if the verification is successful, carrying out shift segmentation on the unified character string according to the unit quantity of three characters to obtain a plurality of sub-character strings; for each sub-string, counting the error variance of the sub-string according to the history data of the sub-string; screening the minimum confidence coefficient from the confidence coefficient of the plurality of pre-labeling results, determining a first comparison result of the screened minimum confidence coefficient and a first confidence coefficient threshold value, a second confidence coefficient threshold value and a third confidence coefficient threshold value, and a second comparison result of the error variance and a variance check value, and processing the historical data of the sub-character strings according to the first comparison result and the second comparison result to obtain the confidence coefficient of the sub-character strings; screening the minimum confidence coefficient from the confidence coefficient of each sub-character string, and taking the minimum confidence coefficient as the comprehensive confidence coefficient of the plurality of pre-labeling results;
the first output module is used for outputting the unified character string if the comprehensive confidence coefficient is larger than a preset threshold value;
and the second output module is used for carrying out manual labeling processing on the image to be labeled based on the input labeling information and outputting the labeling information if the comprehensive confidence coefficient is not greater than the preset threshold value.
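A minimal sketch of the determining module's confidence aggregation described above: three-character shift segmentation, a per-substring score derived from the weakest engine confidence and the substring's historical error variance, and the minimum over substrings as the comprehensive confidence. The variance check value (0.01) and the penalty factor (0.9) are assumptions for the example; the patent's three-threshold comparison logic is more detailed than shown here.

```python
# Sketch of the shift segmentation and minimum-confidence aggregation in claim 7.
# `history` maps a substring to its recorded error values; the 0.01 variance
# check value and the 0.9 penalty factor are illustrative assumptions.
import statistics

def shift_segments(text: str, width: int = 3) -> list[str]:
    """Slide a three-character window over the unified string."""
    return [text[i:i + width] for i in range(max(len(text) - width + 1, 1))]

def comprehensive_confidence(text: str,
                             result_confidences: list[float],
                             history: dict[str, list[float]]) -> float:
    base = min(result_confidences)  # screened minimum over the engine results
    sub_scores = []
    for sub in shift_segments(text):
        errors = history.get(sub, [0.0])
        variance = statistics.pvariance(errors)
        # Penalize substrings whose historical errors are noisy.
        sub_scores.append(base * 0.9 if variance > 0.01 else base)
    return min(sub_scores)  # comprehensive confidence of the pre-labeling results
```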
8. An electronic device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to implement the steps of the method of any of claims 1-6.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1-6.
CN202211223608.1A 2022-10-08 2022-10-08 Image labeling method, device, equipment and medium Active CN115620039B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211223608.1A CN115620039B (en) 2022-10-08 2022-10-08 Image labeling method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN115620039A (en) 2023-01-17
CN115620039B (en) 2023-07-18

Family

ID=84861189

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211223608.1A Active CN115620039B (en) 2022-10-08 2022-10-08 Image labeling method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN115620039B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116912603B (en) * 2023-09-12 2023-12-15 浙江大华技术股份有限公司 Pre-labeling screening method, related device, equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10095925B1 (en) * 2017-12-18 2018-10-09 Capital One Services, Llc Recognizing text in image data
CN110704661A (en) * 2019-10-12 2020-01-17 腾讯科技(深圳)有限公司 Image classification method and device
CN110889463A (en) * 2019-12-10 2020-03-17 北京奇艺世纪科技有限公司 Sample labeling method and device, server and machine-readable storage medium
CN111476210A (en) * 2020-05-11 2020-07-31 上海西井信息科技有限公司 Image-based text recognition method, system, device and storage medium
CN112861648A (en) * 2021-01-19 2021-05-28 平安科技(深圳)有限公司 Character recognition method and device, electronic equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084289B (en) * 2019-04-11 2021-07-27 北京百度网讯科技有限公司 Image annotation method and device, electronic equipment and storage medium
CN111368902A (en) * 2020-02-28 2020-07-03 北京三快在线科技有限公司 Data labeling method and device
CN111833340B (en) * 2020-07-21 2024-03-26 阿波罗智能技术(北京)有限公司 Image detection method, device, electronic equipment and storage medium
CN112685584A (en) * 2021-03-22 2021-04-20 北京世纪好未来教育科技有限公司 Image content labeling method and device
CN113537184A (en) * 2021-06-03 2021-10-22 广州市新文溯科技有限公司 OCR (optical character recognition) model training method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111369545B (en) Edge defect detection method, device, model, equipment and readable storage medium
CN110175609B (en) Interface element detection method, device and equipment
CN111914654B (en) Text layout analysis method, device, equipment and medium
WO2023056723A1 (en) Fault diagnosis method and apparatus, and electronic device and storage medium
CN115620039B (en) Image labeling method, device, equipment and medium
CN111681228A (en) Flaw detection model, training method, detection method, apparatus, device, and medium
CN115273115A (en) Document element labeling method and device, electronic equipment and storage medium
CN113505781A (en) Target detection method and device, electronic equipment and readable storage medium
CN112235305A (en) Malicious traffic detection method based on convolutional neural network
CN112052702A (en) Method and device for identifying two-dimensional code
CN115171138A (en) Method, system and equipment for detecting image text of identity card
CN111753729B (en) False face detection method and device, electronic equipment and storage medium
CN112949653A (en) Text recognition method, electronic device and storage device
CN111709951B (en) Target detection network training method and system, network, device and medium
CN115205619A (en) Training method, detection method, device and storage medium for detection model
CN114708582A (en) AI and RPA-based intelligent electric power data inspection method and device
CN111582015A (en) Multi-parameter analysis platform utilizing cloud storage
CN109344836B (en) Character recognition method and equipment
CN113139617A (en) Power transmission line autonomous positioning method and device and terminal equipment
JP6175904B2 (en) Verification target extraction system, verification target extraction method, verification target extraction program
JP2001099625A (en) Device and method for pattern inspection
EP0585861A2 (en) Image verification method
Zheng et al. Recognition of expiry data on food packages based on improved DBNet
CN109376739B (en) Marshalling mode determining method and device
CN111597375B (en) Picture retrieval method based on similar picture group representative feature vector and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant