CN110659370A - Efficient data labeling method - Google Patents

Efficient data labeling method

Info

Publication number
CN110659370A
CN110659370A
Authority
CN
China
Prior art keywords
image
marking
labeling
marked
coordinates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910738268.8A
Other languages
Chinese (zh)
Other versions
CN110659370B (en)
Inventor
张欢
李爱林
周先得
张仕洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huafu Technology Co ltd
Original Assignee
Shenzhen Huafu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huafu Information Technology Co Ltd
Priority to CN201910738268.8A
Publication of CN110659370A
Application granted
Publication of CN110659370B
Active legal status
Anticipated expiration legal status

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/60 Rotation of whole images or parts thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an efficient data labeling method, belonging to the technical field of text data labeling, which comprises the following labeling steps: S1: transmit the image to be labeled to a data labeling platform so that the labeling system and the labeling personnel can process and label it; S2: apply a projective transformation to the image so that the shape of the labeling target becomes approximately rectangular; S3: label with the rectangular labeling method; S4: apply the inverse coordinate transformation; S5: obtain the labeling information corresponding to the original image by outputting the coordinates of the corresponding positions on the original image after the inverse transformation of step S4. When a single image sample contains many targets to be labeled in the same orientation (form), for example a bill sample containing many text boxes at the same angle, one projective transformation greatly reduces the labeling difficulty for all targets and therefore greatly increases the labeling speed. The labeling precision is also high: the resulting labeling frame fits the target closely, with small gaps.

Description

Efficient data labeling method
Technical Field
The invention relates to the technical field of text data labeling, and in particular to an efficient data labeling method.
Background
When solving target detection problems, the position frames of the targets to be detected must often be labeled on existing image data. Because of the angle between the camera and the photographed object, targets with an originally regular shape are deformed after imaging, which adds considerable difficulty to the labeling work; this is especially noticeable on OCR samples.
One existing method is rectangular labeling, in which every target position is labeled with a rectangular frame. Its advantage is high labeling efficiency: only two points are needed to label one target (a mouse click, drag and release). Its disadvantage is low precision: because the deformed object cannot fill the whole rectangular frame, a large gap is left.
The other method labels each object with an arbitrary quadrilateral, which requires selecting four points per target. Its advantage is higher labeling precision, and careful labeling personnel can obtain higher-quality labeling data. Its disadvantages are the higher labeling workload (four points are needed) and susceptibility to error: a slight deviation in the position of one point during actual operation deforms the whole quadrilateral considerably and forces frequent corrections.
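For illustration only, the two annotation formats contrasted above can be represented as follows; this is a minimal sketch, and the field names are assumptions rather than formats defined by the patent.

```python
# Two-point rectangular annotation: upper-left and lower-right corners.
rect_label = {"label": "text", "box": [(120, 40), (360, 80)]}

# Four-point quadrilateral annotation: one coordinate per vertex,
# typically listed clockwise starting from the upper-left corner.
quad_label = {"label": "text",
              "quad": [(118, 52), (355, 34), (362, 75), (125, 93)]}
```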
Disclosure of Invention
The invention aims to provide an efficient data labeling method that solves the problems described above: low precision, because the deformed object cannot fill the whole rectangular frame and a large gap is left, as well as high working intensity and susceptibility to error.
In order to achieve this purpose, the invention provides the following technical scheme. An efficient data labeling method comprises the following labeling steps:
S1: input the image to be labeled: transmit the image to be labeled to a data labeling platform so that the labeling system and the labeling personnel can process and label it;
S2: apply a projective transformation to the image so that the shape of the labeling target becomes approximately rectangular: establish a planar rectangular coordinate system with the left and upper edges of the display area as the Y axis and the X axis and their intersection as the origin, apply a projective transformation to the image input in step S1 so that the target to be labeled becomes approximately rectangular, and place the approximately rectangular target in the middle of the view;
S3: label with the rectangular labeling method: after the image to be labeled has been projectively transformed to the horizontal, it can conveniently be labeled with the rectangular labeling method, and a rectangular labeling frame is obtained by selecting only two points, the upper-left and lower-right corners of the rectangle;
S4: inverse coordinate transformation: the coordinates obtained by the labeling in step S3 can be regarded as coordinates after the projective transformation, and inversely transforming them with the projection matrix obtained in step S2 yields the coordinates of the corresponding positions on the original image;
S5: obtain the labeling information corresponding to the original image: output the coordinates of the corresponding positions on the original image after the inverse transformation of step S4, thereby obtaining the labeling information corresponding to the original image (a minimal end-to-end sketch of steps S1 to S5 is given below).
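For illustration only, the following Python/OpenCV sketch shows one way steps S1 to S5 could be wired together. The file path, the corner correspondences used to build the projection matrix, and the labeled rectangle are assumptions; in the method itself the projection matrix is composed interactively from the rotation, flipping, translation and scaling operations described below rather than from point correspondences.

```python
import cv2
import numpy as np

# S1: load the image to be labeled (the path is a placeholder).
original = cv2.imread("sample_bill.jpg")

# S2: projective transformation that makes the target approximately rectangular.
# The source/destination corner pairs below are illustrative values only.
src = np.float32([[105, 60], [520, 35], [540, 240], [90, 255]])
dst = np.float32([[0, 0], [600, 0], [600, 200], [0, 200]])
M = cv2.getPerspectiveTransform(src, dst)             # 3x3 projection matrix
warped = cv2.warpPerspective(original, M, (600, 200))

# S3: rectangular labeling on the warped image: two points (upper-left and
# lower-right), expanded to the four vertices of the rectangle.
(x1, y1), (x2, y2) = (20, 30), (580, 170)
rect = np.float32([[[x1, y1]], [[x2, y1]], [[x2, y2]], [[x1, y2]]])

# S4: inverse coordinate transformation with the inverse projection matrix,
# giving a quadrilateral at the corresponding position on the original image.
quad = cv2.perspectiveTransform(rect, np.linalg.inv(M))

# S5: output the labeling information for the original image.
print("label on original image:", quad.reshape(-1, 2).tolist())
```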
Preferably, the projective transformation processing modes include rotation, flipping, translation and scaling.
Preferably, the rotational projective transformation is divided into three parts: the first part translates the center of the image to the origin, the second part rotates by an angle θ, and the third part translates the center of the image back.
Preferably, the flipping projective transformation specifically mirrors the image about an arbitrary straight line in the display area.
Preferably, the translational projective transformation specifically translates the image center to the origin and then moves the image center, carrying the image with it, horizontally and vertically; the horizontal and vertical movements are half the horizontal length and half the vertical length of the display area, respectively.
Preferably, the scaling projective transformation specifically selects the center point of the display area as the scaling point and scales the image by a factor of N.
Compared with the prior art, the invention has the following beneficial effects:
1) labeling time is saved: when a single image sample contains many targets to be labeled in the same orientation (form), for example a bill sample containing many text boxes at the same angle, one projective transformation greatly reduces the labeling difficulty for all targets and therefore greatly increases the labeling speed;
2) labeling precision is high: the resulting labeling frame fits the target closely, with small gaps and high precision.
Drawings
FIG. 1 is a flow chart of the labeling method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "upper", "lower", "front", "rear", "left", "right", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, are merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention.
Referring to FIG. 1, the present invention provides the following technical solution: an efficient data labeling method comprising the following labeling steps:
S1: input the image to be labeled: the image to be labeled is transmitted to a data labeling platform so that the labeling system and the labeling personnel can process and label it; usually, the image requiring data labeling is placed manually in the recognition area and captured by a camera;
S2: apply a projective transformation to the image so that the shape of the labeling target becomes approximately rectangular: establish a planar rectangular coordinate system with the left and upper edges of the display area as the Y axis and the X axis and their intersection as the origin, apply a projective transformation to the image input in step S1 so that the target to be labeled becomes approximately rectangular, and place the approximately rectangular target in the middle of the view; the established coordinate system covers the whole display area;
S3: label with the rectangular labeling method: after the image to be labeled has been projectively transformed to the horizontal, it can conveniently be labeled with the rectangular labeling method; a rectangular labeling frame is obtained by selecting two points, the upper-left and lower-right corners of the rectangle, and the area of the rectangular labeling frame covers the whole labeling target;
S4: inverse coordinate transformation: the coordinates obtained by the labeling in step S3 can be regarded as coordinates after the projective transformation, and inversely transforming them with the projection matrix obtained in step S2 yields the coordinates of the corresponding positions on the original image;
S5: obtain the labeling information corresponding to the original image: output the coordinates of the corresponding positions on the original image after the inverse transformation of step S4, thereby obtaining the labeling information corresponding to the original image.
The projective transformation processing modes include rotation, flipping, translation and scaling; depending on the situation, a single one of these modes or a combination of two or more of them is used.
The rotational projective transformation is divided into three parts. The first part translates the center of the image to the origin, moving the whole image with the image center as the reference point. The second part rotates by an angle θ, whose value is chosen for the specific image so that the target to be labeled becomes approximately upright after the rotation. The third part translates the center of the image back, so that the target to be labeled is displayed upright in the middle of the display area.
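As a minimal sketch of this three-part decomposition (an illustration under the assumption that the matrices act on homogeneous coordinates; this is not code from the patent), the combined rotation about the image center can be composed from three 3x3 matrices:

```python
import numpy as np

def rotate_about_center(theta_rad, width, height):
    """Translate the image center to the origin, rotate by theta, translate back."""
    t_to_origin = np.array([[1, 0, -width / 2],
                            [0, 1, -height / 2],
                            [0, 0, 1]], dtype=float)
    rotate = np.array([[np.cos(theta_rad), -np.sin(theta_rad), 0],
                       [np.sin(theta_rad),  np.cos(theta_rad), 0],
                       [0, 0, 1]], dtype=float)
    t_back = np.array([[1, 0, width / 2],
                       [0, 1, height / 2],
                       [0, 0, 1]], dtype=float)
    # The matrices apply from right to left: translate to the origin first,
    # then rotate, then translate back.
    return t_back @ rotate @ t_to_origin
```

OpenCV offers the equivalent operation directly: cv2.getRotationMatrix2D(center, angle_in_degrees, scale) returns the corresponding 2x3 affine matrix.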
The flipping projective transformation specifically mirrors the image about an arbitrary straight line in the display area: the image is turned 180 degrees about that line, so that the flipped image information is displayed.
The translational projective transformation specifically translates the image center to the origin and then moves the image center, carrying the image with it, horizontally and vertically; the horizontal and vertical movements are half the horizontal length and half the vertical length of the display area, respectively. When the target to be labeled is located at the edge of the display area and is hard to see or recognize, the image is moved to the middle of the display area to make recognition easier.
The scaling projective transformation specifically selects the center point of the display area as the scaling point and scales the image by a factor of N. When the target to be labeled is small relative to the display area, recognition is inaccurate and difficult, so the image is enlarged until the proportion between the target to be labeled and the display area is moderate.
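To illustrate how these elementary transformations might be combined into a single projection matrix (a sketch with arbitrary parameter values, not an implementation prescribed by the patent; the flip is shown only for the special case of a vertical mirror line), each mode can be written as a 3x3 matrix and the modes multiplied together:

```python
import numpy as np

def translate(tx, ty):
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def scale_about(cx, cy, n):
    # Scale by a factor of N about the point (cx, cy), e.g. the display-area center.
    return translate(cx, cy) @ np.diag([n, n, 1.0]) @ translate(-cx, -cy)

def flip_about_vertical_line(x0):
    # Mirror the image about the vertical line x = x0.
    mirror = np.array([[-1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
    return translate(x0, 0) @ mirror @ translate(-x0, 0)

# Example composition, applied from right to left: flip, then scale about the
# display-area center (400, 300), then translate. The values are illustrative.
M = translate(50, -20) @ scale_about(400, 300, 2.0) @ flip_about_vertical_line(400)
```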
Example:
The method is illustrated with a rotation and translation transformation. The bill image sample below contains rotated text that is inconvenient to label directly, so the image is rotated about its center until the text is horizontal (this is done manually by the labeling personnel). The three parts of the rotational projective transformation correspond to the three transformation matrices (applied from right to left) in the following formula, where width and height are the width and height of the image and x' and y' are the transformed coordinates:
\[
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} =
\begin{pmatrix} 1 & 0 & width/2 \\ 0 & 1 & height/2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & -width/2 \\ 0 & 1 & -height/2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
\]
Once the transformation relation has been determined, each pixel value of the transformed image is obtained by mapping its coordinates back to the corresponding position on the original image and interpolating.
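As an illustrative sketch of this inverse mapping (not code from the patent; a nearest-neighbour lookup is used here for brevity, whereas in practice cv2.warpPerspective performs the same inverse mapping with bilinear interpolation):

```python
import numpy as np

def warp_by_inverse_mapping(src, M, out_w, out_h):
    """Fill each output pixel by mapping its coordinates back through M^-1."""
    M_inv = np.linalg.inv(M)
    dst = np.zeros((out_h, out_w) + src.shape[2:], dtype=src.dtype)
    for y in range(out_h):
        for x in range(out_w):
            sx, sy, w = M_inv @ np.array([x, y, 1.0])
            sx, sy = sx / w, sy / w              # homogeneous divide
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < src.shape[1] and 0 <= iy < src.shape[0]:
                dst[y, x] = src[iy, ix]          # nearest-neighbour interpolation
    return dst
```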
Adjusting the target to a horizontal rectangular shape by rotation and translation alone is sometimes not sufficient, and other kinds of projective transformation, or combinations of them, are needed. However, every projective transformation and every combination of projective transformations can be expressed by the following formula, in which the projection matrix is often a product of several transformation matrices.
\[
\begin{pmatrix} x' \\ y' \\ w \end{pmatrix} = M \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}, \qquad M = M_n \cdots M_2 M_1
\]
where M is the 3x3 projection matrix and the transformed image coordinates are (x'/w, y'/w).
When the labeling system is actually implemented, the projection matrix can be decomposed into several different transformation matrices whose parameters are set separately, so that the labeling personnel can conveniently perform the corresponding operations in the labeling system.
After the image to be labeled has been rotated to the horizontal, it can conveniently be labeled with the rectangular labeling method: a rectangular labeling frame (generally stored as the coordinates of the four vertices of the rectangle) is obtained by selecting only two points, the upper-left and lower-right corners of the rectangle.
The coordinates obtained by this labeling can be regarded as coordinates after the projective transformation, and inversely transforming them with the projection matrix obtained before yields the coordinates of the corresponding positions on the original image (the coordinates of the four vertices of a quadrilateral).
\[
\begin{pmatrix} x \\ y \\ w \end{pmatrix} = M^{-1} \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}
\]
where (x', y') is a labeled vertex on the transformed image and the corresponding position on the original image is (x/w, y/w).
In this way, accurate target-frame position information on the original image is obtained.
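A minimal sketch of this back-projection of the four rectangle vertices (assuming M is the 3x3 projection matrix used for the projective transformation; the function and variable names are illustrative):

```python
import numpy as np

def rect_to_original(top_left, bottom_right, M):
    """Map the four vertices of a labeled rectangle back onto the original image."""
    (x1, y1), (x2, y2) = top_left, bottom_right
    corners = np.array([[x1, y1, 1], [x2, y1, 1],
                        [x2, y2, 1], [x1, y2, 1]], dtype=float)
    mapped = (np.linalg.inv(M) @ corners.T).T      # homogeneous coordinates (x, y, w)
    return mapped[:, :2] / mapped[:, 2:3]          # divide by w: quadrilateral on the original image

# Example: quad = rect_to_original((20, 30), (580, 170), M)
```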
While the fundamental principles, essential features and advantages of the invention have been shown and described, it will be apparent to those skilled in the art that the invention is not limited to the details of the foregoing exemplary embodiments and can be embodied in other specific forms without departing from its spirit or essential characteristics. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description; all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein, and any reference signs in the claims shall not be construed as limiting the claims concerned.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (6)

1. An efficient data labeling method, characterized in that the efficient data labeling method comprises the following labeling steps:
S1: input the image to be labeled: transmit the image to be labeled to a data labeling platform so that the labeling system and the labeling personnel can process and label it;
S2: apply a projective transformation to the image so that the shape of the labeling target becomes approximately rectangular: establish a planar rectangular coordinate system with the left and upper edges of the display area as the Y axis and the X axis and their intersection as the origin, apply a projective transformation to the image input in step S1 so that the target to be labeled becomes approximately rectangular, and place the approximately rectangular target in the middle of the view;
S3: label with the rectangular labeling method: after the image to be labeled has been projectively transformed to the horizontal, it can conveniently be labeled with the rectangular labeling method, and a rectangular labeling frame is obtained by selecting only two points, the upper-left and lower-right corners of the rectangle;
S4: inverse coordinate transformation: the coordinates obtained by the labeling in step S3 can be regarded as coordinates after the projective transformation, and inversely transforming them with the projection matrix obtained in step S2 yields the coordinates of the corresponding positions on the original image;
S5: obtain the labeling information corresponding to the original image: output the coordinates of the corresponding positions on the original image after the inverse transformation of step S4, thereby obtaining the labeling information corresponding to the original image.
2. The efficient data labeling method of claim 1, characterized in that the projective transformation processing modes include rotation, flipping, translation and scaling.
3. The efficient data labeling method of claim 2, characterized in that the rotational projective transformation is divided into three parts: the first part translates the center of the image to the origin, the second part rotates by an angle θ, and the third part translates the center of the image back.
4. The efficient data labeling method of claim 2, characterized in that the flipping projective transformation specifically mirrors the image about an arbitrary straight line in the display area.
5. The efficient data labeling method of claim 2, characterized in that the translational projective transformation specifically translates the image center to the origin and then moves the image center, carrying the image with it, horizontally and vertically, the horizontal and vertical movements being half the horizontal length and half the vertical length of the display area, respectively.
6. The efficient data labeling method of claim 2, characterized in that the scaling projective transformation specifically selects the center point of the display area as the scaling point and scales the image by a factor of N.
CN201910738268.8A 2019-08-12 2019-08-12 Efficient data labeling method Active CN110659370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910738268.8A CN110659370B (en) 2019-08-12 2019-08-12 Efficient data labeling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910738268.8A CN110659370B (en) 2019-08-12 2019-08-12 Efficient data labeling method

Publications (2)

Publication Number Publication Date
CN110659370A true CN110659370A (en) 2020-01-07
CN110659370B CN110659370B (en) 2024-04-02

Family

ID=69036502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910738268.8A Active CN110659370B (en) 2019-08-12 2019-08-12 Efficient data labeling method

Country Status (1)

Country Link
CN (1) CN110659370B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114387598A (en) * 2021-12-02 2022-04-22 北京云测信息技术有限公司 Document labeling method and device, electronic equipment and storage medium
CN114531580A (en) * 2020-11-23 2022-05-24 北京四维图新科技股份有限公司 Image processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120201423A1 (en) * 2009-10-20 2012-08-09 Rakuten, Inc. Image processing apparatus, image processing method, image processing program and recording medium
CN108573279A (en) * 2018-03-19 2018-09-25 精锐视觉智能科技(深圳)有限公司 Image labeling method and terminal device
CN109978955A (en) * 2019-03-11 2019-07-05 武汉环宇智行科技有限公司 A kind of efficient mask method for combining laser point cloud and image
CN110097054A (en) * 2019-04-29 2019-08-06 济南浪潮高新科技投资发展有限公司 A kind of text image method for correcting error based on image projection transformation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120201423A1 (en) * 2009-10-20 2012-08-09 Rakuten, Inc. Image processing apparatus, image processing method, image processing program and recording medium
CN108573279A (en) * 2018-03-19 2018-09-25 精锐视觉智能科技(深圳)有限公司 Image labeling method and terminal device
CN109978955A (en) * 2019-03-11 2019-07-05 武汉环宇智行科技有限公司 A kind of efficient mask method for combining laser point cloud and image
CN110097054A (en) * 2019-04-29 2019-08-06 济南浪潮高新科技投资发展有限公司 A kind of text image method for correcting error based on image projection transformation

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114531580A (en) * 2020-11-23 2022-05-24 北京四维图新科技股份有限公司 Image processing method and device
CN114531580B (en) * 2020-11-23 2023-11-21 北京四维图新科技股份有限公司 Image processing method and device
CN114387598A (en) * 2021-12-02 2022-04-22 北京云测信息技术有限公司 Document labeling method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110659370B (en) 2024-04-02

Similar Documents

Publication Publication Date Title
US20220046218A1 (en) Disparity image stitching and visualization method based on multiple pairs of binocular cameras
CN100583151C (en) Double-camera calibrating method in three-dimensional scanning system
CN106650682B (en) Face tracking method and device
EP2202686B1 (en) Video camera calibration method and device thereof
CN110246124B (en) Target size measuring method and system based on deep learning
CN109272574B (en) Construction method and calibration method of linear array rotary scanning camera imaging model based on projection transformation
US9195121B2 (en) Markerless geometric registration of multiple projectors on extruded surfaces using an uncalibrated camera
CN101673397B (en) Digital camera nonlinear calibration method based on LCDs
CN110580723A (en) method for carrying out accurate positioning by utilizing deep learning and computer vision
US20120294537A1 (en) System for using image alignment to map objects across disparate images
CN101783018B (en) Method for calibrating camera by utilizing concentric circles
CN102446355B (en) Method for detecting target protruding from plane based on double viewing fields without calibration
CN110659370B (en) Efficient data labeling method
CN113379668B (en) Photovoltaic panel splicing method and device, electronic equipment and storage medium
CN105118086A (en) 3D point cloud data registering method and system in 3D-AOI device
CN114371472B (en) Automatic combined calibration device and method for laser radar and camera
CN106780308B (en) Image perspective transformation method
CN112948605B (en) Point cloud data labeling method, device, equipment and readable storage medium
CN105643265A (en) Detecting method for matching of mounting surfaces of large workpieces
JP2000171214A (en) Corresponding point retrieving method and three- dimensional position measuring method utilizing same
CN114792343B (en) Calibration method of image acquisition equipment, method and device for acquiring image data
CN106022333A (en) Vehicle license plate tilt image correcting method
CN111914856B (en) Layout method, device and system for plate excess material, electronic equipment and storage medium
CN115346041A (en) Point position marking method, device and equipment based on deep learning and storage medium
CN109242910A (en) A kind of monocular camera self-calibrating method based on any known flat shape

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee after: Shenzhen Huafu Technology Co.,Ltd.

Country or region after: China

Address before: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Patentee before: SHENZHEN HUAFU INFORMATION TECHNOLOGY Co.,Ltd.

Country or region before: China