CN113763513A - Interactive marking method for target object in image

Info

Publication number: CN113763513A
Application number: CN202110942463.XA
Authority: CN
Inventors: 罗兵; 徐志敏; 李圣田; 朱和平; 雷苏琪; 马能武; 黄祥虎; 陶蔚; 曹胜中; 何涛; 王炜; 曾志群; 尹强宏
Original assignee: Changjiang Spatial Information Technology Engineering Co ltd; Wan'an Hydropower Plant Of Jiangxi Electric Power Co Ltd Of National Energy Group
Current assignee: Changjiang Spatial Information Technology Engineering Co ltd; Wan'an Hydropower Plant Of Jiangxi Electric Power Co Ltd Of National Energy Group
Priority date: 2021-08-17
Filing date: 2021-08-17
Publication date: 2021-12-07
Anticipated expiration: 2041-08-17
Also published as: CN113763513B

Abstract

The invention discloses an interactive marking method for a target object in an image. The method comprises the following steps. Step one, loading image data for the client: the image to be marked is stored at a server and published externally as a service; when the client requests a picture from the server, the server returns a compressed, first-scaled version of the original picture. Step two, rendering and marking the image at the client: the client requests the picture to be marked from the server and renders the scaled image in the marking area; the client then performs marking operations on the target object in the image within the marking area to form a marking result. Step three, converting and storing the coordinates of the marking result: the point coordinates that make up the client's marking result are converted into pixel coordinates of the server's original picture, and the converted result is stored in a database by calling a server interface. The invention enables multi-person collaborative image marking in a network environment while reducing the hardware performance requirements of the client.

Description

Interactive marking method for target object in image

Technical Field

The invention relates to the technical field of computer vision, in particular to an interactive marking method for a target object in an image.

Background

Image marking is a preprocessing step that supports object detection in images: a user clicks on, boxes, or plots a particular object in an image so that the object can be further processed by a computer. Image marking tools can be used to create training data sets and are widely applied in artificial intelligence and machine learning.

Currently, the widely used image marking tools are LabelMe (http://labelme.csail.mit.edu/Release3.0) and LabelImg (https://github.com/tzutalin/labelImg). LabelImg supports axis-aligned rectangular marks and by default saves the marking result as an XML file in PASCAL VOC format. LabelMe supports polygonal marks by default, also supports axis-aligned rectangles, points, lines, and circles, and by default saves marking results as JSON files.

Both tools must be downloaded and installed locally and can only open local image data, which makes multi-user collaborative marking difficult. In addition, when there are many images and each image is large, the local device needs ample storage space and memory to load and mark images smoothly.

Therefore, it is necessary to develop an image tagging method that can implement multi-user collaborative image tagging in a network environment and reduce the performance requirements of large-size and large-quantity image data on the hardware of the local device.

Disclosure of Invention

The invention aims to provide an interactive marking method for a target object in an image that realizes multi-person collaborative image marking in a network environment and has the server perform image storage and compression, thereby reducing the hardware performance requirements of the client. It addresses two problems of existing image marking tools: the tool and the image data must be installed or stored locally, which makes multi-person collaborative marking difficult; and when the image data volume is large, images are difficult to load smoothly on devices with low hardware configuration, which reduces marking efficiency.

To achieve this purpose, the technical scheme of the invention is an interactive marking method for a target object in an image, characterized in that it comprises the following steps:

step one: loading image data for the client;

the image to be marked is stored at a server and published externally as a service; a networked architecture is adopted in which the image data is placed at the server for external access and marking results are stored in a database, so that multi-user collaborative marking can be realized;

when the client requests a picture from the server, the server returns a compressed, first-scaled version of the original picture; compression reduces the data volume, so that, provided the target object remains clear and identifiable, the data can be transmitted over the network more efficiently, which supports multi-user collaborative marking;

step two: rendering and marking the client side image;

the client requests the picture to be marked from the server (this picture is the compressed, first-scaled version of the original picture), and the scaled image is rendered in the marking area;

after rendering is completed, the client performs interactive marking operations on the target object in the image within the marking area to form a marking result;

step three: converting and storing the coordinates of the marking result;

the point coordinates that make up the client's marking result are local coordinates relative to the marking area and need to be converted into pixel coordinates of the server's original picture;

after conversion, the result is stored in a database by calling a server interface (as shown in Fig. 1); through this coordinate conversion, the marking result of the target object is accurately back-calculated from coordinates on the compressed image to coordinates on the original image, which ensures the reliability of the marking result.
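As an illustration of the storage part of step three, the following minimal Python sketch saves converted marking results in a database shared by several users. The table name `mark_result`, its columns, and the helper functions `open_store`, `save_mark`, and `load_marks` are hypothetical; the method only requires that results be saved through a server interface, and SQLite stands in here for whatever database the server actually uses.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) result table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mark_result ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " image_id TEXT NOT NULL,"
        " user_id TEXT NOT NULL,"
        " shape TEXT NOT NULL,"
        " points_json TEXT NOT NULL)"
    )
    return conn


def save_mark(conn, image_id, user_id, shape, points):
    """Store one marking result; points are original-picture pixel coordinates."""
    cur = conn.execute(
        "INSERT INTO mark_result (image_id, user_id, shape, points_json)"
        " VALUES (?, ?, ?, ?)",
        (image_id, user_id, shape, json.dumps(points)),
    )
    conn.commit()
    return cur.lastrowid


def load_marks(conn, image_id):
    """Return all marking results for one image, from every user, in insert order."""
    rows = conn.execute(
        "SELECT user_id, shape, points_json FROM mark_result"
        " WHERE image_id = ? ORDER BY id",
        (image_id,),
    ).fetchall()
    return [(user, shape, json.loads(pts)) for user, shape, pts in rows]
```

Because every result is keyed by image and user, several clients can mark the same picture and later read back each other's results, which is the collaborative behaviour the networked architecture is meant to provide.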

In the above technical solution, in step two, the interactive marking operation includes interactive clicking, box selection, plotting, and the like.

In the above technical solution, in step two, the picture may be translated or scaled relative to the marking area during marking.
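One way a client might keep track of translation and scaling is sketched below in Python. The class `ViewState` and its method names are hypothetical. The sketch assumes that the second scaling multiple R is a reduction factor (R > 1 means the picture is displayed smaller) and that (dx, dy) is the offset of the picture's upper-left corner relative to the upper-left corner of the marking area; zooming keeps the picture point under the cursor fixed, which is a common design choice rather than something the method prescribes.

```python
from dataclasses import dataclass


@dataclass
class ViewState:
    """Second-scaling state of the picture inside the marking area."""
    R: float = 1.0   # second scaling multiple; R > 1 means displayed smaller
    dx: float = 0.0  # offset of the picture's upper-left corner (area units)
    dy: float = 0.0

    def pan(self, drag_x, drag_y):
        """Translate the picture by a mouse drag."""
        self.dx += drag_x
        self.dy += drag_y

    def zoom(self, k, cx, cy):
        """Magnify the display by factor k, keeping the point under (cx, cy) fixed."""
        if k <= 0:
            raise ValueError("zoom factor must be positive")
        self.dx = cx - (cx - self.dx) * k
        self.dy = cy - (cy - self.dy) * k
        self.R /= k  # magnifying the display lowers the reduction factor

    def to_scaled_image(self, X, Y):
        """Map a marking-area point to coordinates in the picture sent by the server."""
        return ((X - self.dx) * self.R, (Y - self.dy) * self.R)


view = ViewState()
view.zoom(2, 100, 50)                  # wheel zoom in, cursor at (100, 50)
print(view.to_scaled_image(100, 50))   # (100.0, 50.0): same picture point as before
```

Only R and the offset need to be remembered between user actions, since they are exactly the quantities the later coordinate conversion consumes.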

In the above technical solution, as shown in fig. 2, the point coordinates forming the client marking result are converted into the pixel coordinates of the server original picture, and the specific method is as follows:

let the first scaling multiple applied to the original image be r, let the second scaling multiple applied within the client's marking area be R, and let the offset of the upper-left corner of the second-scaled image relative to the upper-left corner of the marking area be (Δx, Δy);

let the coordinates of a point of the client's marking result, relative to the marking area, be (X₀, Y₀), and let the pixel coordinates of the corresponding point in the server's original picture be (x₀, y₀); the conversion relationship satisfies the following formula:

$$x_0 = r \cdot R \cdot (X_0 - \Delta x), \qquad y_0 = r \cdot R \cdot (Y_0 - \Delta y) \tag{1}$$

in formula (1), both multiples are expressed as reduction factors: when R > 1, the image is reduced in the marking area by the second scaling; when R < 1, the image is magnified in the marking area by the second scaling; when R = 1, the second scaling leaves the displayed scale of the image unchanged.
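The conversion described above can be sketched in Python as follows. The sketch assumes that both scaling multiples are reduction factors (a multiple of 10 means the picture is shown at one tenth of its size, in line with R > 1 meaning reduction), and it names the first multiple `r` to distinguish it from the second multiple `R`. The function names are illustrative only.

```python
def to_original_pixels(X0, Y0, r, R, dx, dy):
    """Convert a marking-area point to pixel coordinates in the original picture.

    Subtracting the offset gives the position inside the displayed picture,
    multiplying by R undoes the second scaling, and multiplying by r undoes
    the first scaling performed by the server.
    """
    return (r * R * (X0 - dx), r * R * (Y0 - dy))


def to_marking_area(x0, y0, r, R, dx, dy):
    """Inverse conversion, e.g. for drawing stored results back onto the client."""
    return (x0 / (r * R) + dx, y0 / (r * R) + dy)


# Hypothetical values: server reduced the picture 10 times, the client then
# magnified it 2 times (R = 0.5) and shifted it by (20, 10) in the marking area.
print(to_original_pixels(120, 60, r=10, R=0.5, dx=20, dy=10))  # (500.0, 250.0)
```

Keeping the inverse alongside the forward conversion makes it easy to check a round trip: a point converted to original pixels and back should land on the same marking-area position.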

The methods of compressing and scaling the original image at the server, rendering the scaled image in the marking area, calling the server interface, and storing the converted result in the database are all prior art.

The invention has the following advantages:

1) a networked architecture is adopted, the image data is placed at a server side for external access, and a marking result is stored in a database, so that multi-user cooperative marking can be realized; the defect that in the prior art, a marking tool and image data need to be installed or stored locally, and multi-user cooperative marking is difficult to realize is overcome;

2) through image data compression and scaling of the server side, the image is transmitted to the client side in smaller data volume and size, and hardware requirements on the client side such as network bandwidth, storage capacity and memory size are effectively reduced; the defects that when the image data volume is large, images are difficult to smoothly load on equipment with low hardware configuration and the marking efficiency is influenced in the conventional image marking tool are overcome;

3) through coordinate conversion of the marking result, the marking result of the target object is accurately back-calculated from coordinates on the compressed image to coordinates on the original image, which ensures the reliability of the marking result.

Drawings

Fig. 1 is a schematic diagram of the general technical principle of the present invention.

FIG. 2 is a schematic diagram of coordinate transformation of the marking result in the present invention.

Fig. 3 is a schematic diagram of an interactive mark of a target object in an image according to an embodiment of the present invention.

From left to right, Fig. 3 consists of Fig. 3(1), Fig. 3(2), and Fig. 3(3); Fig. 3(1) is the original picture of the present embodiment; Fig. 3(2) is the compressed and scaled version of the original picture of the present embodiment; Fig. 3(3) shows the picture after the second scaling together with the marking area.

The light gray shaded area surrounding the picture in Fig. 3(3) is the marking area of step two.

Detailed Description

The embodiments of the present invention are described in detail below with reference to the accompanying drawings; they are merely exemplary and are not intended to limit the present invention. The advantages of the invention will become clear and readily understood from the description.

The technical scheme provides an interactive marking method for a target object in an image, which realizes multi-user cooperative image marking in a network environment through steps of server image data compression and release, client image rendering and marking, marking result coordinate conversion and storage and the like, and simultaneously completes image storage and compression tasks by a server, thereby reducing the hardware performance requirements of the client.

Examples

The present invention is now described in detail with reference to an embodiment in which it is applied to the interactive marking of a spherical lamp-shaped target object in an image; the embodiment also serves as guidance for applying the invention to the interactive marking of target objects in other images.

The results of an experiment using the present solution on image data of a spherical lamp-shaped target object are shown in Fig. 3.

Fig. 3(1) shows the raw picture data obtained by the image capturing device, in BMP format, with a data size of 19496 KB and pixel dimensions of 5472 × 3648. When individual original pictures are this large and there are many of them, conventional local marking requires a large hard disk and ample memory to store and quickly read the data.

Fig. 3(2) shows the picture data after compression into JPG format and reduction by a factor of 10, with a data size of 31 KB and pixel dimensions of 548 × 365. After compression the data volume is thus reduced by more than 600 times, so that, provided the target object remains clear and identifiable, the data can be transmitted over the network more efficiently, which supports multi-user collaborative marking.
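The figures quoted for this embodiment can be checked with a few lines of Python. The helper names are hypothetical, and rounding the scaled dimensions up is an assumption; it happens to reproduce the reported 548 × 365.

```python
import math


def scaled_size(width, height, factor):
    """Pixel dimensions after reducing a picture by `factor`, rounding up."""
    return (math.ceil(width / factor), math.ceil(height / factor))


def compression_ratio(original_kb, compressed_kb):
    """How many times smaller the compressed file is than the original."""
    return original_kb / compressed_kb


print(scaled_size(5472, 3648, 10))                # (548, 365)
print(round(compression_ratio(19496, 31), 1))     # 628.9
```

The ratio of about 629 is consistent with the statement that the data volume is reduced by more than 600 times.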

Fig. 3(3) shows the image data after it has been transmitted to the client and displayed in the marking area, where the user can zoom the image with the mouse wheel and translate it by dragging. The spherical lamp-shaped target object is marked by box selection with a rectangular frame in the marking area, and the marking result is shown as the solid-line frame in Fig. 3(3). The coordinates of the four corner points of the rectangular frame are converted to recover the corresponding rectangular range in the original picture, and the final result is shown as the dotted-line frame in Fig. 3(1).
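The corner conversion for a box-selected rectangle might look like the following Python sketch. The source does not give the rectangle's coordinates or the view state, so the corner values, the offset (26, 17), and the second multiple R = 1 are purely hypothetical; only the first multiple of 10 and the original dimensions of 5472 × 3648 come from the embodiment. Rounding to whole pixels and clamping to the picture bounds are additions of this sketch, and `r` is the name used here for the first scaling multiple.

```python
def rect_to_original(corners, r, R, dx, dy, width, height):
    """Convert marking-area rectangle corners to original-picture pixels.

    Each corner is shifted by the offset, scaled back by R and r, rounded
    to a whole pixel, and clamped so it stays inside the original picture.
    """
    result = []
    for X, Y in corners:
        x = round(r * R * (X - dx))
        y = round(r * R * (Y - dy))
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        result.append((x, y))
    return result


# Hypothetical solid-line frame drawn in the marking area.
frame = [(226, 117), (326, 117), (326, 217), (226, 217)]
print(rect_to_original(frame, r=10, R=1, dx=26, dy=17, width=5472, height=3648))
# [(2000, 1000), (3000, 1000), (3000, 2000), (2000, 2000)]
```

With these values one marking-area pixel corresponds to r · R = 10 original pixels, so zooming in on the client (R < 1) before drawing the frame yields a finer placement in the original picture.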

In the embodiment, the marking result of the spherical lamp-shaped target object is accurately and inversely calculated from the coordinates on the compressed image to the coordinates on the original image through the coordinate conversion of the marking result, so that the reliability of the marking result is ensured.

Other parts not described belong to the prior art.

Claims

1. An interactive marking method for a target object in an image, characterized in that it comprises the following steps:

step one: loading image data for the client;

the image to be marked is stored at a server and published externally as a service;

when the client requests a picture from the server, the server returns a compressed, first-scaled version of the original picture;

step two: rendering and marking the image at the client;

the client requests the picture to be marked from the server, and the scaled image is rendered in the marking area;

after rendering is completed, the client performs marking operations on the target object in the image within the marking area to form a marking result;

step three: converting and storing the coordinates of the marking result;

the point coordinates that make up the client's marking result are converted into pixel coordinates of the server's original picture, and the converted result is stored in a database by calling a server interface.

2. The interactive marking method for a target object in an image according to claim 1, characterized in that: in step two, the marking operation comprises interactive clicking, box selection, and plotting.

3. The interactive marking method for a target object in an image according to claim 2, characterized in that: in step two, the picture is translated or scaled relative to the marking area during marking.

4. The interactive marking method for a target object in an image according to claim 3, characterized in that: the point coordinates that make up the client's marking result are converted into pixel coordinates of the server's original picture, and the specific method is as follows: