CN113763513A - Interactive marking method for target object in image - Google Patents
- Publication number
- CN113763513A (application CN202110942463.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- marking
- client
- server
- result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/80—Creating or modifying a manually drawn or painted image using a manual input device, e.g. mouse, light pen, direction keys on keyboard
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/5866—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/463—Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Library & Information Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Processing Or Creating Images (AREA)
- Editing Of Facsimile Originals (AREA)
Abstract
The invention discloses an interactive marking method for a target object in an image. The method comprises the following steps. Step one, client image data loading: the image to be marked is stored on a server and published externally as a service; when a client requests to load a picture from the server, the server returns a compressed, first-time scaled version of the original picture. Step two, client image rendering and marking: the client requests the picture to be marked from the server and renders the scaled image in the marking area; the client then marks the target object in the image within the marking area to form a marking result. Step three, marking result coordinate conversion and storage: the point coordinates forming the client marking result are converted into pixel coordinates of the server's original picture, and the converted result is stored in a database by calling a server interface. The invention enables multi-person collaborative image marking in a network environment while reducing the hardware performance requirements of the client.
Description
Technical Field
The invention relates to the technical field of computer vision, in particular to an interactive marking method for a target object in an image.
Background
Image marking is a preprocessing step that assists image object detection: a user clicks, box-selects, or plots a particular object in an image so that the object can be further processed by a computer. Image marking tools are used to create training data sets and are widely applied in artificial intelligence and machine learning.
Currently, the widely used image labeling tools are LabelMe (http://labelme.csail.mit.edu/Release3.0) and LabelImg (https://github.com/tzutalin/labelImg). LabelImg supports upright (axis-aligned) rectangular labels and by default saves the labeling result as an XML file in PASCAL VOC format. LabelMe supports polygonal labels by default, also supports upright rectangles, points, lines, and circles, and by default saves labeling results as JSON files.
Both tools require an installation package to be downloaded and installed locally, and both can open only local image data, so multi-user collaborative marking is difficult to realize. Moreover, when there are many images and each image is large, the local device needs ample storage space and memory to load and mark the images smoothly.
Therefore, it is necessary to develop an image marking method that enables multi-user collaborative image marking in a network environment and reduces the hardware demands that large and numerous images place on the local device.
Disclosure of Invention
The invention aims to provide an interactive marking method for a target object in an image that realizes multi-person collaborative image marking in a network environment while leaving image storage and compression to the server, thereby reducing the hardware performance requirements of the client. It solves two problems of existing image marking tools: the tool and the image data must be installed or stored locally, which makes multi-person collaborative marking difficult; and when the image data volume is large, images are hard to load smoothly on devices with low hardware configuration, which reduces marking efficiency.
To achieve this purpose, the technical scheme of the invention is as follows: an interactive marking method for a target object in an image, characterized by comprising the following steps:
step one: client image data loading;
the image to be marked is stored on a server and published externally as a service; a networked architecture is adopted in which the image data is placed on the server for external access and the marking result is stored in a database, so that multi-user collaborative marking can be realized;
when a client requests to load a picture from the server, the server returns a compressed, first-time scaled version of the original picture; compression reduces the data volume, so that, provided the target object remains clear and identifiable, the data is transmitted more easily over the network, enabling multi-user collaborative marking;
step two: client image rendering and marking;
the client requests the picture to be marked from the server (the picture is a compressed, first-time scaled version of the original picture), and the scaled image is rendered in the marking area;
after rendering is completed, the client performs an interactive marking operation on the target object in the image within the marking area to form a marking result;
step three: marking result coordinate conversion and storage;
the point coordinates forming the client marking result are local relative coordinates in the marking area and need to be converted into pixel coordinates of the server's original picture;
the converted result is stored in a database by calling a server interface (as shown in Fig. 1); through this coordinate conversion, the marking result of the target object is accurately back-calculated from coordinates on the compressed image to coordinates on the original image, which ensures the reliability of the marking result.
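The storage half of step three can be sketched as follows. This is a minimal illustration, not the patent's implementation: the source only says that the converted result is stored in a database through a server interface, so the use of SQLite, the `marks` table, its columns, and the function names are all hypothetical.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) marks table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS marks (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               image_id TEXT NOT NULL,
               user_id TEXT NOT NULL,
               shape TEXT NOT NULL,
               points_json TEXT NOT NULL
           )"""
    )
    return conn


def save_mark(conn, image_id, user_id, shape, points):
    """Store one marking result; points are already original-picture pixels."""
    cur = conn.execute(
        "INSERT INTO marks (image_id, user_id, shape, points_json) "
        "VALUES (?, ?, ?, ?)",
        (image_id, user_id, shape, json.dumps(points)),
    )
    conn.commit()
    return cur.lastrowid


def load_marks(conn, image_id):
    """Return every user's marks for one image, in insertion order."""
    rows = conn.execute(
        "SELECT user_id, shape, points_json FROM marks "
        "WHERE image_id = ? ORDER BY id",
        (image_id,),
    ).fetchall()
    return [(user, shape, json.loads(points)) for user, shape, points in rows]
```

Because every stored point is expressed in original-picture pixels, marks made by different users at different zoom levels remain directly comparable, which is what makes the shared database usable for collaborative marking.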
In the above technical solution, in step two, the interactive marking operation includes interactive clicking, box selection, plotting, and the like.
In the above technical solution, in step two, the picture may be translated or scaled relative to the marking area during the marking process.
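Translation and scaling relative to the marking area can be modeled as a small view state. The sketch below is an assumption-laden illustration: the source does not say how panning and zooming update the view, so the `View` class, its method names, and the choice to keep the point under the cursor fixed while zooming are all hypothetical. Here `r` is a second-time scaling multiple for which `r > 1` shrinks the served image, and `(dx, dy)` is the offset of the image's upper-left corner from the upper-left corner of the marking area.

```python
from dataclasses import dataclass


@dataclass
class View:
    r: float = 1.0   # second-time scaling multiple; r > 1 shrinks the image
    dx: float = 0.0  # x offset of the image's upper-left corner in the area
    dy: float = 0.0  # y offset of the image's upper-left corner in the area

    def pan(self, move_x, move_y):
        """Drag the image by (move_x, move_y) marking-area units."""
        self.dx += move_x
        self.dy += move_y

    def zoom_at(self, cx, cy, new_r):
        """Change r while keeping the image point under (cx, cy) fixed."""
        if new_r <= 0:
            raise ValueError("scaling multiple must be positive")
        k = self.r / new_r
        self.dx = cx - (cx - self.dx) * k
        self.dy = cy - (cy - self.dy) * k
        self.r = new_r

    def to_area(self, u, v):
        """Map pixel (u, v) of the served image to marking-area coordinates."""
        return (self.dx + u / self.r, self.dy + v / self.r)

    def to_image(self, X, Y):
        """Map marking-area coordinates back to pixels of the served image."""
        return ((X - self.dx) * self.r, (Y - self.dy) * self.r)
```

Keeping only `r`, `dx`, and `dy` as state means the client always has exactly the quantities it needs to map a marked point back onto the served image, however many times the user has panned or zoomed.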
In the above technical solution, as shown in Fig. 2, the point coordinates forming the client marking result are converted into pixel coordinates of the server's original picture as follows:
let the first-time scaling multiple of the original picture be R, let the second-time scaling multiple in the client's marking area be r, and let the offset of the upper-left corner of the second-time scaled image relative to the upper-left corner of the marking area be (Δx, Δy);
let the relative coordinates in the marking area of a point forming the client marking result be (X₀, Y₀), and let the pixel coordinates of the corresponding point in the server's original picture be (x₀, y₀); the conversion relationship between them satisfies formula (1);
in formula (1), r > 1 means that the image is reduced in the marking area by the second-time scaling; r < 1 means that it is magnified; r = 1 means that it is displayed without further scaling.
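Formula (1) itself is not reproduced in this text. One form consistent with the definitions above is given below as a reconstruction, not as the patent's verbatim formula. It writes R for the first-time scaling multiple and r for the second-time scaling multiple, and assumes that both are reduction factors: the embodiment's first-time scaling "reduced by 10 times" then corresponds to R = 10, and a second-time multiple greater than 1 shrinks the image.

```latex
\begin{equation}
\begin{cases}
x_0 = R \, r \, (X_0 - \Delta x) \\
y_0 = R \, r \, (Y_0 - \Delta y)
\end{cases}
\tag{1}
\end{equation}
```

Read from the inside out, subtracting (Δx, Δy) gives the point's position within the displayed image, multiplying by r undoes the second-time scaling, and multiplying by R undoes the first-time scaling performed on the server.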
Compressing and scaling the original picture on the server, rendering the scaled image in the marking area, calling the server interface, and storing the converted result in the database are all done with prior-art methods.
The invention has the following advantages:
1) a networked architecture is adopted in which the image data is placed on the server for external access and the marking result is stored in a database, so that multi-user collaborative marking can be realized; this overcomes the defect of the prior art that the marking tool and image data must be installed or stored locally, which makes multi-user collaborative marking difficult;
2) through server-side compression and scaling, the image is transmitted to the client with a smaller data volume and size, which effectively reduces the client's requirements for network bandwidth, storage capacity, and memory size; this overcomes the defect of existing image marking tools that, when the image data volume is large, images are hard to load smoothly on devices with low hardware configuration, which reduces marking efficiency;
3) through coordinate conversion of the marking result, the marking result of the target object is accurately back-calculated from coordinates on the compressed image to coordinates on the original image, which ensures the reliability of the marking result.
Drawings
Fig. 1 is a schematic diagram of the general technical principle of the present invention.
Fig. 2 is a schematic diagram of the coordinate conversion of the marking result in the present invention.
Fig. 3 is a schematic diagram of interactive marking of a target object in an image according to an embodiment of the present invention.
From left to right, Fig. 3 comprises Fig. 3(1), Fig. 3(2), and Fig. 3(3): Fig. 3(1) is the original picture of the embodiment; Fig. 3(2) is the original picture after compression and first-time scaling; Fig. 3(3) shows the picture after the second-time scaling together with the marking area.
The light gray shaded area surrounding the picture in Fig. 3(3) is the marking area of step two.
Detailed Description
The embodiments of the present invention are described in detail below with reference to the accompanying drawings; they are merely exemplary and do not limit the present invention. The advantages of the invention will become clear and readily understood from the description.
The technical scheme provides an interactive marking method for a target object in an image. Through server-side image compression and publication, client-side image rendering and marking, and marking result coordinate conversion and storage, it realizes multi-user collaborative image marking in a network environment while leaving image storage and compression to the server, thereby reducing the hardware performance requirements of the client.
Examples
The present invention is described in detail below using an embodiment in which it is applied to the interactive marking of a spherical lamp-shaped target object in an image; the embodiment also serves as guidance for applying the invention to the interactive marking of target objects in other images.
The result of an experiment applying this solution to image data of a spherical lamp-shaped target object is shown in Fig. 3.
Fig. 3(1) shows the original picture data obtained by the image capturing device: BMP format, a data size of 19496 KB, and dimensions of 5472 × 3648 pixels. When single original pictures are this large and there are many of them, conventional local marking requires a large hard disk and memory to store the data and read it quickly.
Fig. 3(2) shows the picture data after compression to JPG format and reduction by a factor of 10: the data size is 31 KB and the dimensions are 548 × 365 pixels. After compression the data volume is thus reduced by a factor of more than 600, so that, provided the target object remains clear and identifiable, the data is transmitted more easily over the network, enabling multi-user collaborative marking.
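The figures quoted for the embodiment can be checked with a few lines of arithmetic. The helper names below are hypothetical, and rounding each side up is an assumption: the source does not state a rounding rule, but rounding up reproduces the stated 548 × 365.

```python
import math


def scaled_size(width, height, multiple):
    """Size after reducing by `multiple`, rounding each side up (assumed rule)."""
    return (math.ceil(width / multiple), math.ceil(height / multiple))


def reduction_ratio(original_kb, compressed_kb):
    """How many times smaller the compressed data is than the original."""
    return original_kb / compressed_kb


size = scaled_size(5472, 3648, 10)
ratio = reduction_ratio(19496, 31)
print(size)             # (548, 365)
print(round(ratio, 1))  # 628.9
```

The ratio of about 629 agrees with the statement that the data volume is reduced by a factor of more than 600.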
Fig. 3(3) shows the image data as transmitted to the client and displayed in its marking area, where the user can zoom the image with the mouse wheel and translate it by dragging. The spherical lamp-shaped target object is box-selected in the marking area with a rectangular frame, and the marking result is shown as the solid-line frame in Fig. 3(3). The coordinates of the four corner points of the rectangular frame are converted to recover the corresponding rectangle in the original picture, and the final result is shown as the dotted-line frame in Fig. 3(1).
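The corner conversion described here can be sketched as below. The sketch assumes the conversion x0 = R·r·(X0 − Δx), y0 = R·r·(Y0 − Δy), where R is the first-time scaling multiple, r the second-time scaling multiple (both taken as reduction factors), and (Δx, Δy) the offset of the displayed image in the marking area; the patent's own formula is not reproduced in this text. The function names, the rounding and clamping to the picture bounds, and every numeric value except R = 10 and the 5472 × 3648 picture size are hypothetical.

```python
def to_original(X0, Y0, R, r, dx, dy):
    """Map a marking-area point to original-picture pixel coordinates."""
    return (R * r * (X0 - dx), R * r * (Y0 - dy))


def rect_to_original(corners, R, r, dx, dy, width, height):
    """Convert rectangle corners and clamp them to the original picture."""
    converted = []
    for X0, Y0 in corners:
        x0, y0 = to_original(X0, Y0, R, r, dx, dy)
        x0 = min(max(round(x0), 0), width - 1)
        y0 = min(max(round(y0), 0), height - 1)
        converted.append((x0, y0))
    return converted


# Hypothetical view: image magnified 2x in the marking area (r = 0.5),
# with its upper-left corner offset by (40, 20).
corners = [(240, 120), (440, 120), (440, 320), (240, 320)]
result = rect_to_original(corners, R=10, r=0.5, dx=40, dy=20,
                          width=5472, height=3648)
print(result)  # [(1000, 500), (2000, 500), (2000, 1500), (1000, 1500)]
```

Clamping is a defensive choice for marks drawn partly outside the displayed image; a system could equally reject such marks instead.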
In this embodiment, through coordinate conversion of the marking result, the marking result of the spherical lamp-shaped target object is accurately back-calculated from coordinates on the compressed image to coordinates on the original image, which ensures the reliability of the marking result.
Parts not described belong to the prior art.
Claims (4)
1. An interactive marking method for a target object in an image, characterized by comprising the following steps:
step one: client image data loading;
the image to be marked is stored on a server and published externally as a service;
when a client requests to load a picture from the server, the server returns a compressed, first-time scaled version of the original picture;
step two: client image rendering and marking;
the client requests the picture to be marked from the server, and the scaled image is rendered in the marking area;
after rendering is completed, the client performs a marking operation on the target object in the image within the marking area to form a marking result;
step three: marking result coordinate conversion and storage;
the point coordinates forming the client marking result are converted into pixel coordinates of the server's original picture, and the converted result is stored in a database by calling a server interface.
2. The method of claim 1, wherein in step two the marking operation comprises interactive clicking, box selection, and plotting.
3. The method of claim 2, wherein in step two the picture is translated or scaled relative to the marking area during the marking process.
4. The method of claim 3, wherein the point coordinates forming the client marking result are converted into pixel coordinates of the server's original picture as follows:
let the first-time scaling multiple of the original picture be R, let the second-time scaling multiple in the client's marking area be r, and let the offset of the upper-left corner of the second-time scaled image relative to the upper-left corner of the marking area be (Δx, Δy);
let the relative coordinates in the marking area of a point forming the client marking result be (X₀, Y₀), and let the pixel coordinates of the corresponding point in the server's original picture be (x₀, y₀); the conversion relationship between them satisfies formula (1);
in formula (1), r > 1 means that the image is reduced in the marking area by the second-time scaling; r < 1 means that it is magnified; r = 1 means that it is displayed without further scaling.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110942463.XA CN113763513B (en) | 2021-08-17 | 2021-08-17 | Interactive marking method for target object in image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113763513A | 2021-12-07 |
CN113763513B | 2024-09-06 |
Family
ID=78790070
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110942463.XA Active CN113763513B (en) | 2021-08-17 | 2021-08-17 | Interactive marking method for target object in image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113763513B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070226314A1 (en) * | 2006-03-22 | 2007-09-27 | Sss Research Inc. | Server-based systems and methods for enabling interactive, collabortive thin- and no-client image-based applications |
CN102289829A (en) * | 2011-07-12 | 2011-12-21 | 北京朗玛数联科技有限公司 | Method and device for storing and restoring image and system for processing image |
CN102970331A (en) * | 2012-10-26 | 2013-03-13 | 北京奇虎科技有限公司 | Image providing system |
CN107273492A (en) * | 2017-06-15 | 2017-10-20 | 复旦大学 | A kind of exchange method based on mass-rent platform processes image labeling task |
CN108184097A (en) * | 2018-01-15 | 2018-06-19 | 浙江大学 | A kind of real time inspection method of image in tele-medicine |
CN108228816A (en) * | 2017-12-29 | 2018-06-29 | 北京奇虎科技有限公司 | A kind of loading method and device of waterfall flow graph piece |
CN108537129A (en) * | 2018-03-14 | 2018-09-14 | 北京影谱科技股份有限公司 | The mask method of training sample, device and system |
CN110135323A (en) * | 2019-05-09 | 2019-08-16 | 北京四维图新科技股份有限公司 | Image labeling method, device, system and storage medium |
EP3633990A1 (en) * | 2018-10-02 | 2020-04-08 | Nokia Technologies Oy | An apparatus, a method and a computer program for running a neural network |
CN112966772A (en) * | 2021-03-23 | 2021-06-15 | 之江实验室 | Multi-person online image semi-automatic labeling method and system |
Non-Patent Citations (3)
Title |
---|
ANTONIO TORRALBA ET AL.: "LabelMe: online image annotation and applications", PROCEEDINGS OF THE IEEE, vol. 98, no. 08, 10 June 2010 (2010-06-10), pages 1467 - 1484 * |
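徐志敏 等 (XU Zhimin et al.): "顾及水面比降的河道水面三维可视化方法" [3D visualization method for river water surfaces that takes the water-surface slope into account], 长江科学院院报 (Journal of Yangtze River Scientific Research Institute), vol. 36, no. 10, 15 October 2019 (2019-10-15), pages 19 - 22 * |
林寿山 (LIN Shoushan): "基于图像标注的在线协作系统的设计与实现" [Design and implementation of an online collaboration system based on image annotation], 中国优秀硕士学位论文全文数据库 信息科技辑 (China Master's Theses Full-text Database, Information Science and Technology), no. 12, 15 December 2017 (2017-12-15), pages 139 - 204 * |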
Also Published As
Publication number | Publication date |
---|---|
CN113763513B (en) | 2024-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111127422B (en) | Image labeling method, device, system and host | |
CN109543489B (en) | Positioning method and device based on two-dimensional code and storage medium | |
WO2021248705A1 (en) | Image rendering method and apparatus, computer program and readable medium | |
CN110796143A (en) | Scene text recognition method based on man-machine cooperation | |
CN100504923C (en) | Grain engine for grain video signal processing, graph processor and method | |
CN112037129A (en) | Image super-resolution reconstruction method, device, equipment and storage medium | |
CN113439227B (en) | Capturing and storing enlarged images | |
CN102096945B (en) | Progressive transmission of spatial data and device | |
CN113570733B (en) | Graphics rendering method, device and system based on WebGL | |
WO2015038307A1 (en) | Guided image upsampling using bitmap tracing | |
CN104657934A (en) | Image data processing method and device | |
WO2020228346A1 (en) | Transformer substation three-dimensional digital modeling method, system and device and storage medium | |
CN112927163A (en) | Image data enhancement method and device, electronic equipment and storage medium | |
CN115544289A (en) | Web end processing method of large PCB vector diagram | |
CN107038199B (en) | Drawing method and device | |
CN113506305B (en) | Image enhancement method, semantic segmentation method and device for three-dimensional point cloud data | |
US20150084961A1 (en) | Map performance by dynamically reducing map detail | |
CN101060642B (en) | Method and apparatus for generating 3d on screen display | |
CN113763513A (en) | Interactive marking method for target object in image | |
CN116912158A (en) | Workpiece quality inspection method, device, equipment and readable storage medium | |
CN116137683A (en) | System and method for generating a composite image of a training database | |
CN114549282B (en) | Method and system for realizing multi-meter reading based on affine transformation | |
CN111696154B (en) | Coordinate positioning method, device, equipment and storage medium | |
CN114494799A (en) | Data labeling method and device for target element, terminal equipment and computer readable storage medium | |
CN113963289A (en) | Target detection method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||