CN114550041B - Multi-target labeling method for shooting video by multiple cameras - Google Patents

Multi-target labeling method for shooting video by multiple cameras

Info

Publication number
CN114550041B
CN114550041B (application CN202210152739.9A)
Authority
CN
China
Prior art keywords
labeling
video data
target
video
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210152739.9A
Other languages
Chinese (zh)
Other versions
CN114550041A (en)
Inventor
李向阳 (Li Xiangyang)
张正 (Zhang Zheng)
张兰 (Zhang Lan)
雷佳谕 (Lei Jiayu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China (USTC)
Priority to CN202210152739.9A
Publication of CN114550041A
Application granted
Publication of CN114550041B

Landscapes

  • Studio Devices (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-target labeling method for videos shot by multiple cameras, comprising the following steps: step 1, select two segments of video data of the same area shot by a front and a rear camera; step 2, align the two segments in time and correct their lens distortion; step 3, in the same-moment frames of the two segments, select four fixed points in the common area of each frame and connect them in order into a convex quadrilateral that serves as the labeling area; step 4, label targets with their corresponding IDs via bounding boxes in the labeling area of one frame, output the positions of the ID-labeled targets in the other frame through a ReID model, and have annotators correct those positions; step 5, repeat step 4 until all targets are labeled; and step 6, advance to the frames a preset duration later, run a target tracking model, have annotators correct its output, and run the target tracking model to correct the labeling information between the two frames, repeating until the multi-target labeling of the entire video data is complete. The method saves labeling time and labor and improves efficiency.

Description

Multi-target labeling method for shooting video by multiple cameras
Technical Field
The invention relates to the field of video data analysis, and in particular to a multi-target labeling method for area surveillance video.
Background
Existing video algorithms are quite mature, but target detection and tracking in dense crowds remain difficult, mainly because effective, clean data for training models is lacking. Existing labeling methods struggle in densely occupied areas such as classrooms, where labeling is tedious and occlusion is severe. Meanwhile, detecting small targets (e.g., targets occupying a small pixel area, below roughly 32×32 pixels) has so far remained one of the difficulties of target detection, an important reason being that small targets are under-represented in datasets. In the COCO dataset, labeling many small objects is very difficult: the objects are tiny and exhibit varying degrees of occlusion and blur.
Existing video labeling methods focus on optimizing the target tracking method to save annotators' clicks, but they handle poorly the small targets present in video (e.g., targets occupying less than roughly 32×32 pixels), which are hard to find; they therefore require many manual operations, consume labor, and label inefficiently.
In view of this, the present invention has been made.
Disclosure of Invention
The invention aims to provide a multi-target labeling method for videos shot by multiple cameras that can label multiple targets in area surveillance video captured by multiple cameras, significantly reduce the number of manual operations, save labor, and improve labeling efficiency, thereby solving the above technical problems in the prior art.
The aim of the invention is achieved through the following technical solution:
the embodiment of the invention provides a multi-target labeling method for shooting videos by multiple cameras, which comprises the following steps:
step 1, from the video data of the same area shot by multiple cameras, select two segments shot by two cameras arranged front and back as the two segments of video data to be labeled, both segments containing multiple targets;
step 2, align the times of the front and rear cameras that shot the two segments, and correct the lens distortion of both segments using the intrinsic parameters of the front and rear cameras;
step 3, in the frames of the two segments at the same moment, select four fixed points in the common area of each frame and connect them in order into a convex quadrilateral that serves as the labeling area;
step 4, in the labeling area of the frame of one segment processed in step 3, label each target to be labeled with its corresponding ID via a bounding box; output, through a ReID model, the positions of the ID-labeled targets in the labeling area of the same-moment frame of the other segment; and have annotators correct the positions output by the ReID model according to ID-matching errors or bounding-box offsets;
step 5, repeat step 4 until all targets in the same-moment frames of the two segments are labeled;
step 6, advance both segments to the frames a preset duration later, run a target tracking model to track the targets, have annotators correct the tracking model's output, and run the target tracking model in reverse to correct the labeling information between the two frames, thereby completing the multi-target labeling of one stretch of video;
continue advancing the two segments to frames after the preset duration and repeat step 6 until the multi-target labeling of the entire video data is complete.
Compared with the prior art, the multi-target labeling method for videos shot by multiple cameras provided by the invention has the following beneficial effects:
four fixed points are selected in the common area of the same-moment frames of the two segments of video data to be labeled and connected in order into a convex quadrilateral that serves as an auxiliary labeling area; after the multiple targets in the labeling area of one segment's frame are labeled, the multiple targets in the labeling area of the other segment's same-moment frame are labeled through a ReID model, which significantly reduces the number of manual operations, saves labor, and improves labeling efficiency. The method is well suited to the analysis and mining of video data.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of a multi-target labeling method for shooting video by a multi-camera according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a shooting area of a multi-target labeling method according to an embodiment of the present invention;
fig. 3 is a schematic diagram of correspondence between multiple target positions in video data captured by front and rear cameras according to the multiple target labeling method provided by the embodiment of the present invention;
fig. 4 is a schematic perspective view of a shooting area of a multi-target labeling method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a process flow of a multi-objective labeling method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a video data frame of a multi-target labeling method according to an embodiment of the present invention; wherein, (a) and (b) are respectively the initial pictures of the target actions in the front camera and the rear camera, and (c) and (d) are respectively the final pictures of the target actions in the front camera and the rear camera.
Detailed Description
The technical scheme in the embodiment of the invention is clearly and completely described below in combination with the specific content of the invention; it will be apparent that the described embodiments are only some embodiments of the invention, but not all embodiments, which do not constitute limitations of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
The terms that may be used herein will first be described as follows:
the term "and/or" is intended to mean that either or both may be implemented, e.g., X and/or Y are intended to include both the cases of "X" or "Y" and the cases of "X and Y".
The terms "comprises," "comprising," "includes," "including," "has," "having" or other similar referents are to be construed to cover a non-exclusive inclusion. For example: including a particular feature (e.g., a starting material, component, ingredient, carrier, formulation, material, dimension, part, means, mechanism, apparatus, step, procedure, method, reaction condition, processing condition, parameter, algorithm, signal, data, product or article of manufacture, etc.), should be construed as including not only a particular feature but also other features known in the art that are not explicitly recited.
The term "consisting of … …" is meant to exclude any technical feature element not explicitly listed. If such term is used in a claim, the term will cause the claim to be closed, such that it does not include technical features other than those specifically listed, except for conventional impurities associated therewith. If the term is intended to appear in only a clause of a claim, it is intended to limit only the elements explicitly recited in that clause, and the elements recited in other clauses are not excluded from the overall claim.
Unless specifically stated or limited otherwise, the terms "mounted," "connected," "secured," and the like should be construed broadly to include, for example: the connecting device can be fixedly connected, detachably connected or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the terms herein above will be understood by those of ordinary skill in the art as the case may be.
When concentrations, temperatures, pressures, dimensions, or other parameters are expressed as a range of values, the range is to be understood as specifically disclosing all ranges formed from any pair of upper and lower values within the range of values, regardless of whether ranges are explicitly recited; for example, if a numerical range of "2 to 8" is recited, that numerical range should be interpreted to include the ranges of "2 to 7", "2 to 6", "5 to 7", "3 to 4 and 6 to 7", "3 to 5 and 7", "2 and 5 to 7", and the like. Unless otherwise indicated, numerical ranges recited herein include both their endpoints and all integers and fractions within the numerical range.
The terms "center," "longitudinal," "transverse," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," etc. refer to an orientation or positional relationship based on that shown in the drawings, merely for ease of description and to simplify the description, and do not explicitly or implicitly indicate that the apparatus or element in question must have a particular orientation, be constructed and operated in a particular orientation, and therefore should not be construed as limiting the present disclosure.
The multi-target labeling method for videos shot by multiple cameras provided by the invention is described in detail below. Details not described in the embodiments belong to the prior art known to those skilled in the art. Where specific conditions are not noted in the examples, conventional conditions in the art or those suggested by the manufacturer apply. Reagents or instruments used without a noted manufacturer are conventional products available through commercial purchase.
As shown in fig. 1, an embodiment of the present invention provides a multi-target labeling method for capturing video by using multiple cameras, including:
step 1, from the video data of the same area shot by multiple cameras, select two segments shot by two cameras arranged front and back as the two segments of video data to be labeled, both segments containing multiple targets;
step 2, align the times of the front and rear cameras that shot the two segments, and correct the lens distortion of both segments using the intrinsic parameters of the front and rear cameras (a code sketch of this step follows this list);
step 3, in the frames of the two segments at the same moment, select four fixed points in the common area of each frame and connect them in order into a convex quadrilateral that serves as the labeling area;
step 4, in the labeling area of the frame of one segment processed in step 3, label each target to be labeled with its corresponding ID via a bounding box; output, through a ReID model, the positions of the ID-labeled targets in the labeling area of the same-moment frame of the other segment; and have annotators correct the target positions output by the ReID model according to ID-matching errors or bounding-box offsets;
step 5, repeat step 4 until all targets in the same-moment frames of the two segments are labeled;
step 6, advance both segments to the frames a preset duration later, run a target tracking model to track the targets, and have annotators correct the tracking model's output; then play the two video segments in reverse and run the target tracking model again to correct the labeling information between the two frames, thereby completing the multi-target labeling of one stretch of video;
continue advancing the two segments to frames after the preset duration and repeat step 6 until the multi-target labeling of the entire video data is complete.
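For concreteness, the following is a minimal sketch of how step 2 might be implemented with OpenCV in Python. The intrinsic matrix, distortion coefficients, frame rate, and start offset are illustrative placeholders rather than values from the invention; the patent does not prescribe a particular library or calibration procedure.

```python
import cv2
import numpy as np

# Illustrative intrinsics from a prior calibration of one camera
# (e.g. cv2.calibrateCamera with a checkerboard) -- placeholder values.
K_front = np.array([[1000.0, 0.0, 960.0],
                    [0.0, 1000.0, 540.0],
                    [0.0, 0.0, 1.0]])
dist_front = np.array([-0.28, 0.07, 0.001, 0.0005, 0.0])  # k1, k2, p1, p2, k3

def undistort_frame(frame, K, dist):
    """Correct the lens distortion of one frame using camera intrinsics."""
    h, w = frame.shape[:2]
    new_K, _ = cv2.getOptimalNewCameraMatrix(K, dist, (w, h), alpha=0)
    return cv2.undistort(frame, K, dist, None, new_K)

def aligned_frame_index(t_seconds, fps, start_offset_seconds):
    """Map an instant on the common timeline to a frame index of one camera,
    given that camera's recording start offset (time alignment)."""
    return int(round((t_seconds - start_offset_seconds) * fps))
```

The same two functions would be applied to both the front and rear camera segments before any labeling begins.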
In the above multi-target labeling method, the ReID model adopts a CLI model. The CLI model fuses information from multiple cameras, so target labeling information under multiple viewing angles can be obtained efficiently.
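The CLI model is not detailed further in this document. As a hedged illustration of the geometric prior that the four fixed points provide across views, the sketch below maps a labeled position from one view into the other with a plane homography computed from the shared quadrilateral; the coordinates are made up, and this geometric transfer is an assumption for illustration, not the CLI model's actual mechanism.

```python
import cv2
import numpy as np

# The four fixed points of the labeling quadrilateral, clicked in each view
# at the same instant (illustrative coordinates).
quad_front = np.float32([[420, 210], [1510, 225], [1630, 930], [300, 915]])
quad_rear = np.float32([[380, 250], [1480, 240], [1600, 900], [260, 880]])

# Homography relating the (approximately planar) common area between views.
H = cv2.getPerspectiveTransform(quad_front, quad_rear)

def transfer_point(pt_front, H):
    """Map a point (e.g. the bottom-center of a labeled bounding box) from
    the front view into the rear view as an initial guess for correction."""
    src = np.float32([[pt_front]])          # shape (1, 1, 2), as required
    dst = cv2.perspectiveTransform(src, H)
    return tuple(dst[0, 0])
```

Such a transfer would only seed the cross-view position; the ReID model's appearance matching and the annotator's correction remain responsible for the final ID assignment.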
In the multi-target labeling method, the target tracking model adopts a ByteTrack model.
In step 6 of the above multi-target labeling method, the preset duration by which the video is advanced is 3 seconds; the entire video is processed for multi-target labeling in 3-second stretches.
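A minimal sketch of this 3-second segment workflow follows, assuming a generic tracker object exposing update(frame) -> {track_id: box} and an annotator_fix callback; both the interface and the final merge heuristic are assumptions for illustration and differ from the reference ByteTrack implementation's actual API.

```python
SEGMENT_SECONDS = 3  # the preset duration named above

def label_segment(frames, fps, make_tracker, annotator_fix):
    """Label one 3-second stretch: track forward, let the annotator correct
    the final frame, then track backward so both endpoints constrain labels."""
    seg = frames[:int(SEGMENT_SECONDS * fps)]

    # Forward pass: propagate labels from the already-verified first frame.
    fwd_tracker = make_tracker()
    fwd = [fwd_tracker.update(f) for f in seg]

    # The annotator corrects the tracker's output on the segment's last frame.
    fwd[-1] = annotator_fix(fwd[-1])

    # Backward pass from the corrected last frame refines in-between frames.
    bwd_tracker = make_tracker()
    bwd = [bwd_tracker.update(f) for f in reversed(seg)]
    bwd.reverse()

    # Illustrative merge: trust the forward pass near the segment start and
    # the backward pass near its corrected end.
    half = len(seg) // 2
    return [f if i < half else b for i, (f, b) in enumerate(zip(fwd, bwd))]
```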
In summary, in the multi-target labeling method provided by the embodiment of the invention, the introduction of the ReID model and the target tracking model lets the frames of the two cameras assist each other's labeling; combined with annotators' fine-tuning, this significantly reduces the number of manual operations, saves labor, and improves labeling efficiency.
To clearly present the technical solution and its effects, the multi-target labeling method for videos shot by multiple cameras provided by the embodiment of the invention is described in detail below.
Examples
As shown in fig. 2 and 6, an embodiment of the present invention provides a multi-target labeling method for videos shot by multiple cameras, used for labeling area video data captured by multiple cameras, such as surveillance video of public areas including classrooms and offices. The method comprises the following steps (see fig. 1):
step 1, from the video data of the same area shot by multiple cameras (see fig. 3), select two segments shot by two cameras arranged front and back as the two segments of video data to be labeled, both containing multiple targets (see fig. 4);
step 2, load the two segments to be labeled into the labeling device, align the times of the front and rear cameras that shot them, and correct the lens distortion of both segments using the intrinsics of the front and rear cameras;
step 3, in the two same-moment frames of the two segments, select four fixed points in the common area and connect them in order into a convex quadrilateral that serves as the labeling area (see fig. 6 (a), (b), (c) and (d)); a membership test for this area is sketched after this list;
step 4, for each target to be labeled, label its ID information on the frame of one segment (i.e., via a bounding box); a ReID model outputs the position of that ID in the frame labeling area of the other segment at the same moment, and an annotator fine-tunes it, specifically by correcting the target position output by the ReID model according to ID-matching errors or bounding-box offsets (see fig. 5);
step 5, repeat step 4 until all targets in the two segments of video data are labeled;
step 6, advance the two segments to the frames a few seconds later (generally preset to 3 seconds), run a target tracking model to track the labeled targets, have an annotator correct the tracking model's output, and run the target tracking model in reverse to correct the labeling information between the two frames of the two segments, thereby completing the multi-target labeling of one stretch of video;
continue advancing the two segments to frames after the preset duration and repeat step 6 until the multi-target labeling of the entire video data is complete.
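As referenced in step 3 above, the sketch below tests whether a target falls inside the convex labeling quadrilateral, using OpenCV's point-in-polygon test on the bounding box's bottom-center point; the coordinates and the choice of anchor point are assumptions for illustration.

```python
import cv2
import numpy as np

# Four fixed points connected in order into a convex quadrilateral
# (illustrative coordinates for one view).
labeling_area = np.array([[420, 210], [1510, 225], [1630, 930], [300, 915]],
                         dtype=np.int32)

def in_labeling_area(box):
    """Return True if a bounding box (x, y, w, h) lies in the labeling area,
    judged by its bottom-center ("foot") point."""
    foot = (float(box[0] + box[2] / 2.0), float(box[1] + box[3]))
    # pointPolygonTest returns >0 inside, 0 on the edge, <0 outside.
    return cv2.pointPolygonTest(labeling_area, foot, measureDist=False) >= 0
```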
In summary, thanks to the participation of several models, the multi-target labeling method of the embodiment of the invention completes automatically, through the models, the multi-target labeling that used to be done manually, with annotators performing only correction tasks. This greatly reduces annotator workload, leaves ample fault tolerance for model errors, efficiently yields multiple accurately labeled video segments, and effectively lowers the labeling difficulty for annotators.
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims. The information disclosed in the background section herein is only for enhancement of understanding of the general background of the invention and is not to be taken as an admission or any form of suggestion that this information forms the prior art already known to those of ordinary skill in the art.

Claims (4)

1. A multi-target labeling method for video shot by multiple cameras, characterized by comprising the following steps:
step 1, from the video data of the same area shot by multiple cameras, selecting two segments shot by two cameras arranged front and back as the two segments of video data to be labeled, both segments containing multiple targets;
step 2, aligning the times of the front and rear cameras that shot the two segments, and correcting the lens distortion of both segments using the intrinsic parameters of the front and rear cameras;
step 3, in the frames of the two segments at the same moment, selecting four fixed points in the common area of each frame and connecting them in order into a convex quadrilateral that serves as the labeling area;
step 4, in the labeling area of the frame of one segment processed in step 3, labeling each target to be labeled with its corresponding ID via a bounding box, outputting, through a ReID model, the positions of the ID-labeled targets in the labeling area of the same-moment frame of the other segment, and having annotators correct the positions output by the ReID model according to ID-matching errors or bounding-box offsets;
step 5, repeating step 4 until all targets in the same-moment frames of the two segments are labeled;
step 6, advancing both segments to the frames a preset duration later, running a target tracking model to track the targets, having annotators correct the tracking model's output, then playing the two video segments in reverse and running the target tracking model again to correct the labeling information between the two frames, thereby completing the multi-target labeling of one stretch of video;
and continuing to advance the two segments to frames after the preset duration and repeating step 6 until the multi-target labeling of the entire video data is complete.
2. The multi-target labeling method for video shot by multiple cameras according to claim 1, wherein the ReID model adopts a CLI model.
3. The multi-target labeling method for video shot by multiple cameras according to claim 1 or 2, wherein the target tracking model adopts a ByteTrack model.
4. The multi-target labeling method for video shot by multiple cameras according to claim 1 or 2, wherein in step 6 the preset duration by which the video is advanced is 3 seconds.
CN202210152739.9A 2022-02-18 2022-02-18 Multi-target labeling method for shooting video by multiple cameras Active CN114550041B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210152739.9A CN114550041B (en) 2022-02-18 2022-02-18 Multi-target labeling method for shooting video by multiple cameras

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210152739.9A CN114550041B (en) 2022-02-18 2022-02-18 Multi-target labeling method for shooting video by multiple cameras

Publications (2)

Publication Number Publication Date
CN114550041A CN114550041A (en) 2022-05-27
CN114550041B (en) 2024-03-29

Family

ID=81675390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210152739.9A Active CN114550041B (en) 2022-02-18 2022-02-18 Multi-target labeling method for shooting video by multiple cameras

Country Status (1)

Country Link
CN (1) CN114550041B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109743497A * 2018-12-21 2019-05-10 AInnovation (Chongqing) Technology Co., Ltd. Dataset acquisition method, system and electronic device
CN110782484A * 2019-10-25 2020-02-11 Shanghai Pudong Lingang Smart City Development Center Unmanned aerial vehicle video personnel identification and tracking method
WO2021196294A1 * 2020-04-03 2021-10-07 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences Cross-video person location tracking method and system, and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on intelligent multi-view target localization and tracking methods in video surveillance; Du Lijuan, Lu Xiaoya; Science Technology and Engineering; 2017-06-08 (No. 16); pp. 270-274 *

Also Published As

Publication number Publication date
CN114550041A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
CN108760767B (en) Large-size liquid crystal display defect detection method based on machine vision
WO2021098081A1 (en) Trajectory feature alignment-based multispectral stereo camera self-calibration algorithm
CN107918927A (en) A kind of matching strategy fusion and the fast image splicing method of low error
CN103971352A (en) Rapid image splicing method based on wide-angle lenses
CN106780303A (en) A kind of image split-joint method based on local registration
CN105872345A (en) Full-frame electronic image stabilization method based on feature matching
CN109919007A (en) A method of generating infrared image markup information
CN107580186A (en) A kind of twin camera panoramic video joining method based on suture space and time optimization
CN105894443A (en) Method for splicing videos in real time based on SURF (Speeded UP Robust Features) algorithm
CN107462182A (en) A kind of cross section profile deformation detecting method based on machine vision and red line laser
CN109596054A (en) The size detection recognition methods of strip workpiece
CN114550041B (en) Multi-target labeling method for shooting video by multiple cameras
CN106550229A (en) A kind of parallel panorama camera array multi-view image bearing calibration
CN105701515A (en) Face super-resolution processing method and system based on double-layer manifold constraint
CN101272450B (en) Global motion estimation exterior point removing and kinematic parameter thinning method in Sprite code
CN111145220A (en) Tunnel target track tracking method based on visual information
CN108508022B (en) Multi-camera splicing imaging detection method
CN108489989B (en) Photovoltaic module double-sided appearance detector based on multi-camera splicing imaging detection
CN112308887B (en) Multi-source image sequence real-time registration method
CN106878628A (en) A kind of method that video-splicing is carried out by camera
CN106546196B (en) A kind of optical axis real-time calibration method and system
CN112465702A (en) Synchronous self-adaptive splicing display processing method for multi-channel ultrahigh-definition video
CN215587186U (en) Visual inspection equipment of product package
CN109887027A (en) A kind of method for positioning mobile robot based on image
CN112102301A (en) Stamping frame cooperative detection system and detection method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant