CN113034551A - Target tracking and labeling method and device, readable storage medium and computer equipment - Google Patents

Target tracking and labeling method and device, readable storage medium and computer equipment Download PDF

Info

Publication number
CN113034551A
Authority
CN
China
Prior art keywords
target
tracking
mask
outline
position attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110604174.9A
Other languages
Chinese (zh)
Inventor
毛凤辉
郭振民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanchang Virtual Reality Institute Co Ltd
Original Assignee
Nanchang Virtual Reality Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanchang Virtual Reality Institute Co Ltd filed Critical Nanchang Virtual Reality Institute Co Ltd
Priority to CN202110604174.9A priority Critical patent/CN113034551A/en
Publication of CN113034551A publication Critical patent/CN113034551A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/60 Analysis of geometric attributes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/60 Analysis of geometric attributes
    • G06T 7/66 Analysis of geometric attributes of image moments or centre of gravity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/107 Static hand or arm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person

Abstract

The invention discloses a target tracking and labeling method and device, a readable storage medium and computer equipment. The method comprises the following steps: acquiring a target object to be tracked and labeled from a video, and tracking the target object with the SiamMask target tracking algorithm to obtain a tracking mask; screening a target mask out of the tracking mask, and obtaining the contour position attribute of the target mask; and converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image. The invention addresses the poor tracking quality and low labeling efficiency of prior-art video target tracking and labeling.

Description

Target tracking and labeling method and device, readable storage medium and computer equipment
Technical Field
The invention relates to the technical field of computers, in particular to a target tracking and labeling method, a target tracking and labeling device, a readable storage medium and computer equipment.
Background
Target detection is one of the key technologies for realizing VR (Virtual Reality) human-computer interaction. In target detection, the labeling of target objects is very important: labeling marks the position of the target object in the original image and generates, for each picture, a corresponding file representing the position of the target's ground-truth bounding box.
In the prior art, labeling tools commonly used for target detection include labelImg, labelme, DarkLabel and the like. labelImg and labelme are convenient for labeling single-frame images but offer poor support for video: the video must first be split into individual frames and then labeled, which is time-consuming when the amount of data is large. DarkLabel can track and label a video target, but its tracking quality is poor; the tracking frame must be adjusted continually, so labeling efficiency is low.
Disclosure of Invention
Therefore, an object of the present invention is to provide a target tracking and labeling method that solves the problems of poor tracking quality and low labeling efficiency when tracking and labeling video targets in the prior art.
The invention provides a target tracking and labeling method, which comprises the following steps:
acquiring a target object to be tracked and labeled from a video, and tracking the target object with the SiamMask target tracking algorithm to obtain a tracking mask;
screening a target mask out of the tracking mask, and obtaining the contour position attribute of the target mask;
and converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
According to the target tracking and labeling method provided by the invention, the SiamMask algorithm is combined with video labeling. SiamMask tracks well and is strongly resistant to interference. The target object is tracked by the SiamMask target tracking algorithm; after the tracking mask is obtained, the target mask is screened out of it and its contour position attribute obtained; the contour position attribute is then converted into the format required by YOLO-series target detection, i.e., the target object in the video is labeled. Automatic target tracking and labeling is thereby completed, and labeling efficiency is improved.
In addition, according to the above-mentioned target tracking and labeling method of the present invention, the following additional technical features may also be provided:
further, the step of acquiring a target object to be tracked and labeled from the video, and tracking the target object by using a siammask target tracking algorithm to obtain a tracking mask specifically includes:
selecting a target object in a frame of a video containing the target object to be tracked and labeled;
tracking the target object by using a siammask target tracking algorithm;
and acquiring a tracking mask obtained when the target object is tracked by a siammask target tracking algorithm.
Further, the step of screening a target mask out of the tracking mask and obtaining the contour position attribute of the target mask specifically comprises:
finding the contour of each object in the tracking mask through the findContours() function of the opencv library to obtain a contour list;
acquiring the position attribute and the area of each contour in the contour list through the boundingRect() function of the python opencv library;
and acquiring the contour with the largest area, taking it as the contour of the target mask, and taking its position attribute as the contour position attribute of the target mask.
Further, the step of converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image specifically comprises:
calculating the ratio of the target object to the original image size according to the contour position attribute of the target mask;
and converting the position of the target object relative to the image size into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
Further, in the step of calculating the ratio of the target object to the original image size according to the contour position attribute of the target mask, the ratio is calculated with the following formulas:
p_kx = (x_k + width_k / 2) / W
p_ky = (y_k + height_k / 2) / H
p_kw = width_k / W
p_kh = height_k / H
where W and H denote the width and height of the original image; x_k and y_k denote the coordinates of the top-left vertex of the largest-area contour k; width_k and height_k denote the width and height of contour k; p_kx and p_ky denote the ratios of the center position of the minimum bounding rectangle of contour k to the original image width and height; and p_kw and p_kh denote the ratios of the width and height of the minimum bounding rectangle of contour k to the original image width and height.
Further, after the step of converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image, the method further comprises:
storing the YOLO-series target detection label file and its corresponding image in a preset folder;
dividing the labeled files in the preset folder into a training set, a validation set and a test set according to a preset division ratio;
and training a YOLO-series neural network model on the training set, validation set and test set, the trained model being used for target detection.
Another object of the invention is to provide a target tracking and labeling device that solves the problems of poor tracking quality and low labeling efficiency when tracking and labeling video targets in the prior art.
The invention provides a target tracking and labeling device, which comprises:
an acquisition and tracking module for acquiring a target object to be tracked and labeled from a video and tracking the target object with the SiamMask target tracking algorithm to obtain a tracking mask;
a screening and acquisition module for screening a target mask out of the tracking mask and obtaining the contour position attribute of the target mask;
and a conversion and generation module for converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
According to the target tracking and labeling device provided by the invention, the SiamMask algorithm is combined with video labeling. SiamMask tracks well and is strongly resistant to interference. The target object is tracked by the SiamMask target tracking algorithm; after the tracking mask is obtained, the target mask is screened out of it and its contour position attribute obtained; the contour position attribute is then converted into the format required by YOLO-series target detection, i.e., the target object in the video is labeled. Automatic target tracking and labeling is thereby completed, and labeling efficiency is improved.
In addition, the above target tracking and labeling device according to the present invention may further have the following additional technical features:
further, the acquisition tracking module is specifically configured to:
selecting a target object in a frame of a video containing the target object to be tracked and labeled;
tracking the target object by using a siammask target tracking algorithm;
and acquiring a tracking mask obtained when the target object is tracked by a siammask target tracking algorithm.
Further, the screening acquisition module is specifically configured to:
searching the outline of each object in the tracking mask through a findContours () function of an opencv library to obtain an outline list;
acquiring the position attribute and the area of each outline in the outline list through a bounngselect () function in python library opencv;
and acquiring the outline with the largest area, taking the outline with the largest area as the outline of the target mask, and taking the position attribute of the outline with the largest area as the outline position attribute of the target mask.
Further, the conversion generation module is specifically configured to:
calculating the proportion of the target object relative to the size of the original image according to the contour position attribute of the target mask;
and converting the position of the target object relative to the size of the image into a format required by the yolo series target detection to obtain a yolo series target detection label file and a corresponding image thereof.
Further, the conversion generation module is specifically configured to calculate a ratio of the target object to the original image size using the following formula:
p_kx = (x_k + width_k / 2) / W
p_ky = (y_k + height_k / 2) / H
p_kw = width_k / W
p_kh = height_k / H
where W and H denote the width and height of the original image; x_k and y_k denote the coordinates of the top-left vertex of the largest-area contour k; width_k and height_k denote the width and height of contour k; p_kx and p_ky denote the ratios of the center position of the minimum bounding rectangle of contour k to the original image width and height; and p_kw and p_kh denote the ratios of the width and height of the minimum bounding rectangle of contour k to the original image width and height.
Further, the device further comprises:
a storage module for storing the YOLO-series target detection label file and its corresponding image in a preset folder;
a division module for dividing the labeled files in the preset folder into a training set, a validation set and a test set according to a preset division ratio;
and a training module for training a YOLO-series neural network model on the training set, validation set and test set, the trained model being used for target detection.
The invention also proposes a readable storage medium on which a computer program is stored which, when executed by a processor, carries out the steps of the above method.
The invention also proposes a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the program.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of embodiments of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow diagram of a target tracking and labeling method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of human hand tracking;
FIG. 3 is a detailed flowchart of step S101 in FIG. 1;
FIG. 4 is a schematic diagram of a human hand tracking mask;
FIG. 5 is a detailed flowchart of step S102 in FIG. 1;
FIG. 6 is a detailed flowchart of step S103 in FIG. 1;
FIG. 7 is a flow diagram of a target tracking and labeling method according to another embodiment of the present invention;
FIG. 8 is a block diagram of a target tracking and labeling apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a target tracking and labeling method according to an embodiment of the present invention includes steps S101 to S103.
S101, acquiring a target object to be tracked and labeled from a video, and tracking the target object with the SiamMask target tracking algorithm to obtain a tracking mask.
Referring to fig. 2, a human hand is tracked as an example. In addition, referring to fig. 3, step S101 specifically includes:
S1011, box-selecting the target object in a frame of the video that contains the target object to be tracked and labeled.
S1012, tracking the target object with the SiamMask target tracking algorithm.
SiamMask is an existing, open-source and mature target tracking technique; the target object selected in step S1011 is tracked by the SiamMask target tracking algorithm.
S1013, acquiring the tracking mask produced while the SiamMask target tracking algorithm tracks the target object.
It should be noted that at this stage the tracking mask may still contain interference regions. For example, as shown in fig. 4, when the target object is a human hand, the tracking mask may contain, besides the hand region, some other interference regions (the oval regions in fig. 4).
In a specific implementation, if tracking fails or becomes inaccurate during SiamMask tracking (which is detected by manual supervision), the tracking target can be re-selected and the tracker re-initialized to resume tracking.
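As an illustrative sketch (not part of the patent text), steps S1011 to S1013 could be wired together as follows. SiamMaskTracker is a hypothetical wrapper around the open-source SiamMask implementation whose init/update interface is assumed here, and the video path is a placeholder:

```python
import cv2

# Hypothetical wrapper around the open-source SiamMask implementation;
# its init/update interface is assumed for illustration only.
from siammask_wrapper import SiamMaskTracker

cap = cv2.VideoCapture("input_video.mp4")  # placeholder video path
ok, frame = cap.read()

# S1011: box-select the target object in the first frame.
x, y, w, h = cv2.selectROI("select target", frame, fromCenter=False)

# S1012: initialize the tracker on the selected region.
tracker = SiamMaskTracker()
tracker.init(frame, (x, y, w, h))

masks = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # S1013: each update yields a binary tracking mask for the current frame.
    mask = tracker.update(frame)  # uint8 mask: 255 inside the tracked region
    masks.append((frame, mask))
cap.release()
```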
S102, screening a target mask out of the tracking mask, and obtaining the contour position attribute of the target mask.
Referring to fig. 5, step S102 specifically includes:
S1021, finding the contour of each object in the tracking mask through the findContours() function of the opencv library to obtain a contour list;
S1022, acquiring the position attribute and the area of each contour in the contour list through the boundingRect() function of the python opencv library;
for example, the location attributes and areas of the contours are as follows:
(x_i, y_i, width_i, height_i) = cv2.boundingRect(contours[i])
area_i = width_i × height_i
where x_i and y_i denote the position of the top-left corner of the i-th contour, width_i and height_i denote the width and height of the i-th contour, area_i denotes the area of the i-th contour, and i = 0, 1, 2, ...; cv2 denotes the opencv library, and boundingRect denotes a function in the opencv library.
S1023, acquiring the contour with the largest area, taking it as the contour of the target mask, and taking its position attribute as the contour position attribute of the target mask.
All contour areas (area_0, area_1, area_2, ...) are obtained in step S1022. Since the target mask is typically the larger region in the tracking mask, the contour with the largest area is sought. Suppose the largest area found is area_k; then the k-th contour is the contour with the largest area, its top-left vertex coordinates are (x_k, y_k), and its width and height are width_k and height_k respectively.
For example, when the target object is a human hand, the hand contour is the largest of all contours and the contour areas of the other interference regions are all smaller, so the target mask can be quickly screened out by comparing contour areas.
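A minimal sketch of steps S1021 to S1023 (illustrative, not from the patent; it assumes OpenCV 4.x, whose findContours() returns two values, and a binary uint8 mask):

```python
import cv2

def largest_contour_attributes(mask):
    """Return (x, y, width, height) of the largest-area contour in a binary
    mask, or None if the mask contains no contour."""
    # S1021: find the contour of each object in the tracking mask.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    # S1022: bounding rectangle (position attribute) of every contour.
    rects = [cv2.boundingRect(c) for c in contours]
    # S1023: the rectangle with the largest area belongs to the target mask;
    # the smaller interference regions are thereby screened out.
    return max(rects, key=lambda r: r[2] * r[3])
```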
S103, converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
Referring to fig. 6, step S103 specifically includes:
S1031, calculating the ratio of the target object to the original image size according to the contour position attribute of the target mask;
Specifically, the ratio of the target object to the original image size is calculated with the following formulas:
p_kx = (x_k + width_k / 2) / W
p_ky = (y_k + height_k / 2) / H
p_kw = width_k / W
p_kh = height_k / H
where W and H denote the width and height of the original image; x_k and y_k denote the coordinates of the top-left vertex of the largest-area contour k; width_k and height_k denote the width and height of contour k; p_kx and p_ky denote the ratios of the center position of the minimum bounding rectangle of contour k to the original image width and height; and p_kw and p_kh denote the ratios of the width and height of the minimum bounding rectangle of contour k to the original image width and height.
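A direct transcription of these formulas (an illustrative sketch; the function name is ours, and the variable names mirror the symbols above):

```python
def to_yolo_ratios(x_k, y_k, width_k, height_k, W, H):
    """Convert the largest contour's bounding box to YOLO center/size ratios."""
    p_kx = (x_k + width_k / 2) / W   # box center x as a fraction of image width
    p_ky = (y_k + height_k / 2) / H  # box center y as a fraction of image height
    p_kw = width_k / W               # box width as a fraction of image width
    p_kh = height_k / H              # box height as a fraction of image height
    return p_kx, p_ky, p_kw, p_kh
```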
S1032, converting the position of the target object relative to the image size into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
The YOLO series comprises versions YOLOv1 through YOLOv5, whose label files all share the same format. The content of a YOLO-series target detection label file is the target label together with the ratios of the target object to the original image size, one line per object, in the form:
label p_kx p_ky p_kw p_kh
For example, a line beginning with 2 denotes target label 2, and the decimals that follow are the ratios of the target object to the original image size (i.e., the values of p_kx, p_ky, p_kw, p_kh). That is, label, p_kx, p_ky, p_kw and p_kh are written into a text file (a .txt file), where label denotes the target's class tag. Target labels are denoted by numbers, and different targets use different labels.
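Writing such a label line could look like the following sketch (the six-decimal formatting and the example path/values are assumptions for illustration):

```python
def write_yolo_label(path, label, p_kx, p_ky, p_kw, p_kh):
    """Append one 'label p_kx p_ky p_kw p_kh' line to a YOLO-format label file."""
    with open(path, "a") as f:
        f.write(f"{label} {p_kx:.6f} {p_ky:.6f} {p_kw:.6f} {p_kh:.6f}\n")

# Example (illustrative values only): target label 2, as above.
# write_yolo_label("datasets/labels/1.txt", 2, 0.51, 0.46, 0.13, 0.28)
```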
In addition, referring to fig. 7, as a specific example, after step S103, the method further includes steps S201 to S203:
S201, storing the YOLO-series target detection label file and its corresponding image in a preset folder;
Specifically, the label file from step S103 is saved in the datasets/labels folder and the corresponding image in the datasets/images folder, the label file having the same name as the image file; for example, label file 1.txt corresponds to image 1.jpg.
S202, dividing the labeled files in the preset folder into a training set, a validation set and a test set according to a preset division ratio;
The labeled files under the datasets folder are divided into a training set, a validation set and a test set; the division ratio is chosen according to the actual conditions of the project.
S203, training a YOLO-series neural network model on the training set, validation set and test set, the trained model being used for target detection.
In summary, according to the target tracking and labeling method provided by this embodiment, the SiamMask algorithm is combined with video labeling. SiamMask tracks well and is strongly resistant to interference. The target object is tracked by the SiamMask target tracking algorithm; after the tracking mask is obtained, the target mask is screened out of it and its contour position attribute obtained; the contour position attribute is then converted into the format required by YOLO-series target detection, i.e., the target object in the video is labeled. Automatic target tracking and labeling is thereby completed, and labeling efficiency is improved.
Referring to fig. 8, a target tracking and labeling device according to an embodiment of the present invention comprises:
an acquisition and tracking module for acquiring a target object to be tracked and labeled from a video and tracking the target object with the SiamMask target tracking algorithm to obtain a tracking mask;
a screening and acquisition module for screening a target mask out of the tracking mask and obtaining the contour position attribute of the target mask;
and a conversion and generation module for converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
In this embodiment, the acquisition and tracking module is specifically configured to:
box-select the target object in a frame of the video that contains the target object to be tracked and labeled;
track the target object with the SiamMask target tracking algorithm;
and acquire the tracking mask produced while the SiamMask target tracking algorithm tracks the target object.
In this embodiment, the screening and acquisition module is specifically configured to:
find the contour of each object in the tracking mask through the findContours() function of the opencv library to obtain a contour list;
acquire the position attribute and the area of each contour in the contour list through the boundingRect() function of the python opencv library;
and acquire the contour with the largest area, take it as the contour of the target mask, and take its position attribute as the contour position attribute of the target mask.
In this embodiment, the conversion and generation module is specifically configured to:
calculate the ratio of the target object to the original image size according to the contour position attribute of the target mask;
and convert the position of the target object relative to the image size into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
In this embodiment, the conversion and generation module is specifically configured to calculate the ratio of the target object to the original image size with the following formulas:
p_kx = (x_k + width_k / 2) / W
p_ky = (y_k + height_k / 2) / H
p_kw = width_k / W
p_kh = height_k / H
where W and H denote the width and height of the original image; x_k and y_k denote the coordinates of the top-left vertex of the largest-area contour k; width_k and height_k denote the width and height of contour k; p_kx and p_ky denote the ratios of the center position of the minimum bounding rectangle of contour k to the original image width and height; and p_kw and p_kh denote the ratios of the width and height of the minimum bounding rectangle of contour k to the original image width and height.
In this embodiment, the device further comprises:
a storage module for storing the YOLO-series target detection label file and its corresponding image in a preset folder;
a division module for dividing the labeled files in the preset folder into a training set, a validation set and a test set according to a preset division ratio;
and a training module for training a YOLO-series neural network model on the training set, validation set and test set, the trained model being used for target detection.
According to the target tracking and labeling device provided by this embodiment, the SiamMask algorithm is combined with video labeling. SiamMask tracks well and is strongly resistant to interference. The target object is tracked by the SiamMask target tracking algorithm; after the tracking mask is obtained, the target mask is screened out of it and its contour position attribute obtained; the contour position attribute is then converted into the format required by YOLO-series target detection, i.e., the target object in the video is labeled. Automatic target tracking and labeling is thereby completed, and labeling efficiency is improved.
Furthermore, an embodiment of the present invention also proposes a readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the above-mentioned method.
Furthermore, an embodiment of the present invention also provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of the above method when executing the program.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A target tracking and labeling method, the method comprising:
acquiring a target object to be tracked and labeled from a video, and tracking the target object with the SiamMask target tracking algorithm to obtain a tracking mask;
screening a target mask out of the tracking mask, and obtaining the contour position attribute of the target mask;
and converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
2. The target tracking and labeling method according to claim 1, wherein the step of acquiring a target object to be tracked and labeled from a video, and tracking the target object with the SiamMask target tracking algorithm to obtain a tracking mask specifically comprises:
box-selecting the target object in a frame of the video that contains the target object to be tracked and labeled;
tracking the target object with the SiamMask target tracking algorithm;
and acquiring the tracking mask produced while the SiamMask target tracking algorithm tracks the target object.
3. The target tracking and labeling method according to claim 1, wherein the step of screening a target mask out of the tracking mask and obtaining the contour position attribute of the target mask specifically comprises:
finding the contour of each object in the tracking mask through the findContours() function of the opencv library to obtain a contour list;
acquiring the position attribute and the area of each contour in the contour list through the boundingRect() function of the python opencv library;
and acquiring the contour with the largest area, taking it as the contour of the target mask, and taking its position attribute as the contour position attribute of the target mask.
4. The target tracking and labeling method according to claim 3, wherein the step of converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image specifically comprises:
calculating the ratio of the target object to the original image size according to the contour position attribute of the target mask;
and converting the position of the target object relative to the image size into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
5. The target tracking and labeling method according to claim 4, wherein in the step of calculating the ratio of the target object to the original image size according to the contour position attribute of the target mask, the ratio is calculated with the following formulas:
p_kx = (x_k + width_k / 2) / W
p_ky = (y_k + height_k / 2) / H
p_kw = width_k / W
p_kh = height_k / H
where W and H denote the width and height of the original image; x_k and y_k denote the coordinates of the top-left vertex of the largest-area contour k; width_k and height_k denote the width and height of contour k; p_kx and p_ky denote the ratios of the center position of the minimum bounding rectangle of contour k to the original image width and height; and p_kw and p_kh denote the ratios of the width and height of the minimum bounding rectangle of contour k to the original image width and height.
6. The target tracking and labeling method according to any one of claims 1 to 5, wherein after the step of converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image, the method further comprises:
storing the YOLO-series target detection label file and its corresponding image in a preset folder;
dividing the labeled files in the preset folder into a training set, a validation set and a test set according to a preset division ratio;
and training a YOLO-series neural network model on the training set, validation set and test set, the trained model being used for target detection.
7. A target tracking and labeling device, characterized in that the device comprises:
an acquisition and tracking module for acquiring a target object to be tracked and labeled from a video and tracking the target object with the SiamMask target tracking algorithm to obtain a tracking mask;
a screening and acquisition module for screening a target mask out of the tracking mask and obtaining the contour position attribute of the target mask;
and a conversion and generation module for converting the contour position attribute of the target mask into the format required by YOLO-series target detection to obtain a YOLO-series target detection label file and its corresponding image.
8. The target tracking and labeling device according to claim 7, wherein the screening and acquisition module is specifically configured to:
find the contour of each object in the tracking mask through the findContours() function of the opencv library to obtain a contour list;
acquire the position attribute and the area of each contour in the contour list through the boundingRect() function of the python opencv library;
and acquire the contour with the largest area, take it as the contour of the target mask, and take its position attribute as the contour position attribute of the target mask.
9. A readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the method according to any one of claims 1-6.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 6 when executing the program.
CN202110604174.9A 2021-05-31 2021-05-31 Target tracking and labeling method and device, readable storage medium and computer equipment Pending CN113034551A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110604174.9A CN113034551A (en) 2021-05-31 2021-05-31 Target tracking and labeling method and device, readable storage medium and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110604174.9A CN113034551A (en) 2021-05-31 2021-05-31 Target tracking and labeling method and device, readable storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN113034551A true CN113034551A (en) 2021-06-25

Family

ID=76455927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110604174.9A Pending CN113034551A (en) 2021-05-31 2021-05-31 Target tracking and labeling method and device, readable storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN113034551A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103077521A (en) * 2013-01-08 2013-05-01 天津大学 Area-of-interest extracting method used for video monitoring
CN103324955A (en) * 2013-06-14 2013-09-25 浙江智尔信息技术有限公司 Pedestrian detection method based on video processing
CN109377511A (en) * 2018-08-30 2019-02-22 西安电子科技大学 Motion target tracking method based on sample combination and depth detection network
CN109934848A (en) * 2019-03-07 2019-06-25 贵州大学 A method of the moving object precise positioning based on deep learning
US20190347806A1 (en) * 2018-05-09 2019-11-14 Figure Eight Technologies, Inc. Video object tracking
CN110929560A (en) * 2019-10-11 2020-03-27 杭州电子科技大学 Video semi-automatic target labeling method integrating target detection and tracking

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103077521A (en) * 2013-01-08 2013-05-01 天津大学 Area-of-interest extracting method used for video monitoring
CN103324955A (en) * 2013-06-14 2013-09-25 浙江智尔信息技术有限公司 Pedestrian detection method based on video processing
US20190347806A1 (en) * 2018-05-09 2019-11-14 Figure Eight Technologies, Inc. Video object tracking
US20200151884A1 (en) * 2018-05-09 2020-05-14 Figure Eight Technologies, Inc. Video object tracking
CN109377511A (en) * 2018-08-30 2019-02-22 西安电子科技大学 Motion target tracking method based on sample combination and depth detection network
CN109934848A (en) * 2019-03-07 2019-06-25 贵州大学 A method of the moving object precise positioning based on deep learning
CN110929560A (en) * 2019-10-11 2020-03-27 杭州电子科技大学 Video semi-automatic target labeling method integrating target detection and tracking

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
QIANG WANG ET AL.: "Fast Online Object Tracking and Segmentation: A Unifying Approach", arXiv:1812.05050v2 *
ZHANG FENGRUI: "Research and Implementation of Semi-automatic Video Annotation Based on Multi-object Tracking", China Master's Theses Full-text Database, Information Science and Technology *
PIAO SONGHAO ET AL.: "Intelligent Robots", Harbin Institute of Technology Press, 31 December 2012 *

Similar Documents

Publication Publication Date Title
CN108537269B (en) Weak interactive object detection deep learning method and system thereof
CN111160469B (en) Active learning method of target detection system
US11238312B2 (en) Automatically generating labeled synthetic documents
CN110796143A (en) Scene text recognition method based on man-machine cooperation
CN113378710B (en) Layout analysis method and device for image file, computer equipment and storage medium
CN108665742A (en) A kind of method and apparatus read by arrangement for reading
US8804139B1 (en) Method and system for repurposing a presentation document to save paper and ink
JP6612486B1 (en) Learning device, classification device, learning method, classification method, learning program, and classification program
CN111857893A (en) Method and device for generating label graph
CN111597628B (en) Model marking method and device, storage medium and electronic equipment
CN115601672B (en) VR intelligent shop patrol method and device based on deep learning
CN113283355A (en) Form image recognition method and device, computer equipment and storage medium
CN109740674A (en) A kind of image processing method, device, equipment and storage medium
CN113591884B (en) Method, device, equipment and storage medium for determining character recognition model
CN113822144A (en) Target detection method and device, computer equipment and storage medium
CN113034551A (en) Target tracking and labeling method and device, readable storage medium and computer equipment
CN110633251B (en) File conversion method and equipment
CN109919156B (en) Training method, medium and device of image cropping prediction model and computing equipment
US8488183B2 (en) Moving labels in graphical output to avoid overprinting
CN115147474A (en) Point cloud annotation model generation method and device, electronic equipment and storage medium
CN111160265B (en) File conversion method and device, storage medium and electronic equipment
CN114090630A (en) Commodity data integration method based on distributed micro-service cluster
CN114004918A (en) Poster generation method, device and medium
CN112232431A (en) Watermark detection model training method, watermark detection method, system, device and medium
CN117079084B (en) Sample image generation method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210625

RJ01 Rejection of invention patent application after publication