CN111415373A - Target tracking and segmenting method, system and medium based on twin convolutional network

Target tracking and segmenting method, system and medium based on twin convolutional network

Info

Publication number
CN111415373A
Authority
CN
China
Prior art keywords
target
tracking
feature map
map
row
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010202511.7A
Other languages
Chinese (zh)
Inventor
盛校粼
李凡平
石柱国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yisa Technology Co ltd
Qingdao Yisa Data Technology Co Ltd
Original Assignee
Beijing Yisa Technology Co ltd
Qingdao Yisa Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yisa Technology Co ltd, Qingdao Yisa Data Technology Co Ltd filed Critical Beijing Yisa Technology Co ltd
Priority to CN202010202511.7A priority Critical patent/CN111415373A/en
Publication of CN111415373A publication Critical patent/CN111415373A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target tracking and segmenting method based on a twin convolutional network, which comprises the following steps: extracting image features with a densely connected convolutional neural network to obtain a target feature map and a tracking-area feature map; performing a cross-correlation operation on the target feature map and the tracking-area feature map to obtain an output feature map; convolving the output feature map and sending it to a semantic segmentation branch and a score map branch to obtain a first feature map and a score map, where each pixel of the first feature map together with its corresponding channels is defined as a RoW and each pixel of the score map is the confidence of the corresponding RoW in the first feature map; selecting on the first feature map the RoW corresponding to the pixel with the highest confidence in the score map, converting that RoW into a first matrix, performing binary classification on the first matrix to obtain a mask matrix, processing the mask matrix to obtain a target mask, and obtaining the bounding box of the tracked target from the target mask. The method improves target tracking precision and realizes pixel-level tracking of the target.

Description

Target tracking and segmenting method, system and medium based on twin convolutional network
Technical Field
The invention relates to the technical field of computer vision, in particular to a target tracking and segmenting method, a target tracking and segmenting system, a target tracking and segmenting terminal and a target tracking and segmenting medium based on a twin convolutional network.
Background
In recent years, with the rise of artificial intelligence and deep learning, convolutional neural network algorithms have gradually entered the target tracking field and achieved remarkable performance. In particular, algorithm frameworks based on the twin (Siamese) convolutional network have received great attention at international computer vision conferences and in tracking challenges in recent years by virtue of their good performance and simple network structure.
To simplify the expression of the tracking result, early tracking algorithms returned the target as an axis-aligned rectangular box. However, as tracking accuracy improved and the datasets became more difficult, VOT2015 introduced a rotated rectangular box as the annotation, and VOT2016 proposed an automatic method for generating a rotated box from a mask; even so, these representations cannot meet the requirements of diversified target tracking tasks.
Disclosure of Invention
Aiming at the defects in the prior art, the embodiment of the invention provides a target tracking and segmenting method, a target tracking and segmenting system, a target tracking and segmenting terminal and a target tracking and segmenting medium based on a twin convolutional network.
In a first aspect, a target tracking and segmenting method based on a twin convolutional network provided in an embodiment of the present invention includes:
acquiring input image information;
extracting input image features by adopting a densely connected convolutional neural network to obtain a target feature map and a tracking area feature map;
performing a cross-correlation operation on the target feature map and the tracking-area feature map to obtain an output feature map;
after the output feature map is convolved, sending it to a semantic segmentation branch and a score map branch respectively to obtain a first feature map and a score map, where each pixel of the first feature map together with its corresponding channels is defined as a RoW, and each pixel of the score map is the confidence of the corresponding RoW in the first feature map;
selecting on the first feature map the RoW corresponding to the pixel with the highest confidence in the score map, converting that RoW into a first matrix, performing binary classification on the first matrix to obtain a mask matrix, mapping the mask matrix back to the original image through an affine transformation, binarizing the values between 0 and 1 in the mask matrix with a set segmentation threshold to obtain the target mask of the tracked target in the original image, and obtaining the bounding box of the tracked target as the minimum circumscribed rectangle of the target mask.
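For illustration, the steps of the first aspect can be read as one forward pass through a weight-shared (twin) backbone followed by two convolutional heads. The following PyTorch sketch shows that control flow only under stated assumptions: backbone, mask_head and score_head are hypothetical placeholder modules rather than the patented implementation, the depth-wise form of the cross-correlation is inferred from the channel-preserving dimensions given in the embodiment described later, and the 63 × 63 mask size is taken from the same embodiment.

import torch
import torch.nn.functional as F

def track_step(backbone, mask_head, score_head, target_img, search_img):
    """One tracking-and-segmentation step (a sketch of the steps above).

    target_img: 1 x 3 x 127 x 127 template crop of the tracked target.
    search_img: 1 x 3 x 255 x 255 crop of the tracking area.
    """
    z = backbone(target_img)   # target feature map, e.g. 1 x 256 x 15 x 15
    x = backbone(search_img)   # tracking-area feature map, e.g. 1 x 256 x 31 x 31

    # Cross-correlate the two feature maps (stride 1, no padding): each channel
    # of z slides over the matching channel of x, giving 1 x 256 x 17 x 17.
    c = z.size(1)
    out = F.conv2d(x, z.view(c, 1, z.size(2), z.size(3)), groups=c)

    fmask = score = None
    fmask = mask_head(out)     # first feature map: one RoW per spatial position
    score = score_head(out)    # score map: one confidence per RoW

    # Select the RoW with the highest confidence in the score map.
    idx = score.flatten().argmax()
    row = fmask.flatten(2)[0, :, idx]      # channel vector of the selected RoW
    first_matrix = row.view(63, 63)        # the "first matrix"
    return torch.sigmoid(first_matrix)     # binary-classified mask matrix

Sharing the backbone weights between the two inputs is what makes the network a twin network: the target and the tracking area are embedded in the same feature space, so their correlation response is meaningful.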
In a second aspect, an embodiment of the present invention provides a target tracking and segmenting system based on a twin convolutional network, including: an image acquisition module, an image feature extraction module, a cross-correlation module, a first analysis module and a second analysis module,
the image acquisition module is used for acquiring input image information;
the image feature extraction module adopts a densely connected convolutional neural network to extract the features of an input image to obtain a target feature map and a tracking area feature map;
the cross-correlation module performs a cross-correlation operation on the target feature map and the tracking-area feature map to obtain an output feature map;
the first analysis module convolves the output feature map and sends it to the semantic segmentation branch and the score map branch respectively to obtain a first feature map and a score map, where each pixel of the first feature map together with its corresponding channels is defined as a RoW and each pixel of the score map is the confidence of the corresponding RoW in the first feature map;
the second analysis module selects on the first feature map the RoW corresponding to the pixel with the highest confidence in the score map, converts that RoW into a first matrix, performs binary classification on the first matrix to obtain a mask matrix whose elements are values between 0 and 1 after the classification, maps the mask matrix back to the original image through an affine transformation, binarizes the values between 0 and 1 in the mask matrix with a set segmentation threshold to obtain the target mask of the tracked target in the original image, and obtains the bounding box of the tracked target as the minimum circumscribed rectangle of the target mask.
In a third aspect, an intelligent terminal provided in an embodiment of the present invention includes a processor, an input device, an output device and a memory that are connected to one another; the memory stores a computer program comprising program instructions, and the processor is configured to call the program instructions to execute the method steps described in the foregoing embodiments.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, in which a computer program is stored, the computer program comprising program instructions, which, when executed by a processor, cause the processor to perform the method steps described in the above embodiments.
The invention has the beneficial effects that:
the target tracking and segmenting method, the target tracking and segmenting system, the target tracking and segmenting terminal and the target tracking and segmenting medium based on the twin convolutional network provided by the embodiment of the invention adopt the convolutional neural network with dense connection to extract image characteristics, and add the semantic segmentation branch and the score map branch, thereby improving the target tracking precision and realizing the pixel-level tracking of the target.
Drawings
In order to more clearly illustrate the detailed description of the invention or the technical solutions in the prior art, the drawings that are needed in the detailed description of the invention or the prior art will be briefly described below. Throughout the drawings, like elements or portions are generally identified by like reference numerals. In the drawings, elements or portions are not necessarily drawn to scale.
FIG. 1 is a flow chart illustrating a target tracking and segmenting method based on a twin convolutional network according to a first embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a target tracking and segmenting system based on a twin convolutional network according to another embodiment of the present invention;
fig. 3 shows a schematic structural diagram of an intelligent terminal according to another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which the invention pertains.
Fig. 1 shows a flowchart of a target tracking and segmenting method based on a twin convolutional network according to a first embodiment of the present invention; the method includes the following steps:
s1: input image information is acquired.
S2: and extracting the features of the input image by adopting a densely connected convolutional neural network to obtain a target feature map and a tracking area feature map.
S3: performing cross-correlation operation on the target characteristic diagram and the tracking area characteristic diagram to obtain an output characteristic diagram;
s4: and after convolution, the output feature map is respectively sent to a semantic segmentation branch and a score map branch to obtain a first feature map and a score map, each pixel in the first feature map and a corresponding channel thereof are set as ROW, and each pixel in the score map is a confidence corresponding to each ROW in the first feature map.
S5: selecting an ROW corresponding to a pixel point with the highest confidence degree in a score map on a first feature map, converting the corresponding ROW into a first matrix, classifying the first matrix into two classes to obtain a mask matrix, mapping the mask matrix to an original image through affine transformation, binarizing the numerical value between 0 and 1 in the mask matrix by using a set segmentation threshold value to obtain a target mask of a tracking target in the original image, and acquiring a boundary frame of the tracking target by using the minimum circumscribed rectangle of the target mask.
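To make step S5 concrete, the following sketch with OpenCV and NumPy shows one plausible post-processing path under stated assumptions: affine_mat, the 2 × 3 matrix mapping mask coordinates back to original-image coordinates, is assumed to be given (in practice it inverts the crop-and-resize that produced the tracking-area input); the default threshold is the 0.35 selected in the embodiment below; and cv2.minAreaRect is used for the minimum circumscribed rectangle, which yields the rotated-box convention discussed in the background.

import cv2
import numpy as np

def mask_to_bbox(mask_prob, affine_mat, frame_hw, thresh=0.35):
    """Map a 63 x 63 mask-probability matrix back to the frame and box the target.

    mask_prob:  63 x 63 array of sigmoid outputs in [0, 1] (the mask matrix).
    affine_mat: assumed 2 x 3 affine matrix from mask to frame coordinates.
    frame_hw:   (height, width) of the original image.
    """
    h, w = frame_hw
    # Affine-map the mask matrix back onto the original image.
    warped = cv2.warpAffine(mask_prob.astype(np.float32), affine_mat, (w, h))
    # Binarize with the set segmentation threshold to get the target mask.
    target_mask = (warped > thresh).astype(np.uint8)
    ys, xs = np.nonzero(target_mask)
    if xs.size == 0:
        return target_mask, None                 # target lost in this frame
    # Minimum circumscribed rectangle of the target mask as the bounding box.
    pts = np.stack([xs, ys], axis=1).astype(np.float32)
    rect = cv2.minAreaRect(pts)                  # ((cx, cy), (w, h), angle)
    return target_mask, cv2.boxPoints(rect)      # 4 corners of the rotated box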
The above technical solution is described in detail below using a specific example.
Two input images are acquired, one of dimension 127 × 127 × 3 and the other of dimension 255 × 255 × 3, and each is fed into the densely connected convolutional neural network for feature extraction. The feature extraction network is divided into two paths that extract the target features and the tracking-area features respectively; the fully convolutional network takes the target image (of scale 127 × 127) and the tracking-area image (of scale 255 × 255) as its inputs. The feature extraction process is represented by the following expression:
x_l = H_l([x_0, x_1, ..., x_{l-1}])
where H_l denotes the feature extraction operation of the l-th layer, [x_0, x_1, ..., x_{l-1}] denotes the channel-wise concatenation of the feature maps of all preceding layers, and x_l is the output of the layer. A target feature map of dimension 15 × 15 × 256 and a tracking-area feature map of dimension 31 × 31 × 256 are obtained from the feature extraction network. A correlation operation is performed between the 15 × 15 × 256 target feature map and the 31 × 31 × 256 tracking-area feature map, with padding 0 and stride 1, giving an output feature map of dimension 17 × 17 × 256. This output feature map is sent to a semantic segmentation branch and a score map branch respectively, each consisting of a 1 × 1 convolution; the 1 × 1 convolutions produce a first feature map (fmask) of dimension 17 × 17 × (63 × 63) and a score map of dimension 17 × 17 × 1. Each pixel of fmask together with its corresponding channels is called a RoW, that is, the response of a candidate window, so fmask contains 17 × 17 RoWs in total, each of dimension 1 × 1 × (63 × 63). Each pixel of the score map is the confidence of the corresponding RoW in fmask, and the RoW corresponding to the pixel with the highest confidence in the score map is selected on fmask as the RoW used to generate the final mask. The selected RoW of dimension 1 × 1 × (63 × 63) is resized into a 63 × 63 first matrix, and sigmoid binary classification is performed on the first matrix to decide whether each pixel of the matrix generated from the RoW belongs to the mask. After this classification a mask matrix is obtained whose elements are sigmoid values between 0 and 1; the mask matrix is mapped back to the original image through an affine transformation, the values between 0 and 1 in the mask matrix are binarized with a set segmentation threshold (0.35 is selected as the mask segmentation threshold in this embodiment), and finally the mask of the tracked target in the original image is obtained, from which the bounding box of the tracked target is derived as the minimum circumscribed rectangle of the target mask.
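As an illustration of the dense connection rule x_l = H_l([x_0, x_1, ..., x_{l-1}]) used above for feature extraction, the following is a minimal sketch of one densely connected block; the layer count, channel widths, and the composition of H_l as BatchNorm, ReLU and a 3 × 3 convolution are assumptions in the style of DenseNet, not the patented architecture:

import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Every layer sees the channel-wise concatenation of all earlier outputs."""

    def __init__(self, in_ch=64, growth=32, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(num_layers):
            # H_l: BatchNorm -> ReLU -> 3x3 convolution (an assumed composition).
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch),
                nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth, kernel_size=3, padding=1, bias=False),
            ))
            ch += growth

    def forward(self, x):
        feats = [x]                                   # [x_0]
        for layer in self.layers:
            # x_l = H_l([x_0, x_1, ..., x_{l-1}]): concatenate, then transform.
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)                # dense feature reuse

Because every layer receives all earlier feature maps directly, low-level detail and gradient signal reach deep layers without attenuation, which is the feature-reuse property the embodiment relies on for stronger feature extraction.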
The method was evaluated on the DAVIS2016 dataset, and its performance was compared on various metrics with other state-of-the-art tracking algorithms (including traditional tracking algorithms based on twin networks); the results are shown in Table 1:
[Table 1: performance comparison on the DAVIS2016 dataset; the table is reproduced as an image in the original publication and its values are not recoverable as text.]
The method was likewise evaluated on the DAVIS2017 dataset against other state-of-the-art tracking algorithms (including traditional tracking algorithms based on twin networks); the results are shown in Table 2:
[Table 2: performance comparison on the DAVIS2017 dataset; the table is reproduced as an image in the original publication and its values are not recoverable as text.]
The data in Tables 1 and 2 show that the target tracking and segmenting method based on the twin convolutional network provided by the embodiment of the invention performs significantly better than the prior-art methods.
In the target tracking and segmenting method based on the twin convolutional network, image features are extracted with a densely connected convolutional neural network, which improves the feature extraction capability of the network; adding the semantic segmentation branch and the score map branch to this network improves target tracking precision and realizes pixel-level tracking of the target.
In the first embodiment, a twin convolutional network based target tracking and segmentation method is provided, and correspondingly, the present application also provides a twin convolutional network based target tracking and segmentation system. Please refer to fig. 2, which is a schematic diagram of a target tracking and segmenting system based on a twin convolutional network according to a second embodiment of the present invention. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points.
As shown in fig. 2, a target tracking and segmenting system based on a twin convolutional network according to another embodiment of the present invention includes an image acquisition module, an image feature extraction module, a cross-correlation module, a first analysis module and a second analysis module. The image acquisition module acquires input image information. The image feature extraction module extracts features of the input images with a densely connected convolutional neural network to obtain a target feature map and a tracking-area feature map. The cross-correlation module performs a cross-correlation operation on the target feature map and the tracking-area feature map to obtain an output feature map. The first analysis module convolves the output feature map and sends it to the semantic segmentation branch and the score map branch respectively to obtain a first feature map and a score map, where each pixel of the first feature map together with its corresponding channels is defined as a RoW and each pixel of the score map is the confidence of the corresponding RoW in the first feature map. The second analysis module selects on the first feature map the RoW corresponding to the pixel with the highest confidence in the score map, converts that RoW into a first matrix, performs binary classification on the first matrix to obtain a mask matrix whose elements are values between 0 and 1 after the classification, maps the mask matrix back to the original image through an affine transformation, binarizes the values between 0 and 1 in the mask matrix with a set segmentation threshold to obtain the target mask of the tracked target in the original image, and obtains the bounding box of the tracked target as the minimum circumscribed rectangle of the target mask.
The above technical solution is described in detail below using a specific example.
The image acquisition module acquires two input images, one of dimension 127 × 127 × 3 and the other of dimension 255 × 255 × 3, and feeds them to the image feature extraction module. The image feature extraction module extracts features with a densely connected convolutional neural network whose two paths extract the target features and the tracking-area features respectively; the fully convolutional network takes the target image (of scale 127 × 127) and the tracking-area image (of scale 255 × 255) as its inputs. The feature extraction process of the image feature extraction module is represented by the following expression:
x_l = H_l([x_0, x_1, ..., x_{l-1}])
where H_l denotes the feature extraction operation of the l-th layer, [x_0, x_1, ..., x_{l-1}] denotes the channel-wise concatenation of the feature maps of all preceding layers, and x_l is the output of the layer; a target feature map of dimension 15 × 15 × 256 and a tracking-area feature map of dimension 31 × 31 × 256 are obtained. The cross-correlation module performs a correlation operation on the 15 × 15 × 256 target feature map and the 31 × 31 × 256 tracking-area feature map, with padding 0 and stride 1, to obtain an output feature map of dimension 17 × 17 × 256, which it sends to the first analysis module. The first analysis module sends the output feature map to the semantic segmentation branch and the score map branch respectively, each consisting of a 1 × 1 convolution, obtaining a first feature map (fmask) of dimension 17 × 17 × (63 × 63) and a score map of dimension 17 × 17 × 1. Each pixel of fmask together with its corresponding channels is called a RoW, that is, the response of a candidate window, so fmask contains 17 × 17 RoWs in total, each of dimension 1 × 1 × (63 × 63), and each pixel of the score map is the confidence of the corresponding RoW in fmask. The second analysis module selects the RoW corresponding to the pixel with the highest confidence in the score map as the RoW used to generate the final mask, resizes the selected 1 × 1 × (63 × 63) RoW into a 63 × 63 first matrix, and performs sigmoid binary classification on the first matrix to decide whether each pixel of the matrix generated from the RoW belongs to the mask. After this classification a mask matrix is obtained whose elements are sigmoid values between 0 and 1; the mask matrix is mapped back to the original image through an affine transformation, the values between 0 and 1 in the mask matrix are binarized with a set segmentation threshold (0.35 is selected in this embodiment), and finally the mask of the tracked target in the original image is obtained, from which the bounding box of the tracked target is derived as the minimum circumscribed rectangle of the target mask.
In the target tracking and segmenting system based on the twin convolutional network, image features are extracted with a densely connected convolutional neural network, which improves the feature extraction capability of the network; adding the semantic segmentation branch and the score map branch to this network improves target tracking precision and realizes pixel-level tracking of the target.
As shown in fig. 3, an intelligent terminal according to a third embodiment of the present invention includes a processor, an input device, an output device and a memory that are connected to one another; the memory stores a computer program comprising program instructions, and the processor is configured to call the program instructions to execute the method described in the first embodiment.
It should be understood that in the embodiments of the present invention, the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The input devices may include a touch pad, a fingerprint sensor (for collecting fingerprint information of a user and direction information of the fingerprint), a microphone, etc., and the output devices may include a display (LCD, etc.), a speaker, etc.
The memory may include both read-only memory and random access memory, and provides instructions and data to the processor. A portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
In a specific implementation, the processor, the input device, and the output device described in the embodiments of the present invention may execute the implementation described in the method embodiments provided in the embodiments of the present invention, and may also execute the implementation described in the system embodiments in the embodiments of the present invention, which is not described herein again.
The invention also provides an embodiment of a computer-readable storage medium, in which a computer program is stored, which computer program comprises program instructions that, when executed by a processor, cause the processor to carry out the method described in the above embodiment.
The computer readable storage medium may be an internal storage unit of the terminal described in the foregoing embodiment, for example, a hard disk or a memory of the terminal. The computer readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the terminal. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the terminal. The computer-readable storage medium is used for storing the computer program and other programs and data required by the terminal. The computer readable storage medium may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two, and that the components and steps of the examples have been described above generally in terms of their functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the terminal and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal and method can be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present invention, and they should all be construed as being included within the scope of the claims and description.

Claims (10)

1. A target tracking and segmenting method based on a twin convolutional network is characterized by comprising the following steps:
acquiring input image information;
extracting input image features by adopting a densely connected convolutional neural network to obtain a target feature map and a tracking area feature map;
performing a cross-correlation operation on the target feature map and the tracking area feature map to obtain an output feature map;
after the output feature map is convolved, sending it to a semantic segmentation branch and a score map branch respectively to obtain a first feature map and a score map, wherein each pixel in the first feature map together with its corresponding channels is set as a RoW, and each pixel in the score map is the confidence corresponding to each RoW in the first feature map;
selecting on the first feature map the RoW corresponding to the pixel with the highest confidence in the score map, converting that RoW into a first matrix, performing binary classification on the first matrix to obtain a mask matrix, mapping the mask matrix back to the original image through an affine transformation, binarizing the values between 0 and 1 in the mask matrix with a set segmentation threshold to obtain the target mask of the tracked target in the original image, and obtaining the bounding box of the tracked target as the minimum circumscribed rectangle of the target mask.
2. The twin convolutional network-based target tracking and segmentation method of claim 1, wherein the semantic segmentation branch is composed of a 1 × 1 convolutional layer.
3. The twin convolutional network-based target tracking and segmentation method of claim 1, wherein the score map branch is composed of a 1 × 1 convolutional layer.
4. The twin convolutional network based target tracking and segmentation method of any one of claims 1 to 3, wherein the segmentation threshold is 0.35.
5. A twin convolutional network based target tracking and segmentation system, comprising: an image acquisition module, an image feature extraction module, a cross-correlation module, a first analysis module and a second analysis module,
the image acquisition module is used for acquiring input image information;
the image feature extraction module adopts a densely connected convolutional neural network to extract the features of an input image to obtain a target feature map and a tracking area feature map;
the cross-correlation module performs a cross-correlation operation on the target feature map and the tracking area feature map to obtain an output feature map;
the first analysis module convolves the output feature map and sends it to the semantic segmentation branch and the score map branch respectively to obtain a first feature map and a score map, wherein each pixel in the first feature map together with its corresponding channels is set as a RoW, and each pixel in the score map is the confidence corresponding to each RoW in the first feature map;
the second analysis module selects on the first feature map the RoW corresponding to the pixel with the highest confidence in the score map, converts that RoW into a first matrix, performs binary classification on the first matrix to obtain a mask matrix whose elements are values between 0 and 1 after the classification, maps the mask matrix back to the original image through an affine transformation, binarizes the values between 0 and 1 in the mask matrix with a set segmentation threshold to obtain the target mask of the tracked target in the original image, and obtains the bounding box of the tracked target as the minimum circumscribed rectangle of the target mask.
6. The twin convolutional network-based target tracking and segmentation system of claim 5, wherein the semantic segmentation branch is composed of a 1 × 1 convolutional layer.
7. The twin convolutional network-based target tracking and segmentation system of claim 5, wherein the score map branch is composed of a 1 × 1 convolutional layer.
8. A twin convolutional network based target tracking and segmentation system as claimed in any of claims 5 to 7 wherein the segmentation threshold is 0.35.
9. An intelligent terminal comprising a processor, an input device, an output device and a memory, the processor, the input device, the output device and the memory being interconnected, the memory being adapted to store a computer program, the computer program comprising program instructions, characterized in that the processor is configured to invoke the program instructions to perform the method steps according to any of claims 1 to 4.
10. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method steps according to any one of claims 1 to 4.
CN202010202511.7A 2020-03-20 2020-03-20 Target tracking and segmenting method, system and medium based on twin convolutional network Pending CN111415373A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010202511.7A CN111415373A (en) 2020-03-20 2020-03-20 Target tracking and segmenting method, system and medium based on twin convolutional network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010202511.7A CN111415373A (en) 2020-03-20 2020-03-20 Target tracking and segmenting method, system and medium based on twin convolutional network

Publications (1)

Publication Number Publication Date
CN111415373A true CN111415373A (en) 2020-07-14

Family

ID=71493191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010202511.7A Pending CN111415373A (en) 2020-03-20 2020-03-20 Target tracking and segmenting method, system and medium based on twin convolutional network

Country Status (1)

Country Link
CN (1) CN111415373A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347852A (en) * 2020-10-10 2021-02-09 上海交通大学 Target tracking and semantic segmentation method and device for sports video and plug-in
CN112949458A (en) * 2021-02-26 2021-06-11 北京达佳互联信息技术有限公司 Training method of target tracking segmentation model and target tracking segmentation method and device
CN113177943A (en) * 2021-06-29 2021-07-27 中南大学 Cerebral apoplexy CT image segmentation method
CN115063594A (en) * 2022-08-19 2022-09-16 清驰(济南)智能科技有限公司 Feature extraction method and device based on automatic driving
CN117876428A (en) * 2024-03-12 2024-04-12 金锐同创(北京)科技股份有限公司 Target tracking method, device, computer equipment and medium based on image processing

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108665485A (en) * 2018-04-16 2018-10-16 华中科技大学 A kind of method for tracking target merged with twin convolutional network based on correlation filtering
CN109191491A (en) * 2018-08-03 2019-01-11 华中科技大学 The method for tracking target and system of the twin network of full convolution based on multilayer feature fusion
CN109409371A (en) * 2017-08-18 2019-03-01 三星电子株式会社 The system and method for semantic segmentation for image
CN110188753A (en) * 2019-05-21 2019-08-30 北京以萨技术股份有限公司 One kind being based on dense connection convolutional neural networks target tracking algorism
CN110276285A (en) * 2019-06-13 2019-09-24 浙江工业大学 A kind of shipping depth gauge intelligent identification Method in uncontrolled scene video
CN110633632A (en) * 2019-08-06 2019-12-31 厦门大学 Weak supervision combined target detection and semantic segmentation method based on loop guidance

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109409371A (en) * 2017-08-18 2019-03-01 三星电子株式会社 The system and method for semantic segmentation for image
CN108665485A (en) * 2018-04-16 2018-10-16 华中科技大学 A kind of method for tracking target merged with twin convolutional network based on correlation filtering
CN109191491A (en) * 2018-08-03 2019-01-11 华中科技大学 The method for tracking target and system of the twin network of full convolution based on multilayer feature fusion
CN110188753A (en) * 2019-05-21 2019-08-30 北京以萨技术股份有限公司 One kind being based on dense connection convolutional neural networks target tracking algorism
CN110276285A (en) * 2019-06-13 2019-09-24 浙江工业大学 A kind of shipping depth gauge intelligent identification Method in uncontrolled scene video
CN110633632A (en) * 2019-08-06 2019-12-31 厦门大学 Weak supervision combined target detection and semantic segmentation method based on loop guidance

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Qiang Wang et al., "Fast Online Object Tracking and Segmentation: A Unifying Approach", 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), page 1328 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347852A (en) * 2020-10-10 2021-02-09 上海交通大学 Target tracking and semantic segmentation method and device for sports video and plug-in
CN112347852B (en) * 2020-10-10 2022-07-29 上海交通大学 Target tracking and semantic segmentation method and device for sports video and plug-in
CN112949458A (en) * 2021-02-26 2021-06-11 北京达佳互联信息技术有限公司 Training method of target tracking segmentation model and target tracking segmentation method and device
CN113177943A (en) * 2021-06-29 2021-07-27 中南大学 Cerebral apoplexy CT image segmentation method
CN113177943B (en) * 2021-06-29 2021-09-07 中南大学 Cerebral apoplexy CT image segmentation method
CN115063594A (en) * 2022-08-19 2022-09-16 清驰(济南)智能科技有限公司 Feature extraction method and device based on automatic driving
CN115063594B (en) * 2022-08-19 2022-12-13 清驰(济南)智能科技有限公司 Feature extraction method and device based on automatic driving
CN117876428A (en) * 2024-03-12 2024-04-12 金锐同创(北京)科技股份有限公司 Target tracking method, device, computer equipment and medium based on image processing
CN117876428B (en) * 2024-03-12 2024-05-17 金锐同创(北京)科技股份有限公司 Target tracking method, device, computer equipment and medium based on image processing

Similar Documents

Publication Publication Date Title
CN110060237B (en) Fault detection method, device, equipment and system
CN111415373A (en) Target tracking and segmenting method, system and medium based on twin convolutional network
US9367766B2 (en) Text line detection in images
CN111461170A (en) Vehicle image detection method and device, computer equipment and storage medium
CN112580668B (en) Background fraud detection method and device and electronic equipment
CN112380978B (en) Multi-face detection method, system and storage medium based on key point positioning
CN112364873A (en) Character recognition method and device for curved text image and computer equipment
CN113011253B (en) Facial expression recognition method, device, equipment and storage medium based on ResNeXt network
CN111368632A (en) Signature identification method and device
CN114972947B (en) Depth scene text detection method and device based on fuzzy semantic modeling
CN112749576B (en) Image recognition method and device, computing equipment and computer storage medium
CN113269752A (en) Image detection method, device terminal equipment and storage medium
CN113129298A (en) Definition recognition method of text image
CN110363762B (en) Cell detection method, cell detection device, intelligent microscope system and readable storage medium
CN112070035A (en) Target tracking method and device based on video stream and storage medium
CN111862159A (en) Improved target tracking and segmentation method, system and medium for twin convolutional network
CN109213515B (en) Multi-platform lower buried point normalization method and device and electronic equipment
CN115424293A (en) Living body detection method, and training method and device of living body detection model
CN115223173A (en) Object identification method and device, electronic equipment and storage medium
CN115345895A (en) Image segmentation method and device for visual detection, computer equipment and medium
CN112785601B (en) Image segmentation method, system, medium and electronic terminal
CN111582057B (en) Face verification method based on local receptive field
CN115497092A (en) Image processing method, device and equipment
CN115050066A (en) Face counterfeiting detection method, device, terminal and storage medium
CN113033256B (en) Training method and device for fingertip detection model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 266400 No. 77, Lingyan Road, LINGSHANWEI sub district office, Huangdao District, Qingdao City, Shandong Province

Applicant after: Issa Technology Co.,Ltd.

Applicant after: QINGDAO YISA DATA TECHNOLOGY Co.,Ltd.

Address before: 266400 No. 77, Lingyan Road, LINGSHANWEI sub district office, Huangdao District, Qingdao City, Shandong Province

Applicant before: Qingdao Issa Technology Co.,Ltd.

Applicant before: QINGDAO YISA DATA TECHNOLOGY Co.,Ltd.

Address after: 266400 No. 77, Lingyan Road, LINGSHANWEI sub district office, Huangdao District, Qingdao City, Shandong Province

Applicant after: Qingdao Issa Technology Co.,Ltd.

Applicant after: QINGDAO YISA DATA TECHNOLOGY Co.,Ltd.

Address before: 100020 room 108, 1 / F, building 17, yard 6, Jingshun East Street, Chaoyang District, Beijing

Applicant before: BEIJING YISA TECHNOLOGY Co.,Ltd.

Applicant before: QINGDAO YISA DATA TECHNOLOGY Co.,Ltd.