CN114972422A - Image sequence motion occlusion detection method and device, memory and processor - Google Patents

Image sequence motion occlusion detection method and device, memory and processor Download PDF

Info

Publication number
CN114972422A
CN114972422A CN202210491032.0A CN202210491032A CN114972422A CN 114972422 A CN114972422 A CN 114972422A CN 202210491032 A CN202210491032 A CN 202210491032A CN 114972422 A CN114972422 A CN 114972422A
Authority
CN
China
Prior art keywords
occlusion
boundary
feature map
motion
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210491032.0A
Other languages
Chinese (zh)
Inventor
董冲
方挺
韩家明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University Of Technology Science Park Co ltd
Original Assignee
Anhui University Of Technology Science Park Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University Of Technology Science Park Co ltd filed Critical Anhui University Of Technology Science Park Co ltd
Priority to CN202210491032.0A priority Critical patent/CN114972422A/en
Publication of CN114972422A publication Critical patent/CN114972422A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/20Image enhancement or restoration by the use of local operators
    • G06T5/30Erosion or dilatation, e.g. thinning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Abstract

The application discloses a method, a device, a memory and a processor for detecting the motion occlusion of an image sequence, wherein the method comprises the steps of obtaining any two continuous frames of images; acquiring a dense optical flow field and a motion boundary area between the two frames of images; and analyzing the dense optical flow field and the motion boundary region as input by using a semantic segmentation deep neural network model to obtain an occlusion detection result output by the semantic segmentation deep neural network model. In the semantic segmentation depth neural network model, a multilayer accumulation loss function based on the information weight of the occlusion boundary space is adopted, and the neighborhood pixel space correlation of the occlusion boundary is embedded into the learning process, so that the network model can be converged to the details such as the motion occlusion boundary, and the like, and the constructed network model is suitable for motion occlusion detection and obtains the occlusion detection effect with clear boundary.

Description

Image sequence motion occlusion detection method and device, memory and processor
Technical Field
The application relates to a moving image sequence processing technology, in particular to a moving image sequence occlusion detection method based on a semantic segmentation deep neural network architecture.
Background
Image sequence motion occlusion refers to the phenomenon that a portion of pixels are visible in one frame of image and not visible in another frame of image. The method is an important task in the field of image processing and computer vision research, and aims to guide other computer vision tasks such as optical flow estimation, image registration, target segmentation, target tracking and the like to carry out accurate calculation by detecting occlusion areas between different objects and scenes or between different parts of different objects in an image sequence. The research results are widely applied to military science and technology, medical image processing and analysis, aerospace, satellite cloud picture analysis and the like.
The traditional image sequence motion occlusion detection method is to compare forward and backward motion estimation by utilizing motion symmetry or detect occlusion by establishing models such as geometric constraint, matching constraint and the like, but the methods have the problems of occlusion areas and occlusion boundary blurring when facing complex scenes or complex motion.
Disclosure of Invention
The embodiment of the application provides a method, a device, a memory and a processor for detecting motion occlusion of an image sequence, so as to at least solve the technical problems of occlusion areas and occlusion boundary blurring existing in motion occlusion of the image motion sequence.
According to an aspect of the present application, there is provided an image sequence motion occlusion detection method, including:
acquiring any two continuous frames of images;
acquiring a dense optical flow field and a motion boundary area between the two frames of images;
analyzing the dense optical flow field and the motion boundary region as input by using a semantic segmentation deep neural network model to obtain an occlusion detection result output by the semantic segmentation deep neural network model;
wherein a loss value L at a k-th layer of a decoder in the semantically segmented deep neural network model k The following were used:
Figure BDA0003630942580000011
in the above formula, the parameters have the following meanings:
x denotes pixel coordinates, and Ω denotes a real number domain;
k x a predicted value in each channel of the occlusion feature map input for the last layer of the decoder;
a(k x ) Represents said k x Mapping to the (0, 1) interval to form an activation value of a shielding mapping value;
o (x) represents the occlusion label of each pixel x, taking 0 or 1;
ω (x) represents a weight, and
Figure BDA0003630942580000021
o is an occlusion area, B is an occlusion boundary area;
ω 0 (x) Is the occlusion region weight;
ω b an initial weight of the occlusion boundary region;
d (σ) is a distance function based on the search window radius σ.
Further, in the present invention, the D (σ) is obtained by the following formula:
Figure BDA0003630942580000022
wherein:
d 1 (x) Is the shielding edgeDistance from pixels in the boundary region to the occlusion boundary;
d 2 (x) Is the distance of the point to the occlusion border region within the search window.
Further, in the present invention, the occlusion boundary area is obtained by:
obtaining an occlusion boundary from the real occlusion map;
performing mask expansion on the shielding boundary to obtain an expanded shielding area;
and subtracting the expanded occlusion area from the real occlusion image to obtain the occlusion boundary area.
Further, in the present invention, the loss value of the semantically segmented deep neural network model is
Figure BDA0003630942580000023
Wherein, ω is k Representing the weight of each layer of the occlusion prediction graph.
Further, in the present invention, ω is said k Each layer was taken to be the same.
Further, in the present invention, the structure of each layer of the decoder is as follows:
4 deconvolution modules stacked successively, wherein each deconvolution module is used for sequentially executing one deconvolution operation of 4 × 4 and two convolution operations of 7 × 7 to obtain a feature map after the deconvolution operation; after each convolution operation, performing normalization processing and activation processing once;
the splicing module is used for splicing the feature map generated by the corresponding layer of the encoder, the feature map obtained by the layer of the decoder after the deconvolution operation and the up-sampled occlusion feature map processed by the layer before the decoder to obtain a spliced feature map, and executing a 3 × 3 convolution operation on the spliced feature map to generate an occlusion feature map; the occlusion feature map is to be processed into an up-sampled occlusion feature map that is to be up-sampled in a next decoding layer by doubling a resolution via an up-sampling operation;
when the splicing module in the first layer of the decoder performs splicing to obtain a spliced feature map, splicing the feature map of the coding part with the feature map operated by the deconvolution module to obtain the spliced feature map.
Further, in the present invention, the acquiring a motion boundary region between the two images includes:
detecting a moving boundary of the dense optical flow field with an edge detector;
and expanding the motion boundary of the dense optical flow field by using an expansion mask to obtain a motion boundary region area.
In a second aspect of the present application, there is provided an image sequence motion occlusion detection apparatus, including:
the first acquisition module is used for acquiring any two continuous frames of images;
the second acquisition module is used for acquiring a dense optical flow field and a motion boundary between the two frames of images;
the analysis output module is used for analyzing the dense optical flow field and the motion boundary as input by utilizing a semantic segmentation deep neural network model to obtain an occlusion detection result output by the semantic segmentation deep neural network model;
wherein a loss value L at a k-th layer of a decoder in the semantically segmented deep neural network model k The following were used:
Figure BDA0003630942580000031
in the above formula, the parameters have the following meanings:
x denotes pixel coordinates, and Ω denotes a real number domain;
k x a predicted value in each channel of the occlusion feature map input for the last layer of the decoder;
a(k x ) Represents said k x Mapping to the (0, 1) interval to form an activation value of a shielding mapping value;
o (x) represents the occlusion label for each pixel x, taking 0 or 1;
ω (x) represents a weight, and
Figure BDA0003630942580000032
o is an occlusion area, B is an occlusion boundary area;
ω 0 (x) Is the occlusion region weight;
ω b an initial weight of the occlusion boundary region;
d (σ) is a distance function based on the search window radius σ.
In a third aspect of the present application, there is provided a memory for storing software for performing the method of the first aspect of the present application.
In a fourth aspect of the present application, there is provided a processor for processing software for performing the method of the first aspect of the present application.
Has the beneficial effects that:
the application provides an image sequence motion occlusion detection method, which comprises the steps of obtaining two continuous frames of images; acquiring a dense optical flow field and a motion boundary area between the two frames of images; and analyzing the dense optical flow field and the motion boundary region as input by using a semantic segmentation deep neural network model to obtain an occlusion detection result output by the semantic segmentation deep neural network model. In the semantic segmentation depth neural network model, a multilayer accumulation loss function based on the information weight of the occlusion boundary space is adopted, and the neighborhood pixel space correlation of the occlusion boundary is embedded into the learning process, so that the network model can be converged to the details such as the motion occlusion boundary, and the like, and the constructed network model is suitable for motion occlusion detection and obtains the occlusion detection effect with clear boundary.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, are included to provide a further understanding of the application, and the description of the exemplary embodiments of the application are intended to be illustrative of the application and are not intended to limit the application. In the drawings:
FIG. 1 is a flow chart of a method for detecting occlusion due to motion of an image sequence according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a semantic segmentation deep neural network model according to an embodiment of the present application;
FIG. 3 is a first frame picture of a sequence of bamboooo _1 pictures in an MPI _ Sintel dataset;
FIG. 4 is a second frame picture of a sequence of bamboo _1 pictures in the MPI _ Sintel dataset;
FIG. 5 is an occlusion diagram of a sequence of bamboooo _1 images in an MPI _ Sintel data set calculated by a method according to an embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.
The embodiment of the application provides an image sequence motion occlusion detection method, which creatively considers motion occlusion as semantic information among image sequences, adopts an encoder-decoder structure of a semantic segmentation depth neural network model to construct an occlusion detection neural network module, analyzes occlusion information in an image sequence optical flow field, and designs a loss function more fitting a motion occlusion scene, thereby realizing accurate detection of the motion occlusion.
As shown in fig. 1, a method for detecting occlusion due to motion in an image sequence according to an embodiment of the present invention is shown, and the method includes the following steps:
s102, acquiring any two continuous frames of images.
As shown in fig. 3 and 4, the 2 pictures provided by the embodiment of the present application are selected from the frame _0043 image and the frame _0044 image of the bamboooo _1 image sequence in the MPI _ sinter data set as the first frame image and the second frame image, and the grayscale image of the 2 frame image is presented in the drawing.
And S104, acquiring a dense optical flow field and a motion boundary area between the two frames of images.
In the embodiment, an optical flow convolution neural network is adopted to detect the dense optical flow field, and then a Sobel edge detector is used to detect the motion boundary of the dense optical flow field; and finally, expanding the motion boundary of the dense optical flow field by using an h x h expansion mask to obtain a motion boundary area.
And S106, analyzing the dense optical flow field and the motion boundary as input by using a semantic segmentation depth neural network model, and obtaining an occlusion detection result output by the semantic segmentation depth neural network model.
In this embodiment, as shown in fig. 2, the feature channel of the semantic segmentation deep neural network model is selected as 3 layers.
Wherein, the structure of each layer of the encoder is as follows:
4 convolution modules stacked successively, each convolution module being configured to sequentially perform a convolution operation by 3 × 3, wherein each convolution operation is followed by a normalization process and an activation process;
a pooling module to perform a 2 x 2 max pooling operation.
Under the effect of the above-mentioned encoder structure, the number of the characteristic channels is doubled in each down-sampling process, and is 16, 32, 64, 128 and 256 respectively.
Wherein, the structure of each layer of the decoder is as follows:
4 deconvolution modules stacked successively, wherein each deconvolution module is used for sequentially executing one deconvolution operation of 4 × 4 and two convolution operations of 7 × 7 to obtain a feature map after the deconvolution operation; wherein, normalization processing and activation processing are carried out once after each convolution operation;
the splicing module is used for splicing the feature map generated by the corresponding layer of the encoder, the feature map obtained by the layer of the decoder after the deconvolution operation and the up-sampled occlusion feature map processed by the layer before the decoder to obtain a spliced feature map, and executing a 3 × 3 convolution operation on the spliced feature map to generate an occlusion feature map; the occlusion feature map is to be processed into an up-sampled occlusion feature map that is to be up-sampled in a next decoding layer by doubling a resolution via an up-sampling operation;
when the splicing module in the first layer of the decoder performs splicing to obtain a spliced feature map, splicing the feature map of the coding part with the feature map operated by the deconvolution module to obtain the spliced feature map.
And each deconvolution module operation reduces the number of channels to be 256, 128, 64, 32, 16 and 1 respectively, wherein the last layer does not introduce a deconvolution module, and a single-channel occlusion feature map is generated by 3 multiplied by 3 convolution.
Occlusion detection is a two-class semantic problem, usually employing a loss function based on binary cross entropy to train a neural network. However, motion occlusion pixels in an image sequence generally have obvious sample skew, and when the number of non-occlusion pixels is much larger than that of occlusion pixels, the network loss value cannot well reflect the accuracy of occlusion pixel detection; meanwhile, the designed network needs to be well converged to the details such as the motion occlusion boundary. Based on the above two considerations, in this embodiment, a multilayer accumulated loss function based on the information weight of the occlusion boundary space is designed, and specifically, the loss value L at the kth layer of the decoder in the semantic segmentation deep neural network model k The following were used:
Figure BDA0003630942580000061
in the above formula, the parameters have the following meanings:
x denotes pixel coordinates, and Ω denotes a real number domain;
k x a predicted value in each channel of the occlusion feature map input for the last layer of the decoder;
a(k x ) Indicating that k is to be signed by a Sigmoid function x Mapping to the (0, 1) interval to form an activation value of a shielding mapping value;
o (x) represents an occlusion label for each pixel x, taking 0 or 1, for distinguishing whether it is occluded or not;
ω (x) represents a weight, and
Figure BDA0003630942580000062
o is an occlusion area, and B is an occlusion boundary area;
ω 0 (x) Is the occlusion region weight;
ω b an initial weight of the occlusion boundary region;
d (σ) is a distance function based on the search window radius σ.
In the present embodiment, the D (σ) is obtained by the following formula:
Figure BDA0003630942580000063
wherein:
d 1 (x) The distance from the pixel in the occlusion boundary area to the occlusion boundary is obtained;
d 2 (x) Is the distance of the point to the occlusion border region within the search window.
The method is based on a semantic segmentation depth neural network architecture, improves the accuracy of a neural network model in detecting the occlusion area and the occlusion boundary by introducing motion boundary input and designing a multilayer accumulation loss function based on the occlusion boundary space information weight, has higher calculation precision and better adaptability to complex scenes and complex motion image sequences, and can be effectively applied to an image sequence motion analysis visual task.
In this embodiment, the occlusion boundary area is obtained by the following method:
obtaining an occlusion boundary from the real occlusion map;
performing mask expansion on the shielding boundary to obtain an expanded shielding area;
and subtracting the expanded occlusion region from the real occlusion image to obtain the occlusion boundary region.
The embodiment adopts supervised learning, the real occlusion map is a target in the machine learning, and the real occlusion map is also obtained from the MPI _ sinter data set. In the embodiment of the application, the occlusion boundary region is obtained according to the real occlusion graph and acts on the distribution of the weight, so that the method in the embodiment of the application can clearly express at the occlusion boundary.
In this embodiment, the occlusion real map is downsampled according to the size of each layer of occlusion prediction map, the loss function defined above is used, and finally the loss value of the semantic segmentation deep neural network model is obtained as
Figure BDA0003630942580000071
Wherein, ω is k Representing the weight of each layer of the occlusion prediction graph.
In this embodiment, ω is k The same homogeneous zone was taken for each layer 0.5.
According to the occlusion detection result in fig. 5, the method improves the accuracy of the motion occlusion detection of the image sequence, has higher motion occlusion detection precision for complex scenes and complex motion image sequences, and has wide application prospects in the fields of medical segmentation, video monitoring and the like.
According to a second aspect of the present application, there is provided an image sequence motion occlusion detection apparatus, comprising:
the first acquisition module is used for acquiring any two continuous frames of images;
the second acquisition module is used for acquiring a dense optical flow field and a motion boundary area between the two frames of images;
the analysis output module is used for analyzing the dense optical flow field and the motion boundary region as input by utilizing a semantic segmentation deep neural network model to obtain an occlusion detection result output by the semantic segmentation deep neural network model;
wherein a loss value L at a k-th layer of a decoder in the semantically segmented deep neural network model k The following were used:
Figure BDA0003630942580000072
in the above formula, the meaning of each parameter is as follows:
x denotes pixel coordinates, and Ω denotes a real number domain;
k x a predicted value in each channel of the occlusion feature map input for the last layer of the decoder;
a(k x ) Represents said k x Mapping to the (0, 1) interval to form an activation value of a shielding mapping value;
o (x) represents the occlusion label for each pixel x, taking 0 or 1;
ω (x) represents a weight, and
Figure BDA0003630942580000073
o is an occlusion area, B is an occlusion boundary area;
ω 0 (x) Is the occlusion region weight;
ω b an initial weight of the occlusion boundary region;
d (σ) is a distance function based on the search window radius σ.
According to yet another aspect of the application, a processor is provided for executing software for executing the method for detecting occlusion in motion of an image sequence.
According to yet another aspect of the present application, a memory is provided for storing software for executing the method for detecting occlusion in motion of an image sequence.
It should be noted that, the method for detecting motion occlusion in an image sequence executed by the software is the same as the method for detecting motion occlusion in an image sequence described above, and is not described herein again.
In this embodiment, an electronic device is provided, comprising a memory in which a computer program is stored and a processor configured to run the computer program to perform the method in the above embodiments.
These computer programs may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks, and corresponding steps may be implemented by different modules.
The programs described above may be run on a processor or may also be stored in memory (or referred to as computer-readable media), which includes both non-transitory and non-transitory, removable and non-removable media, that implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement or the like made within the spirit and principle of the present application shall be included in the scope of the claims of the present application.

Claims (10)

1. The image sequence motion occlusion detection method is characterized by comprising the following steps:
acquiring any two continuous frame images;
acquiring a dense optical flow field and a motion boundary area between the two frames of images;
analyzing the dense optical flow field and the motion boundary region as input by using a semantic segmentation deep neural network model to obtain an occlusion detection result output by the semantic segmentation deep neural network model;
wherein a loss value L at a k-th layer of a decoder in the semantically segmented deep neural network model k The following were used:
Figure FDA0003630942570000011
in the above formula, the parameters have the following meanings:
x denotes pixel coordinates, and Ω denotes a real number domain;
k x a predicted value in each channel of the occlusion feature map input for the last layer of the decoder;
a(k x ) Represents said k x Mapping to the (0, 1) interval to form an activation value of a shielding mapping value;
o (x) represents the occlusion label for each pixel x, taking 0 or 1;
ω (x) represents a weight, and
Figure FDA0003630942570000012
o is an occlusion area, B is an occlusion boundary area;
ω 0 (x) Is the occlusion region weight;
ω b an initial weight of the occlusion boundary region;
d (σ) is a distance function based on the search window radius σ.
2. The method of claim 1, wherein: the D (σ) is obtained by the following formula:
Figure FDA0003630942570000013
wherein:
d 1 (x)the distance from the pixel in the occlusion boundary area to the occlusion boundary is obtained;
d 2 (x) Is the distance of the point to the occlusion border region within the search window.
3. The method of claim 1, wherein: the occlusion boundary area is obtained by the following method:
obtaining an occlusion boundary from the real occlusion map;
performing mask expansion on the shielding boundary to obtain an expanded shielding area;
and subtracting the expanded occlusion region from the real occlusion image to obtain the occlusion boundary region.
4. The method of claim 1, wherein: the loss value of the semantic segmentation deep neural network model is
Figure FDA0003630942570000021
Wherein, ω is k And representing the weight of each layer of the occlusion prediction graph.
5. The method of claim 4, wherein: the omega k Each layer was taken to be the same.
6. The method of claim 5, wherein: the structure of each layer of the decoder is as follows:
4 deconvolution modules stacked successively, wherein each deconvolution module is used for sequentially executing one deconvolution operation of 4 × 4 and two convolution operations of 7 × 7 to obtain a feature map after the deconvolution operation; wherein, normalization processing and activation processing are carried out once after each convolution operation;
the splicing module is used for splicing the feature map generated by the corresponding layer of the encoder, the feature map obtained by the layer of the decoder after the deconvolution operation and the up-sampled occlusion feature map processed by the layer before the decoder to obtain a spliced feature map, and executing a 3 × 3 convolution operation on the spliced feature map to generate an occlusion feature map; the occlusion feature map is to be processed into an up-sampled occlusion feature map that is to be up-sampled in a next decoding layer by doubling a resolution via an up-sampling operation;
when the splicing module in the first layer of the decoder performs splicing to obtain a spliced feature map, splicing the feature map of the coding part with the feature map operated by the deconvolution module to obtain the spliced feature map.
7. The method according to any one of claims 1 to 6, characterized in that: the acquiring a motion boundary region between the two frame images comprises:
detecting a moving boundary of the dense optical flow field with an edge detector;
and expanding the motion boundary of the dense optical flow field by using an expansion mask to obtain a motion boundary area.
8. Image sequence motion shelters from detection device which characterized in that: the method comprises the following steps:
the first acquisition module is used for acquiring any two continuous frames of images;
the second acquisition module is used for acquiring a dense optical flow field and a motion boundary area between the two frames of images;
the analysis output module is used for analyzing the dense optical flow field and the motion boundary region as input by utilizing a semantic segmentation deep neural network model to obtain an occlusion detection result output by the semantic segmentation deep neural network model;
wherein a loss value L at a k-th layer of a decoder in the semantically segmented deep neural network model k The following were used:
Figure FDA0003630942570000022
in the above formula, the parameters have the following meanings:
x denotes pixel coordinates, and Ω denotes a real number domain;
k x a predicted value in each channel of the occlusion feature map input for the last layer of the decoder;
a(k x ) Represents said k x Mapping to the (0, 1) interval to form an activation value of a shielding mapping value;
o (x) represents the occlusion label for each pixel x, taking 0 or 1;
ω (x) represents a weight, and
Figure FDA0003630942570000031
o is an occlusion area, B is an occlusion boundary area;
ω 0 (x) Is the occlusion region weight;
ω b an initial weight of the occlusion boundary region;
d (σ) is a distance function based on the search window radius σ.
9. A memory for storing software, characterized in that the software is adapted to perform the method of any of claims 1-7.
10. A processor for processing software, characterized in that the software is adapted to perform the method of any of claims 1-7.
CN202210491032.0A 2022-05-07 2022-05-07 Image sequence motion occlusion detection method and device, memory and processor Pending CN114972422A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210491032.0A CN114972422A (en) 2022-05-07 2022-05-07 Image sequence motion occlusion detection method and device, memory and processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210491032.0A CN114972422A (en) 2022-05-07 2022-05-07 Image sequence motion occlusion detection method and device, memory and processor

Publications (1)

Publication Number Publication Date
CN114972422A true CN114972422A (en) 2022-08-30

Family

ID=82980963

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210491032.0A Pending CN114972422A (en) 2022-05-07 2022-05-07 Image sequence motion occlusion detection method and device, memory and processor

Country Status (1)

Country Link
CN (1) CN114972422A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200084427A1 (en) * 2018-09-12 2020-03-12 Nvidia Corporation Scene flow estimation using shared features
CN110992367A (en) * 2019-10-31 2020-04-10 北京交通大学 Method for performing semantic segmentation on image with shielding area
CN111401308A (en) * 2020-04-08 2020-07-10 蚌埠学院 Fish behavior video identification method based on optical flow effect
CN112347852A (en) * 2020-10-10 2021-02-09 上海交通大学 Target tracking and semantic segmentation method and device for sports video and plug-in
CN113888604A (en) * 2021-09-27 2022-01-04 安徽清新互联信息科技有限公司 Target tracking method based on depth optical flow
US20220101539A1 (en) * 2020-09-30 2022-03-31 Qualcomm Incorporated Sparse optical flow estimation

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200084427A1 (en) * 2018-09-12 2020-03-12 Nvidia Corporation Scene flow estimation using shared features
CN110992367A (en) * 2019-10-31 2020-04-10 北京交通大学 Method for performing semantic segmentation on image with shielding area
CN111401308A (en) * 2020-04-08 2020-07-10 蚌埠学院 Fish behavior video identification method based on optical flow effect
US20220101539A1 (en) * 2020-09-30 2022-03-31 Qualcomm Incorporated Sparse optical flow estimation
CN112347852A (en) * 2020-10-10 2021-02-09 上海交通大学 Target tracking and semantic segmentation method and device for sports video and plug-in
CN113888604A (en) * 2021-09-27 2022-01-04 安徽清新互联信息科技有限公司 Target tracking method based on depth optical flow

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU YU 等: ""Better Dense Trajectories by Motion in Videos"", 《 IEEE TRANSACTIONS ON CYBERNETICS》, vol. 49, no. 1, 28 November 2017 (2017-11-28), pages 159 - 170, XP011700745, DOI: 10.1109/TCYB.2017.2769097 *
葛利跃 等: ""基于运动优化语义分割的变分光流计算方法"", 《模式识别与人工智能》, vol. 34, no. 7, 15 July 2021 (2021-07-15), pages 631 - 645 *

Similar Documents

Publication Publication Date Title
Dolson et al. Upsampling range data in dynamic environments
US8755563B2 (en) Target detecting method and apparatus
JP2008518331A (en) Understanding video content through real-time video motion analysis
CN112329702B (en) Method and device for rapid face density prediction and face detection, electronic equipment and storage medium
CN109300151B (en) Image processing method and device and electronic equipment
CN109377499B (en) Pixel-level object segmentation method and device
US20170018106A1 (en) Method and device for processing a picture
CN111311611B (en) Real-time three-dimensional large-scene multi-object instance segmentation method
CN111986472B (en) Vehicle speed determining method and vehicle
WO2016120132A1 (en) Method and apparatus for generating an initial superpixel label map for an image
CN111382647B (en) Picture processing method, device, equipment and storage medium
CN111415300A (en) Splicing method and system for panoramic image
CN113269722A (en) Training method for generating countermeasure network and high-resolution image reconstruction method
Patil et al. End-to-end recurrent generative adversarial network for traffic and surveillance applications
Kim et al. High-quality depth map up-sampling robust to edge noise of range sensors
CN112465029A (en) Instance tracking method and device
CN112906614A (en) Pedestrian re-identification method and device based on attention guidance and storage medium
US9659372B2 (en) Video disparity estimate space-time refinement method and codec
CN111260686B (en) Target tracking method and system for anti-shielding multi-feature fusion of self-adaptive cosine window
CN111881914A (en) License plate character segmentation method and system based on self-learning threshold
Lee et al. Integrating wavelet transformation with Markov random field analysis for the depth estimation of light‐field images
CN116468968A (en) Astronomical image small target detection method integrating attention mechanism
CN114972422A (en) Image sequence motion occlusion detection method and device, memory and processor
EP4235492A1 (en) A computer-implemented method, data processing apparatus and computer program for object detection
Dong et al. Monocular visual-IMU odometry using multi-channel image patch exemplars

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination