CN112907626A - Moving object extraction method based on multi-source information from satellite hyper-temporal data - Google Patents


Info

Publication number
CN112907626A
CN112907626A (application CN202110172481.4A)
Authority
CN
China
Prior art keywords
image
target
frame
moving
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110172481.4A
Other languages
Chinese (zh)
Inventor
鹿明 (Lu Ming)
李峰 (Li Feng)
辛蕾 (Xin Lei)
杨雪 (Yang Xue)
鲁啸天 (Lu Xiaotian)
张南 (Zhang Nan)
任志聪 (Ren Zhicong)
肖化超 (Xiao Huachao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Academy of Space Technology CAST
Original Assignee
China Academy of Space Technology CAST
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Academy of Space Technology CAST filed Critical China Academy of Space Technology CAST
Priority claimed from application CN202110172481.4A
Publication of CN112907626A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/207 - Analysis of motion for motion estimation over a hierarchy of resolutions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10016 - Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a moving object extraction method based on multi-source information from satellite hyper-temporal data, comprising the following steps: a. collecting hyper-temporal images of an area and preprocessing them; b. preliminarily extracting the moving objects in the images, and extracting the road objects on which the moving objects depend; c. completing further extraction of the moving objects according to the preliminarily extracted moving objects and the road objects; d. performing morphological processing on the object result image extracted in step c to obtain the final result. On the basis of the spectral, textural and time-series characteristics of the video satellite, the method extracts the spatial geometry, motion speed and road environment of the moving objects and uses this information jointly for moving-object detection, so that false moving objects caused by parallax, registration errors and random noise are avoided, the accuracy of moving-object detection is effectively improved, and the false detection rate is reduced.

Description

Moving object extraction method based on multi-source information from satellite hyper-temporal data
Technical Field
The invention relates to a moving object extraction method based on multi-source information from satellite hyper-temporal data.
Background
Moving object detection is a technology formed by the cross-fusion of computer vision, remote sensing image processing, artificial intelligence and related fields, and is an important research topic in situation awareness. It can perceive not only the presence of objects but also their dynamic trends, and plays an extremely important role in real-time reconnaissance, monitoring and control in both the military and civil fields. The video satellite is a novel Earth-observation technology that captures continuous images from a moving satellite platform and provides a reliable data source for moving-object detection and situation awareness. Owing to the video satellite's large-area coverage, high spatial resolution and continuous video imaging, real-time dynamic information about the Earth's surface can be acquired rapidly.
Owing to the relative motion between the satellite platform and the Earth's surface, different video frames exhibit translation, rotation, distortion and stretching caused by factors such as terrain height. Tall objects such as high-rise buildings and steel towers therefore present obvious pseudo-motion characteristics. If moving objects in satellite video are detected directly with traditional algorithms such as the inter-frame difference method, background modelling or the optical flow method, such tall objects may be misjudged as moving objects. Moreover, the pseudo motion of tall objects, caused by ground-object edges and the parallax changes produced by image translation, is difficult to identify through simple improvements of the traditional methods.
Disclosure of Invention
The invention aims to provide a moving object extraction method based on multi-source information from satellite hyper-temporal data.
To achieve the above aim, the present invention provides a moving object extraction method based on multi-source information from satellite hyper-temporal data, comprising the following steps:
a. collecting hyper-temporal images of an area and preprocessing them;
b. preliminarily extracting the moving objects in the images, and extracting the road objects on which the moving objects depend;
c. completing further extraction of the moving objects according to the preliminarily extracted moving objects and the road objects;
d. performing morphological processing on the object result image extracted in step (c) to obtain the final result.
According to an aspect of the present invention, in step (a), a current frame is collected together with one frame within a period before it and one frame within a period after it, forming a three-frame sequence of previous frame, current frame and next frame;
the acquisition interval of the three frames is determined by the speed and length of the moving objects and by the frame rate of the video, such that the displacement of a moving object between adjacent extracted frames is between 10 m and 100 m.
According to one aspect of the invention, the preprocessing in step (a) performs inter-frame registration of the frames extracted before and after the current frame, taking the current frame as the reference;
the inter-frame registration comprises reading the current frame and the frame before or after it into arrays, and performing keypoint detection and feature description on both frames using the SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features), ORB (Oriented FAST and Rotated BRIEF) or AKAZE algorithm;
feature matching is performed on the keypoints of the two frames with a matcher, the matching method being to compute the descriptor distance between each pair of keypoints and return the k best matches of minimum distance for each keypoint;
a homography matrix of the transformation between the two images is then computed from the matched point pairs and used to warp the frame before or after the current frame, the RANSAC algorithm being applied during warping to remove outlier point pairs.
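The homography-from-correspondences step can be sketched in plain numpy via the Direct Linear Transform. This is a minimal least-squares version that assumes the outlier pairs have already been removed (the patent uses RANSAC for that); in practice a library routine with built-in robust estimation would be used instead:

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the 3x3 homography H with dst ~ H @ src from >= 4 point
    correspondences via the Direct Linear Transform (least squares).
    src, dst: (N, 2) arrays of matched keypoint coordinates.
    No RANSAC here; clean matches are assumed."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # the null-space direction of A (last right-singular vector) is h
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]              # normalise so H[2, 2] == 1

# sanity check: points related by a pure translation (+5, -3)
src = np.array([[0, 0], [10, 0], [10, 10], [0, 10], [3, 7]], float)
dst = src + np.array([5.0, -3.0])
H = homography_dlt(src, dst)
```

The recovered matrix is then applied to warp the neighbouring frame onto the current frame's geometry.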
According to an aspect of the present invention, in step (b), the moving objects in the image are preliminarily extracted in two ways, based respectively on the velocity attribute and the time-series attribute of the moving objects;
when extracting moving objects based on their velocity attribute, dense optical flow is computed between the current frame and the frame before it using an optical flow method to obtain the flow state of every pixel: a pixel whose speed and direction are unchanged is taken as background, and otherwise as a foreground object;
when extracting moving objects based on their time-series attribute, the foreground and background are preliminarily separated with a three-frame difference method, exploiting the motion characteristics of the moving objects along the time series;
the road objects on which the moving objects depend are extracted from the image using a deep-learning-based D-LinkNet network.
According to one aspect of the invention, when extracting moving objects based on their velocity attribute, the previous frame and the current frame requiring optical flow computation are input in sequence, and an image scale is specified for building a pyramid from each image;
the number of pyramid levels, the averaging window size, the number of algorithm iterations at each pyramid level, the number of neighbouring pixels used for the polynomial expansion at each pixel, the Gaussian standard deviation for smoothing the derivatives, and the initial flow approximation are then determined;
the computed optical flow is converted from Cartesian coordinates to polar coordinates to obtain the speed and direction of every pixel;
according to the computed speed and direction, a pixel whose speed and direction values are both 0 is marked as background, and otherwise as a foreground object.
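The Cartesian-to-polar conversion and the background rule can be sketched in numpy; this stands in for the library call a real pipeline would use, and the flow-array layout (H, W, 2) is an assumption:

```python
import numpy as np

def flow_to_foreground(flow):
    """Convert a dense flow field of shape (H, W, 2) from Cartesian (dx, dy)
    to polar (speed, direction) and mark foreground pixels: a pixel whose
    speed and direction are both 0 is background, as in the claims."""
    dx, dy = flow[..., 0], flow[..., 1]
    speed = np.hypot(dx, dy)           # magnitude of motion per pixel
    direction = np.arctan2(dy, dx)     # direction of motion in radians
    foreground = (speed > 0) | (direction != 0)
    return speed, direction, foreground

# toy 2x2 flow field: only the bottom-right pixel moves
flow = np.zeros((2, 2, 2))
flow[1, 1] = (3.0, 4.0)                # dx = 3, dy = 4, so speed = 5
speed, direction, fg = flow_to_foreground(flow)
```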
According to an aspect of the present invention, when extracting the moving objects based on their time-series attribute in step (b), all three collected frames are read and converted from RGB images to greyscale images;
inter-frame differences are taken between the greyscale current frame and the greyscale frames before and after it, yielding two difference images;
a threshold is set and each difference image is binarised, giving two binary images that separate the foreground objects from the background;
an AND operation is applied to the two binary images, and the moving objects are extracted from the intersection image.
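The three-frame difference can be sketched directly in numpy; the threshold of 40 is the value the embodiment uses later, and the helper name is illustrative:

```python
import numpy as np

def three_frame_difference(prev, cur, nxt, thresh=40):
    """Three-frame difference on greyscale images (2-D uint8 arrays):
    binarise |cur - prev| and |cur - nxt| at `thresh` and AND the two
    masks, keeping only pixels that move relative to both neighbours."""
    d1 = np.abs(cur.astype(np.int16) - prev.astype(np.int16))
    d2 = np.abs(cur.astype(np.int16) - nxt.astype(np.int16))
    b1 = d1 > thresh                   # moving w.r.t. previous frame
    b2 = d2 > thresh                   # moving w.r.t. next frame
    return b1 & b2                     # present in both -> moving object

# toy 1x3 strip: a bright blob sits at pixel 1 only in the current frame
prev = np.array([[200, 10, 10]], np.uint8)
cur = np.array([[10, 200, 10]], np.uint8)
nxt = np.array([[10, 10, 200]], np.uint8)
mask = three_frame_difference(prev, cur, nxt)
```

The cast to int16 before subtraction avoids uint8 wrap-around in the difference.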
According to one aspect of the invention, when extracting the road objects on which the moving objects depend in step (b), a sample data set for network training and testing is first constructed to generate the road extraction network;
remote sensing images from target satellites are selected, and the road objects and other objects in the images are labelled;
the data set is divided into a training set, a validation set and a test set in a certain proportion, the training set being used for iterative training of the network parameters and the validation set for verifying whether the trained model reaches the expected accuracy;
the current frame is then input as test data into the trained road extraction network to obtain the final road segmentation result.
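The patent only says the data set is divided "in a certain proportion"; a common choice such as 70/15/15 could be applied with a helper like the following (the split ratios and function are illustrative assumptions):

```python
import random

def split_dataset(samples, ratios=(0.7, 0.15, 0.15), seed=0):
    """Shuffle labelled samples and split them into train/val/test lists.
    `ratios` gives the train and validation fractions; the remainder
    becomes the test set."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    items = list(samples)
    rng.shuffle(items)
    n = len(items)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(100))
```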
According to an aspect of the present invention, the road extraction network is a U-shaped network based on an encoder-bridge-decoder structure;
the encoder part of the network is a ResNet-34, and the bridge part consists of five convolution blocks;
the decoder part performs the inverse operation of the encoder: it upsamples and is fused with the encoder features at the same level.
According to an aspect of the present invention, in step (c), the three types of results extracted in step (b) are each stored in binary form as 0 and 1, where 0 represents the background and 1 a foreground object;
the targets whose value is 1 in the binarised results are then extracted as the accurate extraction result of the moving objects.
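Reading step (c) as an overlay analysis, the three binary results can be fused with a per-pixel AND; this sketch assumes the fusion is a strict intersection (a pixel counts as a moving object only when all three masks agree), and the mask names are illustrative:

```python
import numpy as np

def fuse_masks(flow_mask, diff_mask, road_mask):
    """Fuse the three binary results (optical flow, three-frame difference,
    road segmentation): keep a pixel as a moving object only when it is 1
    in all three masks."""
    return (flow_mask.astype(bool)
            & diff_mask.astype(bool)
            & road_mask.astype(bool))

flow_mask = np.array([[1, 1, 0, 0]], np.uint8)   # moving per optical flow
diff_mask = np.array([[1, 0, 1, 0]], np.uint8)   # moving per frame difference
road_mask = np.array([[1, 1, 1, 0]], np.uint8)   # lies on a road
fused = fuse_masks(flow_mask, diff_mask, road_mask)
```

Only the first pixel, flagged by both motion cues and lying on a road, survives; this is how a tall building with pseudo-motion off the road network would be screened out.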
According to an aspect of the present invention, the morphological processing in step (d) applies a morphological opening with a 3 × 3 circular structuring element to the final extracted image to eliminate spots;
a morphological closing with a 3 × 3 circular structuring element is then applied to the opened image to eliminate holes;
the method further comprises performing connectivity analysis on the morphologically processed image and extracting the final moving objects according to the following rule:
a target is judged to be a vehicle when its size is between 4 and 2000 pixels, its aspect ratio is at most 8, the ratio of its area to the area of its minimum bounding rectangle is greater than 0.2, and its average pixel value is between 10 and 250.
According to the invention, on the basis of the spectral, textural and time-series characteristics of the video satellite, the spatial geometry, motion speed and road environment of the moving objects are extracted and used jointly for moving-object detection, so that false moving objects caused by parallax and those caused by registration errors and random noise are avoided, effectively improving the accuracy of moving-object detection and reducing the false detection rate.
According to one scheme of the invention, a deep-learning algorithm is used to extract the road objects on which the moving objects depend. Objects with pseudo-motion characteristics that were mistaken for moving objects in the preliminary extraction can thereby be screened out, the moving objects in satellite video can be extracted more accurately, and the shortcomings of the prior art are overcome.
According to one scheme of the invention, the moving objects in the image are extracted separately with an optical flow method and an improved three-frame difference method, and the results of the two methods are then overlaid with the road extraction result, so that the two methods correct each other and the accuracy of the final moving-object extraction is further improved.
According to one scheme of the invention, the acquired images are preprocessed before the moving objects and road objects are extracted. The preprocessing step mainly comprises inter-frame registration of the frames before and after the current frame against the current frame, which prevents image distortion caused by satellite jitter and similar factors from affecting the subsequent object extraction.
Drawings
FIG. 1 is a flow diagram schematically illustrating a moving object extraction method based on multi-source information from satellite hyper-temporal data, according to one embodiment of the present invention;
FIG. 2 is a schematic diagram showing the best-matching keypoint pairs between the current frame and the previous frame during image registration;
FIG. 3 is a schematic diagram illustrating the velocity of a moving object obtained by an optical flow method;
FIG. 4 is a schematic diagram of a moving object obtained by an improved three-frame difference method;
FIG. 5 is a schematic diagram illustrating a road object (road) extracted based on a deep learning method;
FIG. 6 schematically shows a current frame (left) and the same frame after final object extraction (right).
Detailed Description
To illustrate the embodiments of the present invention and the technical solutions of the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the invention, and a person skilled in the art can derive other drawings from them without inventive effort.
The present invention is described in detail below with reference to the drawings and specific embodiments; content already covered above is not repeated here, and the embodiments of the invention are not limited to the following examples.
Referring to fig. 1, in the moving object extraction method based on multi-source information from satellite hyper-temporal data, hyper-temporal images of a certain region are collected first; a hyper-temporal image (also called a multi-temporal image) is image data with a continuous time series, and can also be understood as video data. The acquired images are then preprocessed by a data preprocessing module (image reading may also be completed by this module), after which the moving objects in the images are preliminarily extracted by a target information extraction module. Considering that tall objects may be misjudged as moving objects, according to the concept of the present invention the road objects on which the moving objects depend are additionally extracted, based on the background-environment dependence of the moving objects. Taking a vehicle and a road as an example, the vehicle is the moving object and the road is the road object on which the vehicle depends. The preliminarily extracted moving objects may include both vehicles and high-rise buildings, yet only those that appear on a road are moving objects that should be extracted. The invention therefore additionally considers the environment dependence of the moving objects and refines the preliminary extraction, eliminating the prior-art defect that objects with pseudo-motion characteristics, such as high-rise buildings or steel towers, are misjudged as moving objects.
The method of the present invention is described in detail below, taking Jilin-1 Video-03 satellite footage of the airport area of Atlanta as the embodiment shown in FIG. 1.
When collecting images, the collected hyper-temporal images comprise the current frame together with one frame collected within a period before it and one after it, forming a three-frame sequence of previous frame, current frame and next frame. The data-reading interval is determined by the speed and length of the objects to be extracted and by the frame rate of the video, so that, as far as possible, moving objects do not overlap between adjacent frames. Specifically, for the three frames, the acquisition interval between adjacent frames should ensure that the displacement of a moving object is between 10 m and 100 m, to avoid overlap of the moving object between frames. The previous frame, the current frame and the next frame may be read in sequence: the first two frames are initialised, and the next image read becomes the frame after the current frame. In this embodiment, the frame rate of the Jilin-1 video satellite is 10 frames/s, the ground target speed is typically between 20 and 160 m/s, and the ground vehicle target size is typically between 3 and 15 m. Therefore, 0.5 s is set as the acquisition (or reading) interval, so that most vehicle targets can be detected. Only a very small number of very long trucks, and vehicles moving very slowly, may produce holes in the difference images, and these can be eliminated in post-processing. The three frames, whose interval is determined jointly by the speed of the moving objects and the frame rate of the hyper-temporal data, are thus not consecutive in time.
The idea of the image preprocessing step is to take the current frame as the reference and register (transform) the two frames extracted before and after it in turn, so that image distortion caused by factors such as satellite jitter does not affect the subsequent object extraction. Specifically, referring to fig. 2, the current frame and the previous frame are read into arrays, and keypoint detection and feature description are performed on both frames with an algorithm such as SIFT, SURF, ORB or AKAZE. In this embodiment the AKAZE algorithm is used, and 644 keypoints are identified and screened. Once the keypoints of both images are identified, feature matching is performed with a matcher: the descriptor distance between each pair of keypoints is computed and the k best matches of minimum distance are returned for each keypoint. As shown in fig. 2, a total of 515 best-matching keypoint pairs are returned in this embodiment. A homography matrix of the transformation between the two images is then computed from the matched pairs, and the frame before the current frame is warped accordingly. To ensure the best warping result, the RANSAC algorithm is used in this embodiment to remove outlier pairs during warping. After the warping is finished, the registered previous frame is obtained.
The current frame and the frame after it are then read into arrays, and the registered next frame is obtained by the same steps.
The above steps complete the inter-frame registration (i.e. preprocessing) of the images and ensure that the objects in the images are not distorted in a way that would affect the subsequent detection and recognition. Next, the moving objects are preliminarily extracted from the images using two methods: extraction based on the velocity attribute and extraction based on the time-series attribute of the moving objects. When extracting based on the velocity attribute, dense optical flow is computed between the current frame and the previous frame with an optical flow method to obtain the flow state of every pixel; a pixel whose speed and direction are unchanged is regarded as background, and otherwise as a foreground object. In this way the pixel-level speed of the targets is obtained, realising speed measurement of the targets. When extracting based on the time-series attribute, the foreground and background are preliminarily separated with an improved three-frame difference method, exploiting the motion characteristics of the targets along the time series.
Extracting moving objects with the optical flow method rests on the brightness constancy and local flow constancy assumptions: the brightness of a point does not change as it moves over time. This is the basic assumption of optical flow, which every optical flow variant must satisfy to derive the optical flow equation. In addition, the change over time must not cause a drastic change in position, so that the grey level can be differentiated with respect to position; the partial derivative of grey level with respect to position can then be approximated by the grey-level change caused by a unit position change between consecutive frames. In this embodiment the Gunnar Farneback algorithm is used to compute dense optical flow: all points in the image are matched point by point, the offset of every point is computed to obtain the flow field, and registration is then performed. The previous frame and the current frame requiring flow computation are input in sequence, the image scale is set to 0.5, and a pyramid is built for each image. The number of pyramid levels is set to 3, the averaging window size to 12, the number of iterations at each pyramid level to 3, and the number of neighbouring pixels used for the polynomial expansion at each pixel to 7. The Gaussian standard deviation used to smooth the derivatives serving as the basis for the polynomial expansion is set to 1.5. In addition, this embodiment uses the input flow as the initial flow approximation, although in other embodiments a Gaussian filter may be used.
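The embodiment's parameter values map naturally onto the arguments of OpenCV's `cv2.calcOpticalFlowFarneback`; the patent does not name a library, so this mapping is an assumption:

```python
# Parameter set from the embodiment, keyed by the argument names of
# cv2.calcOpticalFlowFarneback (library choice is an assumption).
farneback_params = dict(
    pyr_scale=0.5,   # image scale between pyramid levels
    levels=3,        # number of pyramid levels
    winsize=12,      # averaging window size
    iterations=3,    # iterations at each pyramid level
    poly_n=7,        # neighbourhood size for the polynomial expansion
    poly_sigma=1.5,  # Gaussian std used to smooth the derivatives
    flags=4,         # cv2.OPTFLOW_USE_INITIAL_FLOW: reuse the input flow
)
# Usage (requires OpenCV; init_flow is a previously computed flow field):
# flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, init_flow,
#                                     **farneback_params)
```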
After this setup, the computed optical flow is converted from Cartesian to polar coordinates, and the speed and direction of the motion at every pixel are obtained. As shown in fig. 3, the brighter a point in the flow map, the higher its speed, so a threshold can be set to determine the moving objects in the flow map. For example, if the speed values are 0, 1, 2, 3, ..., points with a value of 1 or more can be regarded as moving objects, i.e. the threshold can be set to 1. The moving objects, i.e. the targets with speed of at least 1, can thus be preliminarily extracted according to the velocity attribute. The direction of motion may additionally be characterised by colour. According to the computed speed and direction, a pixel whose speed and direction values are both 0 is marked as background, and otherwise as a foreground object.
When extracting moving objects with the three-frame difference method, the three frames acquired in the image collection step are read and converted from RGB to greyscale, and inter-frame differences are taken between the greyscale current frame and the greyscale frames before and after it, yielding two difference images. A threshold is then set and the two difference images are binarised into two binary maps. In this embodiment the threshold is set to 40 on the image grey scale (the higher the value, the brighter the pixel): during binarisation, a pixel above the threshold becomes 1 and otherwise 0. The foreground (bright-spot regions) and background (black regions) can thus be distinguished, giving the moving objects extracted between the current frame and each neighbouring frame. The two binary maps are then combined with an AND operation, i.e. an intersection (also called overlay analysis): a point that is 1 in both maps is a moving object (bright point), so the moving objects in the image can be preliminarily extracted from the intersected binary map, as shown in fig. 4.
Through the above steps, the moving objects in the image are preliminarily extracted with the optical flow method and the three-frame difference method, according to their velocity and time-series attributes respectively. According to the concept of the invention, the road objects on which the moving objects depend must also be extracted. For this, the invention uses a deep-learning-based target (road) recognition method to extract the environment element on which the targets depend (i.e. the road objects); extracting the environment of ground moving targets here mainly means extracting road objects from high-resolution satellite remote sensing images. The D-LinkNet network is adopted for road extraction, and a road extraction network is first constructed. In this embodiment the vehicle is the moving object and the road is the road object. The road extraction network is a U-shaped network based on an encoder-bridge-decoder structure: the encoder is a ResNet-34, the bridge consists of five convolution blocks, and the decoder performs the inverse operation of the encoder, upsampling and fusing with the encoder features at the same level, thereby achieving feature fusion across different spatial scales. To build the network, a sample data set for training and testing is used to generate the road extraction network. Specifically, representative remote sensing images of the target satellites must be selected, for example by grey-level attributes such as the typical grey values of road objects at different times in the historical data.
A data set constructed from such images ensures that the trained model can recognise various road objects at all times; other attributes of the target object, such as spatial or spectral characteristics, can of course also be used to reflect its properties and aid model recognition. After representative images are selected, the road objects and other objects in them are labelled, and the data set is divided into a training set, a validation set and a test set in a certain proportion; the training set is used for iterative training of the network parameters and the validation set for verifying whether the trained model reaches the expected accuracy. The training set is input into the road extraction network and the parameters are trained iteratively until the network achieves the desired loss and accuracy on the training and validation sets. Finally, the video satellite image frame (i.e. the current frame) is input into the trained network as test data, and the final road segmentation result, i.e. the road map corresponding to the current frame, is output, as shown in fig. 5.
At this point, the moving targets extracted in the two ways and the road target obtained by the deep learning algorithm are all available. These three extraction results are then subjected to superposition analysis; that is, the final extraction of the moving target is completed by using them jointly. Specifically, the three results (the moving targets extracted by the optical flow method and by the three-frame difference method, and the road map extracted by deep learning) are each stored as a binary image of 0 and 1, with the target stored as 1 (a bright spot): 0 represents a background object and 1 represents foreground. Targets whose value is 1 in all binarized results are then extracted from the current frame image, yielding the accurate extraction result of the moving target fused from multi-source information attributes.
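The superposition analysis described above can be sketched as a per-pixel AND of the three binary masks. This is one plausible reading of the fusion rule (only pixels that all three sources mark as 1 survive); the mask names are assumed for illustration.

```python
import numpy as np

def fuse_masks(flow_mask, diff_mask, road_mask):
    """Fuse three binary masks (1 = foreground/road, 0 = background).

    A pixel is kept as a moving target only when all three sources
    agree: flagged by optical flow, flagged by three-frame
    differencing, and lying on the extracted road.
    """
    return (flow_mask & diff_mask & road_mask).astype(np.uint8)

flow_mask = np.array([[1, 1, 0],
                      [0, 1, 1],
                      [0, 0, 1]], dtype=np.uint8)
diff_mask = np.array([[1, 0, 0],
                      [0, 1, 1],
                      [0, 1, 1]], dtype=np.uint8)
road_mask = np.array([[1, 1, 1],
                      [1, 1, 0],
                      [0, 0, 1]], dtype=np.uint8)

fused = fuse_masks(flow_mask, diff_mask, road_mask)
# only pixels where all three masks are 1 survive
```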
In essence, therefore, the present invention extracts the moving target from the image with the optical flow method and the three-frame difference method respectively, that is, it integrates the target velocity attribute obtained by the optical flow method with the foreground target obtained by the three-frame difference method. Each method has its own strengths and weaknesses; combining them lets the two correct each other, and together with the environment attribute on which the moving target depends, the most accurate extraction result can be obtained. Although the method has at this point essentially completed the extraction of the moving target, the result is still preliminary and must be finalized by a post-processing step.
Specifically, as shown in fig. 6, the data post-processing module performs morphological processing on the target extraction image, eliminating the influence of small spots and small holes (i.e. random noise and image registration errors) on the integrity of the moving target (vehicle) and producing the final result. A 3x3 circular structuring element is first used to apply a morphological opening to the image, removing small isolated points (spots). A 3x3 circular structuring element is then used to apply a morphological closing to the opened image, removing the influence of small holes on the integrity of the vehicle target. Connectivity analysis is then performed on the result, and the desired moving targets are extracted according to a set of rules, which may be formulated from the target's size, its aspect ratio, the ratio of the target area to the minimum bounding rectangle area, and so on. For example, a component is accepted as a vehicle target only when its size is >4 pixels and at the same time <2000 pixels, its aspect ratio is <=8, the ratio of its area to that of its minimum bounding rectangle is >0.2, and its average pixel value is >10 and <250. After the morphological analysis and the connectivity analysis, the result image is stored, giving the current frame labeled with the final moving target extraction result.
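The connectivity-analysis rule quoted above can be written directly as a predicate over the measured properties of a connected component. The function and parameter names are assumptions for illustration; the thresholds are those stated in the text.

```python
def is_vehicle_candidate(n_pixels, aspect_ratio, rect_fill_ratio, mean_value):
    """Vehicle-acceptance rule from the connectivity analysis.

    n_pixels        : size of the connected component, in pixels
    aspect_ratio    : length/width of the component's bounding box
    rect_fill_ratio : component area / minimum bounding-rectangle area
    mean_value      : mean pixel value of the component
    """
    return (4 < n_pixels < 2000          # large enough to be a car, small enough not to be a building
            and aspect_ratio <= 8        # rejects elongated structures such as lane markings
            and rect_fill_ratio > 0.2    # rejects sparse, scattered noise components
            and 10 < mean_value < 250)   # rejects near-black and saturated blobs

ok = is_vehicle_candidate(50, 2.5, 0.6, 120)        # plausible vehicle
too_small = is_vehicle_candidate(3, 2.5, 0.6, 120)  # isolated noise point
too_long = is_vehicle_candidate(50, 12.0, 0.6, 120) # likely a road marking
```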
In summary, on the basis of the traditional inter-frame difference and optical flow methods, this moving target detection technique for hyper-temporal data integrates inter-frame registration, optical flow computation, three-frame differencing, deep-learning-based extraction of the target's dependent environment, and post-processing. It thus exploits the velocity, time-series, spectral and spatial-position attributes of the target, perceiving the target's multi-dimensional attributes from different angles, to form a complete moving target detection method for satellite hyper-temporal data. Compared with traditional moving target detection methods such as inter-frame differencing, the optical flow method and background modeling, this method is more systematic, achieves higher extraction accuracy, enables more precise detection of moving targets, and is more practical for applications on satellite hyper-temporal data. It therefore solves the low detection accuracy and high false-detection rate of current moving target detection algorithms when applied to satellite video data, and realizes more accurate detection of moving targets.
In conclusion, the method can support research on the detection and tracking of moving targets (ground moving vehicle targets) in space-based satellite hyper-temporal data. It shows very good universality and high extraction accuracy in moving target extraction from such data, is of significance to research on vehicle-flow analysis from video satellites and on the positioning and tracking of ground moving targets, and can be widely applied in fields such as intelligent transportation, smart cities and emergency rescue.
The above description is only one embodiment of the present invention and is not intended to limit it; various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its protection scope.

Claims (10)

1. A moving object extraction method based on satellite hyper-temporal data multi-source information comprises the following steps:
a. collecting hyper-temporal images of an area, and preprocessing the images;
b. preliminarily extracting a moving target in the image, and extracting a road target on which the moving target depends in the image;
c. completing further extraction of the moving target in the image according to the preliminarily extracted moving target and the road target;
d. performing morphological processing on the target result image extracted in step (c) to obtain the final result.
2. The method of claim 1, wherein in step (a), the current frame image is collected together with one frame within a period before it and one frame within a period after it, forming a three-frame sequence of the previous frame, the current frame and the next frame;
the time interval of the three-frame acquisition is determined by the speed and length of the moving target and the frame rate of the video, such that the displacement of the moving target between the extracted adjacent frames is between 10 m and 100 m.
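One way to realize the interval selection of claim 2 is to pick the smallest integer frame gap whose implied displacement reaches the lower bound. The 10-100 m window comes from the claim; the selection formula and function name are illustrative assumptions.

```python
import math

def frame_gap(speed_mps, frame_rate_hz, min_disp_m=10.0, max_disp_m=100.0):
    """Choose a frame gap so that a target moving at speed_mps covers
    between min_disp_m and max_disp_m between the extracted frames."""
    per_frame = speed_mps / frame_rate_hz              # metres moved per video frame
    gap = max(1, math.ceil(min_disp_m / per_frame))    # smallest gap reaching min_disp_m
    if gap * per_frame > max_disp_m:
        raise ValueError("speed/frame-rate combination cannot satisfy the window")
    return gap

gap = frame_gap(20.0, 25.0)  # a 20 m/s vehicle at 25 fps moves 0.8 m per frame
```

With these numbers a gap of 13 frames gives about 10.4 m of motion, just inside the claimed window.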
3. The method according to claim 2, wherein the preprocessing in step (a) is to perform inter-frame registration on a frame of image extracted before and after the current frame by using the current frame as a reference;
the inter-frame registration comprises reading the current frame image and the frame before or after it into arrays, and performing keypoint detection and feature description on the two frames with the SIFT (scale-invariant feature transform), SURF (speeded-up robust features), ORB (oriented FAST and rotated BRIEF) or AKAZE algorithm;
performing feature matching on the key points on the two frames of images by using a matcher, wherein the matching method comprises the steps of calculating the distance of descriptors between each pair of key points and returning the minimum distance of k optimal matches with each key point;
and calculating the homography matrix of the transformation between the two images from the matched point pairs, warping the frame before or after the current frame accordingly, and removing abnormal point pairs with the RANSAC algorithm during the warping.
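Once the homography matrix has been estimated from the matched keypoint pairs (in practice, e.g., with OpenCV's findHomography using RANSAC), applying it to pixel coordinates is a small linear-algebra step. The sketch below shows only that application, with an assumed pure-translation homography as the example.

```python
import numpy as np

def warp_points(H, pts):
    """Apply a 3x3 homography H to an (N, 2) array of pixel coordinates."""
    pts = np.asarray(pts, dtype=float)
    homo = np.hstack([pts, np.ones((len(pts), 1))])   # lift to homogeneous coords
    mapped = homo @ H.T                               # apply the projective transform
    return mapped[:, :2] / mapped[:, 2:3]             # divide out the scale factor

# Assumed example: a pure translation shifting every pixel by (+5, -3)
H = np.array([[1.0, 0.0,  5.0],
              [0.0, 1.0, -3.0],
              [0.0, 0.0,  1.0]])
shifted = warp_points(H, [[0, 0], [10, 20]])
```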
4. The method according to claim 2, wherein in step (b), the moving target in the image is preliminarily extracted based on its velocity attribute and its time-series attribute, respectively;
when the moving target is extracted based on its velocity attribute, dense optical flow is computed between the current frame and the preceding frame with the optical flow method to obtain the optical flow state of each pixel; a pixel whose speed and direction values are both zero is taken as background, and otherwise as a foreground target;
when the moving target is extracted based on its time-series attribute, the foreground and background targets are preliminarily separated with the three-frame difference method by exploiting the motion characteristics of the moving target in the time series;
and the road target on which the moving target depends is extracted from the image with the deep-learning-based D_LinkNet network.
5. The method according to claim 4, wherein when extracting the moving object based on the velocity attribute of the moving object, a previous frame image and a current frame image which need optical flow calculation are sequentially input, and an image proportion is specified to construct a pyramid for each image;
determining the number of layers of a pyramid, the size of an average window, the iteration times of an algorithm in each layer of an image pyramid, the number of adjacent pixel points expanded by a calculation polynomial at each pixel point, a Gaussian standard deviation for smoothing a derivative and initial flow approximation;
converting the calculated optical flow from a Cartesian coordinate system to a polar coordinate system, and acquiring the speed and direction of each pixel point;
and classifying the pixels according to the computed optical flow: a pixel whose speed and direction values are both 0 is taken as a background target, and otherwise as a foreground target.
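The Cartesian-to-polar conversion and background test of claim 5 can be sketched with NumPy as follows (OpenCV's cartToPolar performs the same conversion on dense flow fields); the epsilon tolerance is an assumption, added to absorb floating-point noise around zero.

```python
import numpy as np

def classify_flow(u, v, eps=1e-6):
    """Convert per-pixel flow (u, v) to polar form and flag the foreground.

    magnitude : per-pixel speed
    angle     : per-pixel direction, in radians
    A pixel whose speed (and hence direction) is effectively zero is
    treated as background (0); anything else is foreground (1).
    """
    magnitude = np.hypot(u, v)
    angle = np.arctan2(v, u)
    foreground = (magnitude > eps).astype(np.uint8)
    return magnitude, angle, foreground

u = np.array([[0.0, 2.0], [0.0, 0.0]])
v = np.array([[0.0, 0.0], [1.0, 0.0]])
mag, ang, fg = classify_flow(u, v)
```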
6. The method according to claim 4, wherein when the moving object is extracted based on the time series property of the moving object in the step (b), all the three collected frames of images are read, and the images are converted from RGB (red, green and blue) images into gray-scale images;
performing inter-frame difference on the gray-scale image of the current frame and the gray-scale images of the frames before and after the current frame to obtain two difference value images;
setting a threshold value to carry out binarization on the two difference images respectively to obtain two binary images which distinguish foreground objects from background objects;
and (4) performing AND operation on the two binary images, and extracting a moving object in the intersected image.
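The three-frame difference of claim 6 reduces to two thresholded absolute differences combined with a logical AND. A minimal NumPy sketch, with an assumed threshold value:

```python
import numpy as np

def three_frame_difference(prev_gray, curr_gray, next_gray, thresh=15):
    """Three-frame difference on grayscale images (uint8 arrays).

    The two absolute difference images are thresholded into binary
    masks and then ANDed: a pixel is foreground only if it differs
    both from the previous frame and from the next frame."""
    d1 = np.abs(curr_gray.astype(int) - prev_gray.astype(int))
    d2 = np.abs(next_gray.astype(int) - curr_gray.astype(int))
    b1 = (d1 > thresh).astype(np.uint8)
    b2 = (d2 > thresh).astype(np.uint8)
    return b1 & b2

prev_gray = np.array([[10, 10], [10, 10]], dtype=np.uint8)
curr_gray = np.array([[10, 200], [10, 10]], dtype=np.uint8)  # object present here
next_gray = np.array([[10, 10], [10, 10]], dtype=np.uint8)   # object has moved on
mask = three_frame_difference(prev_gray, curr_gray, next_gray)
```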
7. The method according to claim 5 or 6, wherein when extracting the road target on which the moving object depends in step (b), firstly constructing a sample data set of network training and testing for generating a road extraction network;
selecting remote sensing images of some target satellites, and labeling road targets and other targets in the images;
dividing a data set into a training set, a verification set and a test set according to a certain proportion, wherein the training set is used for carrying out iterative training on network parameters, and the verification set is used for verifying whether a trained model can reach the expected precision;
and (4) inputting the current frame image as a test data set into the trained road extraction network to obtain a final road segmentation result.
8. The method as claimed in claim 7, wherein the road extraction network is a U-type network based on an encoder-bridge-decoder structure;
the encoder part of the network is a ResNet34, and the bridge part consists of five convolution blocks;
the decoder part is the inverse operation of the encoder part, and adopts up-sampling and is overlapped with the encoder part at the same level.
9. The method according to claim 8, wherein in the step (c), the three types of results extracted in the step (b) are respectively subjected to binary storage of 0 and 1, wherein 0 represents a background object and 1 represents a foreground;
and extracting the targets with the numerical values of 1 after binarization storage to serve as an accurate extraction result of the moving targets.
10. The method according to claim 1, wherein the morphological processing in step (d) comprises performing a morphological opening operation on the final target extraction image with a 3x3 circular template structure to eliminate spots in the image;
and performing a morphological closing operation on the opened image with a 3x3 circular template structure to eliminate holes in the image;
the method also comprises the steps of carrying out connectivity analysis on the images subjected to morphological processing, and extracting a final moving target according to the following rules:
the vehicle target is determined when the target size is between 4 pixels and 2000 pixels, the target aspect ratio is below 8, the ratio of the target area to the minimum circumscribed rectangle area is greater than 0.2, and the average pixel value of the target is between 10 and 250.
CN202110172481.4A 2021-02-08 2021-02-08 Moving object extraction method based on satellite time-exceeding phase data multi-source information Pending CN112907626A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110172481.4A CN112907626A (en) 2021-02-08 2021-02-08 Moving object extraction method based on satellite time-exceeding phase data multi-source information


Publications (1)

Publication Number Publication Date
CN112907626A true CN112907626A (en) 2021-06-04

Family

ID=76122744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110172481.4A Pending CN112907626A (en) 2021-02-08 2021-02-08 Moving object extraction method based on satellite time-exceeding phase data multi-source information

Country Status (1)

Country Link
CN (1) CN112907626A (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005235104A (en) * 2004-02-23 2005-09-02 Jr Higashi Nippon Consultants Kk Mobile object detecting system, mobile object detecting device, mobile object detecting method, and mobile object detecting program
CN1885346A (en) * 2006-06-01 2006-12-27 电子科技大学 Detection method for moving target in infrared image sequence under complex background
US20110142283A1 (en) * 2009-12-10 2011-06-16 Chung-Hsien Huang Apparatus and method for moving object detection
CN102184550A (en) * 2011-05-04 2011-09-14 华中科技大学 Mobile platform ground movement object detection method
CN106683119A (en) * 2017-01-09 2017-05-17 河北工业大学 Moving vehicle detecting method based on aerially photographed video images
JP2018036848A (en) * 2016-08-31 2018-03-08 株式会社デンソーアイティーラボラトリ Object state estimation system, object state estimation device, object state estimation method and object state estimation program
CN109102523A (en) * 2018-07-13 2018-12-28 南京理工大学 A kind of moving object detection and tracking


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BO DU et al., "Object Tracking in Satellite Videos Based on a Multiframe Optical Flow Tracker", IEEE, 31 December 2019 *
LIU Xin et al., "Target detection and tracking combining four-frame difference and the optical flow method", Opto-Electronic Engineering, 31 December 2018 *
LI Qianqian, "Research on moving vehicle recognition technology in video surveillance", Information Science and Technology series, pages 3-5 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114866500A (en) * 2022-04-01 2022-08-05 中国卫星海上测控部 Real-time optimization method for multi-source telemetering position control instruction
CN114866500B (en) * 2022-04-01 2024-04-23 中国卫星海上测控部 Real-time optimization method for multisource remote control commands
CN117236566A (en) * 2023-11-10 2023-12-15 山东顺发重工有限公司 Whole-process visual flange plate package management system
CN117236566B (en) * 2023-11-10 2024-02-06 山东顺发重工有限公司 Whole-process visual flange plate package management system
CN117648889A (en) * 2024-01-30 2024-03-05 中国石油集团川庆钻探工程有限公司 Method for measuring velocity of blowout fluid based on interframe difference method
CN117648889B (en) * 2024-01-30 2024-04-26 中国石油集团川庆钻探工程有限公司 Method for measuring velocity of blowout fluid based on interframe difference method
CN118155005A (en) * 2024-05-13 2024-06-07 成都坤舆空间科技有限公司 Ecological restoration map spot matching classification method based on RAFT-Stereo algorithm
CN118155005B (en) * 2024-05-13 2024-08-02 成都坤舆空间科技有限公司 Ecological restoration map spot matching classification method based on RAFT-Stereo algorithm


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination