CN112307943A - Water area man-boat target detection method, system, terminal and medium - Google Patents

Publication number: CN112307943A
Authority: CN (China)
Prior art keywords: target; detection; moving target; moving; detected
Legal status: Granted, Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202011178995.2A
Other languages: Chinese (zh)
Other versions: CN112307943B (en)
Inventor: 张重阳
Current Assignee: Ningbo Haitang Information Technology Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Ningbo Haitang Information Technology Co Ltd
Application filed by Ningbo Haitang Information Technology Co Ltd
Priority to CN202011178995.2A
Publication of CN112307943A
Application granted
Publication of CN112307943B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, a system, a terminal and a medium for detecting man-boat targets in a water area. The method comprises the following steps: performing semantic segmentation on the water area in a video image to be detected and filtering out irrelevant background regions; detecting moving targets in the filtered video image by a background modeling method; applying modeling constraints to the detected moving targets, using a temporal rule on the target motion trajectory and a spatial rule on the target size, to obtain a judgment result; performing target detection on the moving targets with a target detector trained by a deep learning method; and fusing the judgment result with the detection result of the target detector to obtain a final detection result. Addressing the difficulties of existing water-area man-boat detection, such as large areas, small targets and low illumination, the invention enhances the robustness of man-boat target detection, effectively reduces the probability of false detections and missed detections, and improves recognition accuracy.

Description

Water area man-boat target detection method, system, terminal and medium
Technical Field
The invention relates to the field of target detection in images, and in particular to a method and a system for detecting man-boat targets in a water area, and a corresponding terminal and medium.
Background
The arrival of the big data era has driven the continuous development of computer technology, and target detection, a research hotspot in computer vision, has shown important application value in fields such as intelligent video surveillance and intelligent transportation. Built on video security monitoring systems, the technology also offers a solution for identifying events such as illegal fishing and night-time fish theft.
As a practical application branch of target detection, the detection and recognition of man-boat targets on the water surface has its own difficulties and characteristics. The difficulties are that the area is large, the targets are small and the illumination is low, so the outlines of people and boats are blurred, their visual features are not distinctive, and false detections or missed detections occur easily. In addition, few target data samples are available, so deep models are insufficiently trained and lack accuracy. A characteristic of the task is that events occur on water, which has the advantage of a relatively clean background but also produces reflections that can degrade detection performance. The Faster R-CNN target detection network (Advances in Neural Information Processing Systems, 2015) adopts a two-stage mechanism that first generates candidate boxes and then classifies them. Because the box coordinates are regressed twice, the model is highly accurate; it also introduced a deep-network method based on prior (anchor) boxes for generating candidate boxes, which greatly improved speed. The YOLO detection algorithm (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016) converts target detection into a regression problem: it takes the whole image as input and directly predicts target classes and bounding boxes on each cell of a grid, which markedly increases detection speed and reaches real-time performance. Its drawback is that replacing the candidate-region mechanism with grid-based window extraction greatly reduces the number of windows, so detection accuracy is not high.
Although these two conventional detection models generalize well and can be applied to detection tasks in many scenes, they perform unsatisfactorily on small-scale targets and have difficulty detecting small, blurred targets. One reason lies in the targets themselves: small scale means little information, so small targets are easily lost during feature extraction, are easily confused with the background, and are readily misdetected because of background interference. The other reason lies in the models: both algorithms pay little attention to low-level feature information, whereas small targets depend more heavily on fine-grained low-level features. The role of low-level detail features in detection therefore needs to be strengthened to improve the models' ability to detect small, blurred targets.
A search of the prior art finds Chinese invention patent application No. 201911118442.5, which discloses a ship detection method based on image recognition comprising the following steps: A. extracting high-frequency and low-frequency Gabor features and H, S, V color features of the image, and building multi-resolution pyramid representations of the several feature types; B. applying a center-surround operator to the multi-scale component images of each feature pyramid to simulate the receptive-field characteristics of the human eye, forming multi-scale feature maps; C. combining the multi-scale feature maps of each feature type across scales and normalizing them to form component saliency maps for color, brightness and orientation; D. linearly fusing the 3 saliency maps of the different features into 1 overall saliency map that represents the saliency of each region in the image; E. finally, obtaining the region of the ship target in the task water-area image by histogram threshold segmentation. That application uses digital image processing to detect the position of a ship in real time in a task water-area image, but it cannot solve the technical problems described above.
Disclosure of Invention
In view of the above deficiencies of the prior art, the present invention provides a method and a system for detecting man-boat targets in a water area.
In a first aspect of the present invention, a method for detecting man-boat targets in a water area is provided, which includes:
performing semantic segmentation on the water area in a video image to be detected, and filtering out irrelevant background regions;
detecting moving targets in the video image from which the irrelevant background regions have been filtered out, using a background modeling method, wherein a moving target comprises a person and/or a boat;
applying modeling constraints to the detected moving targets, using a temporal rule on the target motion trajectory and a spatial rule on the target size, to obtain a judgment result;
performing target detection on the moving targets with a target detector, wherein the target detector is trained by a deep learning method that combines multi-level features of a deep network so as to retain the semantic features of small targets to the greatest extent;
and fusing the judgment result with the detection result of the target detector, and filtering out detected moving targets that do not meet the requirements, to obtain a final detection result.
Optionally, performing semantic segmentation on the water area in the video image to be detected and filtering out irrelevant background regions includes:
performing semantic segmentation on the image of the flat water area, based on a deep learning network architecture, to separate out the irrelevant background regions;
and filtering out the irrelevant background regions once the water area in the image has been segmented.
Optionally, detecting moving targets in the filtered video image using the background modeling method includes:
using a background representation method based on pixel sample statistics to build a color distribution model for each pixel from the probability statistics of that pixel in the time domain, thereby achieving background modeling;
and distinguishing foreground from background in the video image on the basis of the background model, thereby detecting the moving targets.
Optionally, applying modeling constraints to the detected moving targets, using the temporal rule on the target motion trajectory and the spatial rule on the target size, includes:
calculating the offset distance of a moving target between adjacent sampling frames from its coordinates in those frames; screening out mismatches according to the rule that the same target has the shortest offset distance between adjacent sampling frames; and, according to the principle of minimum rotation angle between adjacent frames, requiring that the rotation angle of the moving target changes only slightly between adjacent sampling frames, that is, the rotation angle difference stays within a set range; moving targets that satisfy these constraints are taken as the result of the temporal rule judgment;
modeling the aspect ratio and the depth-size relationship of each single detected moving target, based on the spatial distribution of target aspect ratios and on the size distribution that can be estimated as depth changes, so as to constrain the moving targets at the spatial level and obtain the result of the spatial rule judgment.
Optionally, performing target detection on the moving targets with the target detector includes:
using a feature pyramid network to fuse multi-level features of the deep network and retain the semantic features of small targets to the greatest extent;
setting smaller and denser candidate box templates and using depth information to set reasonable candidate box sizes, thereby increasing the probability that a candidate box matches a target to be detected;
designing a man-boat target detection network based on spatial context, which uses the spatial context information of man-boat targets, that is, the surrounding scene, to improve small-target detection performance.
Optionally, fusing the judgment result with the detection result of the target detector and filtering out detected moving targets that do not meet the requirements includes:
first obtaining the result scores of the three methods, namely the temporal rule on the target motion trajectory, the spatial rule on the target size, and the man-boat target detector; then performing weighted fusion; filtering out targets whose total score is below an empirical threshold or whose score from any one of the three methods is below that method's threshold; and taking the remaining targets as the final detection result.
Optionally, before the semantic segmentation of the water area in the video image to be detected, the method further includes preprocessing the video image to reduce uneven illumination and/or to give the input image a suitable size and higher image quality.
In a second aspect of the present invention, a system for detecting man-boat targets in a water area is provided, comprising:
a semantic segmentation module, which performs semantic segmentation on the water area in a video image to be detected and filters out irrelevant background regions;
a moving target detection module, which uses a background modeling method to detect moving targets in the video image from which the semantic segmentation module has filtered out the irrelevant background regions, wherein a moving target comprises a person and/or a boat;
a spatio-temporal rule identification module, which applies modeling constraints to the moving targets detected by the moving target detection module, using a temporal rule on the target motion trajectory and a spatial rule on the target size, to obtain an identification result;
a target detection module, which performs target detection on the moving targets obtained by the moving target detection module with a target detector, wherein the target detector is trained by a deep learning method that combines multi-level features of a deep network so as to retain the semantic features of small targets to the greatest extent;
and a fusion judgment module, which fuses the result of the spatio-temporal rule identification module with the detection result of the target detector, filters out detected moving targets that do not meet the requirements, and takes the remaining targets as the final detection result.
In a third aspect of the present invention, a terminal is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, performs any one of the above methods for detecting man-boat targets in a water area.
In a fourth aspect of the present invention, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, performs any one of the above methods for detecting man-boat targets in a water area.
Compared with the prior art, the invention has the following beneficial effects:
The method and system for detecting man-boat targets in a water area first exploit the flatness of the water area to perform semantic segmentation and filter out irrelevant background regions that are prone to false detections, which effectively reduces false detections. Moving target detection is implemented by background modeling, which helps detect small-scale and blurred man-boat targets. On this basis, temporal-domain rules, spatial-domain rules and deep-model detection are combined so that the respective advantages of the three methods are integrated into a fused judgment of targets, which further reduces the false detection rate and ensures the reliability of the final alarm output.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of a method for detecting a man-boat target in accordance with an embodiment of the present invention;
FIG. 2 is a diagram illustrating the effect of water area segmentation according to an embodiment of the present invention;
FIG. 3 is a flow chart of moving object detection in a preferred embodiment of the present invention;
FIG. 4 is a flow chart of spatiotemporal rule identification in a preferred embodiment of the present invention;
FIG. 5 is a flow chart of a man-boat target detection method in a preferred embodiment of the present invention;
FIG. 6 is a block diagram of a man-boat target detection system according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will help those skilled in the art to further understand the invention, but are not intended to limit the invention in any way. It should be noted that persons skilled in the art can make variations and modifications without departing from the spirit of the invention. All such variations and modifications fall within the scope of the present invention.
FIG. 1 is a flow chart of man-boat target detection in an embodiment of the invention. Referring to FIG. 1, the method for detecting man-boat targets in a water area in this embodiment may be performed according to the following steps:
S100, performing semantic segmentation on the water area in a video image to be detected, and filtering out irrelevant background regions.
Irrelevant background regions typically include clouds, sky, banks, onshore buildings, trees, and the like.
S200, detecting moving targets in the video image from which the irrelevant background regions have been filtered out, using a background modeling method. The moving target may be a fishing boat alone, or a fishing boat together with a person.
S300, applying modeling constraints to the detected moving targets, using a temporal rule on the target motion trajectory and a spatial rule on the target size, to obtain a judgment result.
S400, performing target detection on the moving targets with a target detector trained by a deep learning method. The target detector may be a deep network; the deep learning combines multi-level features of the deep network so that the semantic features of small targets are retained to the greatest extent.
S500, fusing the judgment result with the detection result of the target detector, filtering out detected moving targets that do not meet the set requirements, and taking the remaining targets as the final detection result.
It should be understood that S300 and S400 in the above embodiment need not be carried out in any particular order; they may be executed in either order or in parallel, and their results serve as the input of S500.
The difficulty of the water-area man-boat target detection and recognition task is that targets are small and illumination is low, which blurs the outlines of people and boats and makes their visual features indistinct. In addition, few target data samples are available, so deep models are insufficiently trained and lack accuracy. A characteristic of the task is that events occur on water, which has the advantage of a relatively clean background but also produces reflections that can degrade detection performance. To address these problems, the embodiment of the invention first exploits the flatness of the water area to perform semantic segmentation and filter out irrelevant background regions that are prone to false detections, which effectively reduces false detections. Moving target detection is implemented by background modeling, which helps detect small-scale and blurred man-boat targets. On this basis, the temporal-domain rule method, the spatial-domain rule method and the deep-model detection method are combined so that their respective advantages are integrated into a fused judgment of targets, which further reduces the false detection rate.
In the above embodiment, the semantic segmentation of the water area in the video image to be detected can be performed on the image of the flat water area with a deep learning network architecture: regions such as the water area and the shore are segmented, and the irrelevant background regions are filtered out once the water area in the image has been segmented. In the preferred embodiment, deep learning is used to segment the water area in the image accurately and quickly. For example, a Mask R-CNN model (Proceedings of the IEEE International Conference on Computer Vision, 2017) and a segmentation network architecture with dense depthwise separable convolutions may be used: an aerial image, for example one captured by an unmanned aerial vehicle, is taken as input, high-dimensional image features are extracted by dense separable convolutions and dilated convolutions, and an upsampling decoder module based on bilinear interpolation is built to output the segmentation result. Traditional water-area segmentation algorithms are severely affected by speckle noise and changes in image energy, and their parameter tuning requires too much manual effort, so accurate water-area segmentation is difficult to achieve in complex environments with many kinds of interference. FIG. 2 shows the effect of semantic segmentation of a water area according to a preferred embodiment of the present invention. Compared with earlier water-area segmentation processing, this embodiment greatly improves segmentation accuracy, has clear advantages in robustness and segmentation speed, and has better practical engineering value.
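As an illustration of the filtering step only (not of the segmentation network itself), the following minimal sketch assumes that a binary water mask has already been produced by some segmentation model. The function name `filter_background`, the list-of-lists image layout and the `fill` value are hypothetical choices made for this example.

```python
def filter_background(frame, water_mask, fill=0):
    """Keep pixels inside the water mask and overwrite all others with `fill`.

    frame      -- 2-D list of grayscale pixel values (hypothetical layout)
    water_mask -- 2-D list of the same shape; truthy marks a water pixel
    """
    if len(frame) != len(water_mask):
        raise ValueError("frame and mask must have the same height")
    filtered = []
    for row, mask_row in zip(frame, water_mask):
        if len(row) != len(mask_row):
            raise ValueError("frame and mask must have the same width")
        filtered.append([px if keep else fill for px, keep in zip(row, mask_row)])
    return filtered


# Toy 2x3 frame: the top row is sky, the bottom-right pixel is shore.
frame = [[10, 20, 30],
         [40, 50, 60]]
water_mask = [[0, 0, 0],
              [1, 1, 0]]
filtered = filter_background(frame, water_mask)
```

Only the pixels marked as water survive, so later stages never see regions such as sky or shore.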
FIG. 3 is a flow chart of moving target detection in a preferred embodiment of the present invention. Referring to FIG. 3, in the preferred embodiment, moving target detection based on background modeling can detect moving people and fishing boats through the following steps:
S201, using a background representation method based on pixel sample statistics, a color distribution model is built for each pixel from the probability statistics (mean/variance) of that pixel in the video over time, thereby achieving background modeling.
S202, foreground and background in the image are distinguished on the basis of the background model, and the moving man-boat targets are thereby detected.
Specifically, to distinguish foreground from background, each pixel in the image can be characterized by several Gaussian models. When a new frame arrives, the Gaussian models are updated and each pixel of the current image is matched against them: a pixel that matches the background model is judged to be a background point, and a pixel that does not match is judged to be a foreground point. In addition, considering interference in the water area such as ripples and changes of light and shadow, the foreground image can be further processed by morphological erosion and dilation to remove noise such as shadows and disturbances, which improves the quality of the foreground image and the accuracy of moving target detection.
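The background modeling and morphological clean-up can be sketched as follows. This is a deliberately simplified illustration: it keeps a single Gaussian (running mean and variance) per pixel rather than several, and it assumes the common convention that a pixel matching its background model is background. The class name and the parameters `alpha`, `k`, `init_var` and `min_var` are assumptions, not values from the patent.

```python
class PixelGaussianBackground:
    """Per-pixel running Gaussian background model (simplified, one mode)."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, init_var=25.0, min_var=4.0):
        self.alpha, self.k, self.min_var = alpha, k, min_var
        self.mean = [[float(p) for p in row] for row in first_frame]
        self.var = [[init_var for _ in row] for row in first_frame]

    def apply(self, frame):
        """Return a foreground mask (1 = foreground) and update the model."""
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, p in enumerate(row):
                diff = p - self.mean[y][x]
                matched = diff * diff <= (self.k ** 2) * self.var[y][x]
                mask_row.append(0 if matched else 1)
                if matched:  # only background pixels refine the model
                    self.mean[y][x] += self.alpha * diff
                    new_var = self.var[y][x] + self.alpha * (diff * diff - self.var[y][x])
                    self.var[y][x] = max(self.min_var, new_var)
            mask.append(mask_row)
        return mask


def _morph(mask, combine):
    """Apply `combine` (min or max) over each pixel's clipped 3x3 window."""
    h, w = len(mask), len(mask[0])
    return [[combine(mask[yy][xx]
                     for yy in range(max(0, y - 1), min(h, y + 2))
                     for xx in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]


def erode(mask):
    return _morph(mask, min)


def dilate(mask):
    return _morph(mask, max)


# Toy 7x7 scene: flat water, then a 3x3 bright target plus one noisy pixel.
background = [[100] * 7 for _ in range(7)]
model = PixelGaussianBackground(background)
frame = [row[:] for row in background]
for y in range(2, 5):
    for x in range(2, 5):
        frame[y][x] = 200
frame[0][6] = 180  # isolated ripple or glint
raw = model.apply(frame)
cleaned = dilate(erode(raw))  # erosion followed by dilation removes the speck
```

Erosion followed by dilation removes the isolated pixel while restoring the 3x3 target to its original extent.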
FIG. 4 is a flow chart of spatio-temporal rule identification in a preferred embodiment of the present invention. Referring to FIG. 4, in spatio-temporal rule identification the moving targets are judged by applying modeling constraints based on a temporal rule on the target motion trajectory and a spatial rule on the target size. Specifically, the following method can be adopted:
S301, the offset distance of a moving target between adjacent sampling frames is calculated from its coordinates in those frames.
S302, based on the spatial distribution of target aspect ratios and on the size distribution that can be estimated as depth changes, the aspect ratio and the depth-size relationship of each single detected target, namely a person or boat, are modeled, constraining the moving targets at the spatial level.
In the above embodiment, the temporal rule refers first to the principle of shortest offset distance between adjacent sampling frames: the offset distance of a moving target between adjacent sampling frames can be calculated from its coordinates in those frames, and because the same target has the shortest offset distance between adjacent sampling frames, mismatches can be effectively eliminated. It refers second to the principle of minimum rotation angle between adjacent frames: because the motion of an object follows a consistent trajectory, the rotation angle of the moving target changes only slightly between adjacent sampling frames, that is, the rotation angle difference stays within a certain range. A control strategy is established on this basis to reduce mismatches.
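A minimal sketch of the two temporal checks follows. It assumes each target is represented by its center coordinates in successive sampling frames; the thresholds `max_offset` and `max_turn` are hypothetical, since the source only says that the values stay within a set range.

```python
import math


def nearest_match(prev_center, candidates, max_offset):
    """Index of the candidate with the shortest offset distance, or None."""
    best_idx, best_dist = None, max_offset
    for i, center in enumerate(candidates):
        d = math.dist(prev_center, center)
        if d <= best_dist:
            best_idx, best_dist = i, d
    return best_idx


def turning_angle(p0, p1, p2):
    """Absolute heading change (radians) between steps p0->p1 and p1->p2."""
    a1 = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
    a2 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
    diff = abs(a2 - a1)
    return min(diff, 2 * math.pi - diff)


def passes_temporal_rule(track, max_offset, max_turn):
    """True if every step is short enough and every turn is small enough."""
    for a, b in zip(track, track[1:]):
        if math.dist(a, b) > max_offset:
            return False
    for p0, p1, p2 in zip(track, track[1:], track[2:]):
        if turning_angle(p0, p1, p2) > max_turn:
            return False
    return True


smooth_track = [(0, 0), (5, 1), (10, 2), (15, 4)]  # steady drift
jitter_track = [(0, 0), (5, 0), (0, 0), (5, 0)]    # reverses direction each frame
limit = math.radians(30)
smooth_ok = passes_temporal_rule(smooth_track, max_offset=10, max_turn=limit)
jitter_ok = passes_temporal_rule(jitter_track, max_offset=10, max_turn=limit)
match = nearest_match((5, 1), [(30, 30), (10, 2), (6, 9)], max_offset=10)
```

The smooth track satisfies both constraints, whereas the track that reverses direction at every frame violates the rotation angle constraint and would be treated as a mismatch.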
In the above embodiment, the spatial rule relies on the fact that the detected targets belong to a single type, so they exhibit a fairly uniform aspect ratio distribution in space and a size distribution that can be estimated as depth changes. Modeling the aspect ratio and the depth-size relationship therefore makes it possible to remove false detections of moving targets effectively and to improve detection performance.
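The spatial rule can be illustrated with a minimal sketch. The source does not give the form of the depth-size model, so the example assumes a pinhole-style relation in which the expected box width is inversely proportional to depth; `ratio_range`, `size_at_unit_depth` and `tolerance` are hypothetical parameters chosen only for illustration.

```python
def passes_spatial_rule(box, depth,
                        ratio_range=(1.5, 6.0),
                        size_at_unit_depth=400.0,
                        tolerance=0.5):
    """Check a box (x, y, w, h) against aspect-ratio and depth-size constraints.

    Assumed model: expected width = size_at_unit_depth / depth, accepted
    within a relative tolerance. None of these values come from the patent.
    """
    _, _, w, h = box
    if w <= 0 or h <= 0 or depth <= 0:
        return False
    ratio = w / h
    if not (ratio_range[0] <= ratio <= ratio_range[1]):
        return False
    expected_w = size_at_unit_depth / depth
    return abs(w - expected_w) <= tolerance * expected_w


plausible = passes_spatial_rule((0, 0, 40, 12), depth=10)   # boat-like shape and size
square = passes_spatial_rule((0, 0, 40, 40), depth=10)      # aspect ratio out of range
oversized = passes_spatial_rule((0, 0, 200, 60), depth=10)  # too large for its depth
```

In practice both the aspect ratio range and the depth-size relation would be fitted to annotated data from the monitored scene rather than fixed by hand.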
In the above embodiments of the present invention, the target detector is used to perform target detection on the moving target, and the target detector is a man-boat target detector based on a deep learning method. In a preferred embodiment:
1. the target detector can use a feature pyramid network, integrates multi-level features of a deep network, and maximally retains semantic features of small targets. By adopting the network structure of the characteristic pyramid, the multi-scale characteristics can be fully utilized for identification.
2. And setting smaller and denser candidate frame templates aiming at the feature pyramid network, and setting reasonable candidate frame sizes by using the depth information, so that the matching probability of the candidate frames and the target to be detected is improved.
3. A ship target detection network based on spatial context is designed by utilizing the spatial context information of the ship target, and the small target detection performance is improved by utilizing the perimeter scene.
Specifically, the spatial context information may specifically be four neighborhood spaces, namely, an upper neighborhood space, a lower neighborhood space, a left neighborhood space, a right neighborhood space, and a left neighborhood space, of the target candidate frame, where the size of each neighborhood space specifically is: if the length of the candidate frame is larger than or equal to the width, the size and the length and the width of the upper neighborhood and the lower neighborhood are consistent with those of the target candidate frame, and meanwhile, the left neighborhood and the right neighborhood are set to be square areas with the side length equal to the width of the candidate frame; if the length of the candidate frame is less than or equal to the width, the upper neighborhood and the lower neighborhood are set to be square areas with the side length equal to the length of the candidate frame, and the size, the length and the width of the left neighborhood and the right neighborhood are kept consistent with those of the target candidate frame.
The man-ship target detection network based on the space context is characterized in that space context information, namely CNN features of four space neighborhoods, are combined into a context feature on a CNN feature diagram of an image according to the sequence of top, bottom and left, the context feature is used as an input feature of a Gated Recurrent Unit (GRU), the CNN features of candidate frames are used as hidden state input of the GRU, and the output of the GRU is sent to a classification and regression network for classification and regression.
4. A multi-scale training method is set for the feature pyramid network. First, an object is used as a training sample for the detector only when its scale is close to that of the pre-training data set. Second, during training, only the gradients of candidate frames whose sizes fall within a pre-specified range are back-propagated each time. Because man-boat training samples are few, a large number of samples are generated by methods such as manual image flipping and generative adversarial networks, so as to train a more robust man-boat target detector.
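The scale-restricted gradient rule in point 4 can be illustrated with a minimal sketch. It assumes that the "size" of a candidate frame is the square root of its area and that restricting gradients is done by averaging the loss over the in-range frames only; the function name `scale_masked_loss` and its parameters are hypothetical and are not taken from the patent.

```python
import numpy as np

def scale_masked_loss(per_box_loss, boxes, min_side, max_side):
    """Average the loss over candidate frames whose size is in [min_side, max_side].

    boxes: array of (x_min, y_min, x_max, y_max).
    Size is taken as sqrt(area), which is an assumption of this sketch.
    Frames outside the range contribute nothing, so no gradient would flow from them.
    """
    boxes = np.asarray(boxes, dtype=float)
    losses = np.asarray(per_box_loss, dtype=float)
    side = np.sqrt((boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1]))
    mask = (side >= min_side) & (side <= max_side)
    if not mask.any():
        return 0.0, mask
    return float(losses[mask].mean()), mask

boxes = [[0, 0, 4, 4], [0, 0, 100, 100], [0, 0, 16, 16]]
loss, mask = scale_masked_loss([1.0, 5.0, 3.0], boxes, min_side=8, max_side=64)
```

In a real training loop the same mask would be applied to the per-frame loss tensor of the chosen deep learning framework before back-propagation.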
Of course, the four preferred measures above can be used alone or in any combination; combining several of them gives a better effect.
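As an illustration of the neighborhood sizing rule given under point 3, the following sketch computes the four context regions of a candidate frame. It assumes that "length" means the horizontal extent and "width" the vertical extent of the frame, that image coordinates have y growing downward, and that regions are not clipped to the image border; the function name `context_neighborhoods` is hypothetical.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

def context_neighborhoods(box: Box) -> Dict[str, Box]:
    """Return the upper, lower, left and right neighborhoods of a candidate frame."""
    x0, y0, x1, y1 = box
    length = x1 - x0  # assumed: horizontal extent
    width = y1 - y0   # assumed: vertical extent
    if length >= width:
        # Upper/lower copy the frame size; left/right are squares of side `width`.
        dy, dx = width, width
    else:
        # Upper/lower are squares of side `length`; left/right copy the frame size.
        dy, dx = length, length
    return {
        "up": (x0, y0 - dy, x1, y0),
        "down": (x0, y1, x1, y1 + dy),
        "left": (x0 - dx, y0, x0, y1),
        "right": (x1, y0, x1 + dx, y1),
    }

regions = context_neighborhoods((10, 20, 50, 30))
```

Both branches collapse to a single offset because, in either case, the vertical neighbors span the frame's horizontal extent and the horizontal neighbors span its vertical extent; only the offset distance changes.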
In the above embodiment of the present invention, the identification result and the result detected by the target detector are fused. Specifically, the results of the time rule, the space rule and the target detector are all used and considered together by weighted fusion: a target is filtered out if its total score is below an empirical threshold, or if any one of the three scores is below its own empirical threshold, so as to reduce the false detection rate. The thresholds are empirical values obtained experimentally.
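The weighted fusion and threshold filtering just described can be sketched as follows. The weights and thresholds shown are placeholder values chosen for illustration only (the patent states merely that thresholds are obtained experimentally), and the names `Scores` and `keep_target` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Scores:
    temporal: float   # score from the time rule
    spatial: float    # score from the space rule
    detector: float   # confidence from the deep learning detector

def keep_target(s, weights=(0.3, 0.3, 0.4),
                per_rule_thresholds=(0.2, 0.2, 0.3),
                total_threshold=0.5):
    """Keep a target only if every score and the weighted total pass their thresholds."""
    parts = (s.temporal, s.spatial, s.detector)
    # Reject if any single score is below its own threshold.
    if any(p < t for p, t in zip(parts, per_rule_thresholds)):
        return False
    # Reject if the weighted total is below the overall threshold.
    total = sum(w * p for w, p in zip(weights, parts))
    return total >= total_threshold

candidates = [Scores(0.9, 0.8, 0.7), Scores(0.9, 0.9, 0.1), Scores(0.3, 0.3, 0.4)]
alerts = [c for c in candidates if keep_target(c)]
```

With these placeholder values only the first candidate survives: the second fails the detector threshold and the third passes each individual threshold but not the weighted total.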
In multi-detector information fusion, the data returned by each detector represents different characteristic information of the target, so the multi-source information needs to be analyzed and fused at multiple levels and in multiple steps. According to the degree of abstraction of the multi-detector data at the time of fusion, fusion can be divided into two levels: the feature level and the decision level. Feature-level fusion concatenates the feature information extracted by the moving target detector and the deep learning detector and then makes a classification decision; fusion at this level balances information loss against anti-interference capability. In decision-level fusion, each detector performs preprocessing and feature extraction on the target data to obtain its own decision result, and the results are then combined by overall weighted fusion. The advantages of the two levels are complementary, and together they can effectively improve detection precision.
FIG. 5 is a flow chart of the man-boat target detection method in a preferred embodiment of the invention. Referring to FIG. 5, in this preferred embodiment, the water area man-boat target detection method may include the following steps:
First, the original video image to be detected is preprocessed. In this embodiment, the preprocessing may be one or more of scale scaling, denoising, Gamma correction and the like, as needed, so as to reduce the influence of uneven illumination;
Second, water area segmentation is performed: the water area in the video image processed in the first step is semantically segmented, and irrelevant background areas that easily cause false detections are filtered out;
Third, on the basis of the second step, a background modeling method is used to perform moving target detection, so that small-scale man-boat moving targets are effectively detected;
Fourth, a space-time rule judgment is made: the time rule of the target motion trajectory and the space rule of the target size are used to apply modeling constraints to the moving targets detected in the third step, so as to judge them;
Fifth, a deep learning method is used to fuse multi-level feature information and train a robust target detector for man-boat targets, and the target detector is used to detect the man-boat targets;
Sixth, the judgment result of the space-time rule in the fourth step and the deep detection result of the target detector in the fifth step are combined, the detected moving targets that do not meet the requirements are filtered out, and the remaining moving targets are output as warnings.
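The time rule used in the fourth step can be illustrated with a minimal sketch that checks a target's trajectory across adjacent sampled frames. It assumes the trajectory is a list of target center coordinates, that a per-frame displacement limit `max_step` stands in for the offset-distance check, and that the heading may change by at most `max_turn_deg` between consecutive steps; all names and limits are hypothetical and are not specified by the patent.

```python
import math

def passes_time_rule(track, max_step, max_turn_deg):
    """Check offset distance and heading change between adjacent sampled frames.

    track: list of (x, y) target centers in consecutive sampled frames.
    max_step and max_turn_deg are illustrative limits, not values from the patent.
    """
    prev_heading = None
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        if math.hypot(dx, dy) > max_step:
            return False  # offset too large: likely a mismatch
        if dx == 0 and dy == 0:
            continue  # stationary step: heading is undefined, skip it
        heading = math.atan2(dy, dx)
        if prev_heading is not None:
            diff = abs(heading - prev_heading)
            diff = min(diff, 2 * math.pi - diff)  # wrap to [0, pi]
            if math.degrees(diff) > max_turn_deg:
                return False  # heading changed too sharply
        prev_heading = heading
    return True

smooth = passes_time_rule([(0, 0), (2, 0), (4, 1), (6, 1)], max_step=5, max_turn_deg=45)
```

A trajectory that fails this check would be treated as not satisfying the time rule and passed to the fusion stage with a correspondingly low score or removed, depending on how the rule is scored.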
In this embodiment of the invention, the original video image is first preprocessed to reduce problems such as uneven illumination, and the water area in the preprocessed video image is then segmented so that irrelevant background areas that easily cause false detections are filtered out. The background modeling method makes it possible to detect small-scale person and fishing boat targets, which addresses the blurred outlines and weak visual features of persons and boats caused by small targets and low illumination in existing water-area man-boat target detection and identification tasks. Then, by combining space-time rule identification with deep detection, the judgments are fused, the detected moving targets are filtered, and warnings are finally output. Water-area man-boat target detection is thereby realized, the performance and accuracy of target detection are improved, and the misjudgment rate is reduced.
FIG. 6 is a block diagram of a man-boat target detection system according to an embodiment of the present invention. In another embodiment of the present invention, a water area man-boat target detection system is further provided, which is used for implementing the water area man-boat target detection method. Specifically, referring to FIG. 6, the system includes a semantic segmentation module, a moving target detection module, a space-time rule identification module, a target detection module and a fusion judgment module, wherein: the semantic segmentation module performs semantic segmentation on the water area in a video image to be detected and filters out irrelevant background areas; the moving target detection module uses a background modeling method to perform moving target detection on the video image from which the semantic segmentation module has filtered out the irrelevant background areas, and detects the moving target, wherein the moving target comprises a person and/or a boat; the space-time rule identification module uses the time rule of the target motion trajectory and the space rule of the target size to apply modeling constraints to the moving target detected by the moving target detection module, so as to identify it; the target detection module uses a target detector to perform target detection on the moving target obtained by the moving target detection module, wherein the target detector is trained by a deep learning method, and the deep learning fuses multi-level features of the deep network so as to retain the semantic features of small targets to the maximum extent; and the fusion judgment module fuses the result obtained by the space-time rule identification module with the result detected by the target detector, filters out the detected moving targets that do not meet the requirements, and takes the remaining targets as the final detection result.
The man-boat target detection system in this embodiment of the invention reasonably addresses the technical difficulties of the water-surface man-boat target detection and identification task, such as wide areas, small targets and low illumination. It effectively removes background interference, combines the advantages of traditional methods such as background modeling with those of deep learning target detection, enhances the robustness of man-boat target detection, effectively reduces the probability of false detections and missed detections, and improves identification precision.
In another preferred embodiment, the man-boat target detection system may further include a preprocessing module, which reads in the original video image, preprocesses the image as needed, segments background areas such as the water area and the coastal area, and, once the water area in the image has been segmented, filters out the irrelevant background areas. The preprocessing may be one or more of scale scaling, denoising, Gamma correction and the like, as needed, so that the input image has a suitable size and better quality and is convenient for subsequent processing.
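As one possible illustration of the Gamma correction mentioned for the preprocessing module, the sketch below applies a lookup-table correction to an 8-bit image. The default `gamma=0.5` (which brightens dark regions) is a placeholder and the function name `gamma_correct` is hypothetical; the patent does not specify a value or an implementation.

```python
import numpy as np

def gamma_correct(image_u8, gamma=0.5):
    """Apply out = 255 * (in / 255) ** gamma to an 8-bit image via a lookup table.

    gamma < 1 brightens dark regions; gamma > 1 darkens the image.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    image_u8 = np.asarray(image_u8)
    if image_u8.dtype != np.uint8:
        raise TypeError("expected a uint8 image")
    levels = np.arange(256) / 255.0
    lut = np.round(255.0 * levels ** gamma).astype(np.uint8)
    return lut[image_u8]  # index the table with the pixel values

frame = np.array([[0, 64, 255]], dtype=np.uint8)
brightened = gamma_correct(frame, gamma=0.5)
```

The lookup table is computed once per gamma value, so the same table can be reused for every frame of a video stream.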
The specific implementation technology of each module in the embodiment of the man-boat target detection system corresponds to each step in the man-boat target detection method, and is not described herein again.
The above preferred features of the embodiments can be used alone in any embodiment, or in any combination thereof without conflict. In addition, portions which are not described in detail in the embodiments may be implemented by using the prior art.
In another embodiment of the present invention, a terminal is also provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, performs the water area man-boat target detection method of any of the above embodiments. The terminal in this embodiment may be a computer, a smart phone, or any other terminal with processing capability.
In another embodiment of the present invention, a computer-readable storage medium is also provided, on which a computer program is stored; when the program is executed by a processor, it performs the water area man-boat target detection method of any of the above embodiments.
The embodiments of the invention are designed and implemented for the special task of detecting and identifying man-boat targets in water areas. They reasonably address the technical difficulties of the detection task, such as wide areas, small targets and low illumination, effectively combine achievements in the field of image target detection, and keep the missed-detection and false-alarm rates as low as possible while ensuring high identification precision.
It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, and the like in the system, and those skilled in the art may refer to the technical solution of the system to implement the step flow of the method, that is, the embodiment in the system may be understood as a preferred example for implementing the method, and details are not described herein.
Those skilled in the art will appreciate that, besides implementing the system and its devices provided by the present invention purely as computer-readable program code, the method steps can be logically programmed so that the system and its devices realize the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and its devices provided by the present invention can be regarded as a hardware component, and the devices included therein for realizing various functions can be regarded as structures within the hardware component; the devices for realizing the functions can also be regarded both as software modules for performing the method and as structures within the hardware component.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims (10)

1. A water area man-boat target detection method, characterized by comprising the following steps:
performing semantic segmentation on a water area in a video image to be detected, and filtering out an irrelevant background area;
performing moving target detection, by a background modeling method, on the video image from which the irrelevant background area has been filtered out, to detect a moving target, wherein the moving target comprises a person and/or a boat;
applying modeling constraints to the detected moving target by using a time rule of a target motion trajectory and a space rule of a target size, so as to make a judgment;
performing target detection on the moving target by using a target detector, wherein the target detector is trained by a deep learning method, and the deep learning fuses multi-level features of a deep network so as to retain the semantic features of small targets to the maximum extent;
and fusing the judgment result and the detection result of the target detector, and filtering out the detected moving target that does not meet the requirements, to obtain a final detection result.
2. The water area man-boat target detection method according to claim 1, wherein performing semantic segmentation on the water area in the video image to be detected and filtering out the irrelevant background area comprises:
performing semantic segmentation on the flat water area image based on a deep learning model network architecture, and segmenting out the irrelevant background area;
and filtering out the irrelevant background area once the water area in the image has been segmented.
3. The water area man-boat target detection method according to claim 1, wherein performing moving target detection, by the background modeling method, on the video image from which the irrelevant background area has been filtered out comprises:
using a background representation method based on pixel sample statistics to construct a color distribution model of each pixel from the probability statistics of that pixel in the time domain of the video image, thereby realizing background modeling;
and distinguishing the foreground from the background in the video image based on the background modeling, and detecting the moving target.
4. The water area man-boat target detection method according to claim 1, wherein applying modeling constraints to the detected moving target by using the time rule of the target motion trajectory and the space rule of the target size, so as to make a judgment, comprises:
calculating the offset distance of the moving target between adjacent sampling frames from the coordinates of the moving target in the adjacent sampling frames; screening out mismatches according to the rule that the offset distance of the same target between adjacent sampling frames is the shortest; and, according to the principle that the turning angle of the moving target between adjacent frames is minimal, requiring that the rotation angle of the moving target represented by adjacent sampling frames changes only slightly, with the difference of the rotation angles kept within a set range, the moving target meeting these constraints being taken as the result of time rule identification;
modeling the aspect ratio and the depth-size relation of each single detected moving target according to the spatial aspect ratio distribution of the target and the size distribution that can be estimated as depth changes, thereby constraining the moving target at the spatial level and obtaining the result of space rule identification.
5. The water area man-boat target detection method according to claim 1, wherein performing target detection on the moving target by using the target detector comprises:
using a feature pyramid network to fuse multi-level features of the deep network and retain the semantic features of small targets to the maximum extent;
setting smaller and denser candidate frame templates and using depth information to set reasonable candidate frame sizes, thereby improving the probability that a candidate frame matches the target to be detected;
designing a man-boat target detection network based on spatial context by using the spatial context information of the man-boat target, and improving small-target detection performance by using the surrounding scene.
6. The water area man-boat target detection method according to claim 1, wherein fusing the judgment result and the detection result of the target detector and filtering out the detected moving target that does not meet the requirements comprises:
first obtaining the result scores of the time rule of the target motion trajectory, the space rule of the target size and the man-boat target detector, then performing weighted fusion, filtering out targets whose total score is lower than an empirical threshold or for which the score of any one of the three is lower than the corresponding threshold, and finally taking the remaining targets as the final detection result.
7. The water area man-boat target detection method according to any one of claims 1 to 6, wherein before the semantic segmentation of the water area in the video image to be detected, the method further comprises: preprocessing the video image to be detected to reduce uneven illumination and/or to give the input image a suitable size and higher quality.
8. A water area man-boat target detection system, comprising:
a semantic segmentation module, which performs semantic segmentation on a water area in a video image to be detected and filters out an irrelevant background area;
a moving target detection module, which performs moving target detection, by a background modeling method, on the video image from which the semantic segmentation module has filtered out the irrelevant background area, and detects a moving target, wherein the moving target comprises a person and/or a boat;
a space-time rule identification module, which applies modeling constraints to the moving target detected by the moving target detection module by using a time rule of a target motion trajectory and a space rule of a target size, so as to identify it;
a target detection module, which performs target detection on the moving target obtained by the moving target detection module by using a target detector, wherein the target detector is trained by a deep learning method, and the deep learning fuses multi-level features of a deep network so as to retain the semantic features of small targets to the maximum extent;
and a fusion judgment module, which fuses the result obtained by the space-time rule identification module and the result detected by the target detector, filters out the detected moving target that does not meet the requirements, and takes the remaining targets as the final detection result.
9. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, is adapted to perform the method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1-7.
CN202011178995.2A 2020-10-29 2020-10-29 Water area man-boat target detection method, system, terminal and medium Active CN112307943B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011178995.2A CN112307943B (en) 2020-10-29 2020-10-29 Water area man-boat target detection method, system, terminal and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011178995.2A CN112307943B (en) 2020-10-29 2020-10-29 Water area man-boat target detection method, system, terminal and medium

Publications (2)

Publication Number Publication Date
CN112307943A true CN112307943A (en) 2021-02-02
CN112307943B CN112307943B (en) 2022-06-03

Family

ID=74330757

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011178995.2A Active CN112307943B (en) 2020-10-29 2020-10-29 Water area man-boat target detection method, system, terminal and medium

Country Status (1)

Country Link
CN (1) CN112307943B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095183A (en) * 2021-03-31 2021-07-09 西北工业大学 Micro-expression detection method based on deep neural network
CN114993262A (en) * 2022-04-20 2022-09-02 北京航空航天大学 Sea surface unmanned ship target identification method simulating reconstruction of prey bird receptive field area

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102905200A (en) * 2012-08-07 2013-01-30 上海交通大学 Video interesting region double-stream encoding and transmitting method and system
WO2019132587A1 (en) * 2017-12-29 2019-07-04 (주)제이엘케이인스펙션 Image analysis device and method
CN110020653A (en) * 2019-03-06 2019-07-16 平安科技(深圳)有限公司 Image, semantic dividing method, device and computer readable storage medium
CN111027399A (en) * 2019-11-14 2020-04-17 武汉兴图新科电子股份有限公司 Remote sensing image surface submarine identification method based on deep learning
CN111476089A (en) * 2020-03-04 2020-07-31 上海交通大学 Pedestrian detection method, system and terminal based on multi-mode information fusion in image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
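Li Xue et al.: "Person detection at bathing beaches based on deep learning" (基于深度学习的海水浴场人员检测), Information Technology and Informatization (《信息技术与信息化》), no. 02, 28 February 2020 (2020-02-28) *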

Also Published As

Publication number Publication date
CN112307943B (en) 2022-06-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant