CN115641767A - Unmanned ship perception experiment platform device - Google Patents

Unmanned ship perception experiment platform device

Info

Publication number
CN115641767A
Authority
CN
China
Prior art keywords
image
function
value
submodule
pixel
Prior art date
Legal status
Pending
Application number
CN202211123940.0A
Other languages
Chinese (zh)
Inventor
刘星
杨冰
霍清华
李建益
冯卡力
Current Assignee
Naval Sergeant School Of Chinese Pla
Original Assignee
Naval Sergeant School Of Chinese Pla
Priority date
Filing date
Publication date
Application filed by Naval Sergeant School Of Chinese Pla
Publication of CN115641767A

Classifications

    • G06V20/05 Scenes; Scene-specific elements: Underwater scenes
    • G06N3/04 Neural networks: Architecture, e.g. interconnection topology
    • G06N3/08 Neural networks: Learning methods
    • G06V10/267 Image preprocessing: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V10/30 Image preprocessing: Noise filtering
    • G06V10/34 Image preprocessing: Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/764 Pattern recognition or machine learning: classification, e.g. of video objects
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Pattern recognition or machine learning: neural networks
    • G06V20/46 Scenes in video content: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V2201/07 Indexing scheme: Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an unmanned ship perception experiment platform device comprising a teaching platform structure, a pool environment structure and a data acquisition structure. The teaching platform structure comprises a server provided with a target detection experiment software system, together with a display component and an operation component connected to the server; the pool environment structure is used for simulating a natural water body environment; and the data acquisition structure is used for acquiring the natural water body environment information simulated by the pool environment structure and transmitting the information to the server. With this device, students can master the relevant knowledge more easily and teaching can be extended more conveniently; besides teaching, the device also facilitates research on various detection technologies for unmanned surface vehicles, which broadens its application and indirectly reduces cost.

Description

Unmanned ship perception experiment platform device
Technical Field
The invention relates to an unmanned ship perception experiment platform device.
Background
The unmanned surface vehicle (USV) is another important unmanned platform following the unmanned aerial vehicle, the unmanned ground vehicle and the unmanned underwater vehicle. Compared with platforms such as unmanned aerial vehicles, unmanned underwater vehicles, mobile robots and unmanned ground vehicles, the unmanned surface vehicle is widely recognized as one of the important means of performing water-surface operations in future warfare. Unmanned boats have the advantages of small size, high speed and intelligence, and their research, development and manufacture are of great significance for the development and utilization of ocean resources, the safeguarding of maritime rights and interests, the assurance of safe navigation at sea, and the enhancement of national influence.
To better develop the related technologies, teaching students the relevant knowledge is essential for long-term development. However, current teaching mostly relies on textbooks and multimedia presentations, and a teaching system that allows experiments and demonstrations to be carried out in class is lacking.
Disclosure of Invention
The invention aims to provide an unmanned boat perception experiment platform device.
Unmanned ship perception experiment platform device comprises
The teaching platform structure comprises a server provided with a target detection experiment software system, a display component and an operation component, wherein the display component and the operation component are connected with the server;
the pool environment structure is used for simulating a natural water body environment;
and the data acquisition structure is used for acquiring the natural water body environment information simulated by the pool environment structure and conveying the information to the server.
Further, the pool environment structure comprises a pool body, an environment interference part and a water surface target; the environment interference part is arranged on the four walls of the tank body or the periphery of the tank body, and comprises a wave maker and/or a mist maker; the water surface target can float on the water surface in the pool body.
Further, the data acquisition structure comprises a mounting bracket and a vision sensor and/or a ranging sensor detachably connected to the mounting bracket.
Further, the target detection experiment software system comprises an image processing module and a video processing module; the image processing module comprises a noise simulation submodule, an image preprocessing submodule, an image segmentation submodule, an image smoothing submodule, an image sharpening submodule, an image geometric transformation submodule, an image arithmetic transformation submodule, an image logical operation submodule, a morphology submodule and a frequency domain analysis submodule.
Further, the image preprocessing sub-module comprises a color image graying function, a displayed image histogram function, a histogram equalization function, a brightness adjustment function, a contrast adjustment function, a saturation adjustment function, an image pseudo-color function, an image defogging function, an image rain removal function and a Hough line detection function.
Further, the color image graying function includes a single component grayscale method, a maximum grayscale method, an average grayscale method, and a weighted average grayscale method;
further, the graying formula of the single component grayscale method is as follows:
f(x1,y1)=R(x1,y1)
or f(x1,y1)=G(x1,y1)
or f(x1,y1)=B(x1,y1)
where f(x1,y1) is the pixel value of the grayed grayscale image at position (x1,y1), and R(x1,y1), G(x1,y1) and B(x1,y1) respectively represent the values of the three components.
Further, the graying formula of the maximum grayscale method is as follows:
f(x2,y2)=max(R(x2,y2),G(x2,y2),B(x2,y2))
where f(x2,y2) is the pixel value of the grayed grayscale image at position (x2,y2), and R(x2,y2), G(x2,y2) and B(x2,y2) respectively represent the values of the three components.
Further, the graying formula of the average value grayscale method is as follows:
f(x3,y3)=(R(x3,y3)+G(x3,y3)+B(x3,y3))/3
where f(x3,y3) is the pixel value of the grayed grayscale image at position (x3,y3), and R(x3,y3), G(x3,y3) and B(x3,y3) respectively represent the values of the three components.
Further, the graying formula of the weighted average grayscale method is as follows:
f(x4,y4)=0.3R(x4,y4)+0.59G(x4,y4)+0.11B(x4,y4)
where f(x4,y4) is the pixel value of the grayed grayscale image at position (x4,y4), and R(x4,y4), G(x4,y4) and B(x4,y4) respectively represent the values of the three components.
The application has the advantages that: with this device, a teacher can conveniently explain relevant knowledge such as water surface target detection and image processing through on-site experiments, so that students grasp the knowledge quickly; experiments can also be improvised according to actual conditions and ideas raised in class, which makes teaching easier to extend and more convenient. Besides teaching, the device also makes it convenient to research various detection technologies for unmanned surface vehicles, giving it wider application and indirectly reducing cost.
Drawings
FIG. 1 is a system software host interface of the present invention;
FIG. 2 is a parameter setting module host interface of the present invention;
FIG. 3 is an image pre-processing sub-menu of the present invention;
FIG. 4 is an image segmentation submenu of the present invention;
FIG. 5 is an image smoothing submenu of the present invention;
FIG. 6 is an image sharpening submenu of the present invention;
FIG. 7 is an image geometry change sub-menu of the present invention;
FIG. 8 is a schematic view of a target detection result interface according to the present invention;
FIG. 9 is an original image of an input image pre-processing sub-module according to the present invention;
FIG. 10 is an image of the present invention with salt and pepper noise added;
FIG. 11 is an image of the present invention with Gaussian noise added;
FIG. 12 is a graph of the median filtering smoothing results of the present invention;
FIG. 13 is a fog-containing image that requires processing in the present invention;
FIG. 14 is an image of the present invention after defogging of the image including fog;
FIG. 15 is a schematic view of the teaching platform structure, pool environment structure and data collection structure of the present invention;
FIG. 16 is a schematic layout view of a teaching platform structure, a pool environment structure and a data acquisition structure according to the present invention;
FIG. 17 is a schematic diagram of a single Gaussian background modeling method according to the present invention;
FIG. 18 is a block diagram of the process of the Gaussian mixture background model of the present invention;
FIG. 19 is a flow chart of a method for detecting the unmanned surface vehicle sensing system according to the present invention;
fig. 20 is a flowchart of an initial background image acquisition method according to the present invention.
Detailed Description
In order to make the technical solution of the present invention better understood, the technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
As shown in fig. 1-20, an unmanned boat perception experiment platform system comprises a teaching platform structure, a pool environment structure and a data acquisition structure. The teaching platform structure comprises a server, a display component and an operation component, wherein the server is provided with a target detection experiment software system, and the display component and the operation component are connected with the server; the pool environment structure can be used for simulating natural water body environment, including but not limited to ocean, lake and river; the data acquisition structure is used for acquiring natural water body environment information simulated by the pool environment structure, and transmitting the acquired information to the server, so that the server processes the acquired information.
Specifically, the pool environment structure shown in fig. 15-16 includes a pool body, an environment interference part and a water surface target. In this embodiment, the pool body is made of transparent acrylic, about 3 m long, 2 m wide and 0.4 m high, holding about 2.4 m³ of water; in other embodiments, the pool body can be made of other materials or in other sizes as needed. The environment interference part comprises a wave maker and/or a fog maker and is used for simulating interference such as waves and water mist in a natural water body environment, so as to verify and demonstrate the data acquisition capability of the data acquisition structure and the processing capability of the target detection experiment software system on the acquired data, and so that the processing capability of subsequently developed algorithms can be verified and compared. The environment interference parts are arranged on the four walls of the pool body or around it. In this embodiment, the fog maker is an industrial-grade ultrasonic humidifying sprayer, and the wave maker is an ultra-quiet variable-frequency circulation pump supporting ten wave-making modes such as constant flow, intermittent flow, turbulent flow and tide; one or more pumps can be installed according to the pool area. The water surface target can float on the water surface in the pool body; in this embodiment, a group of ship models of different sizes is used to simulate water surface targets.
Specifically, the data acquisition structure includes a mounting bracket and sensors. In this embodiment the sensors include a vision sensor and a ranging sensor: the vision sensor is planned to be a Hikvision low-power thermal-imaging bi-spectrum pan-tilt camera of the DS-2TD5136/DS-2TD5137 series, and the ranging sensor is planned to be an L2 laser ranging sensor. The sensors are detachably connected to the mounting bracket, so that target detection sensors (such as binocular vision, ultrasonic or lidar sensors) can be added or replaced, which facilitates the teaching and research of new target detection means in the future.
Specifically, the target detection experiment software system comprises a user management module, a parameter setting module, an image processing module and a video processing module. The user management module sets the permissions of user accounts, with different permission levels for different users (for example, ordinary accounts for students and administrator accounts for instructors). The parameter setting module edits the parameters of the system algorithms, such as the translation distance, rotation angle and morphological erosion size of an image (as shown in fig. 2). The parameters can be adjusted for the target conditions of each analysis so that the image or video processing effect is better; different parameter settings yield different processing results, which helps students understand how the parameters affect subsequent processing and facilitates later research on unmanned surface vehicle detection technologies and algorithms.
Specifically, the image processing module comprises a noise simulation sub-module, an image preprocessing sub-module, an image segmentation sub-module, an image smoothing sub-module, an image sharpening sub-module, an image geometric transformation sub-module, an image arithmetic transformation sub-module, an image logical operation sub-module, a morphology sub-module and a frequency domain analysis sub-module.
In some embodiments, the noise simulation submodule comprises a salt-and-pepper noise adding function and a Gaussian noise adding function, and the image smoothing submodule comprises a mean filtering function, a median filtering function and a bilateral filtering function. The noise adding functions artificially add a noise signal of a given intensity to a target image, obscuring part of the original image information. Students can thus observe the state and characteristics of images subjected to noise interference, which aids learning and memorization. The different filtering functions can then be applied to images corrupted by different kinds of noise, so that students learn the state and characteristics of images after each kind of filtering, master the characteristics of the different filters, and learn to apply them. When unmanned surface vehicle detection technologies and algorithms are researched later, the noise resistance of the research results can also be verified and compared against the existing filtering functions.
Specifically, salt-and-pepper noise may be caused by sudden strong interference in the image signal, by the analog-to-digital converter, or by bit transmission errors. For example, a failed sensor element yields a minimum pixel value and a saturated sensor element yields a maximum pixel value. Artificially added salt-and-pepper noise generally conforms to the following probability density function:
P(z) = Pa for z = 0 (pepper), P(z) = Pb for z = 255 (salt), and P(z) = 1 - Pa - Pb otherwise,
where Pa and Pb are the respective occurrence probabilities. The salt-and-pepper noise function randomly selects pixel positions in the image according to the desired signal-to-noise ratio and randomly assigns the value 0 or 255 to those pixels.
Specifically, in communication channel testing and modeling, Gaussian noise is used as additive white noise to produce additive white Gaussian noise (noise whose amplitude follows a Gaussian distribution and whose power spectral density is uniformly distributed is called white Gaussian noise). Artificially added Gaussian noise generally conforms to the following probability density function:
p(z) = (1/(sqrt(2π)·σ)) · exp(-(z - μ)² / (2σ²))
where σ is the standard deviation of z and μ is the mean. The Gaussian noise is additive noise, and the noise which accords with the probability density function is added on the basis of the original image to obtain a noise-added image.
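For illustration, the following Python/NumPy sketch (not part of the original disclosure; the function names and the ratio/sigma parameters are assumed values) adds salt-and-pepper noise and Gaussian noise of the forms described above to a grayscale image.

```python
import numpy as np

def add_salt_pepper(img, ratio=0.02):
    """Randomly set a fraction `ratio` of pixels to 0 (pepper) or 255 (salt)."""
    noisy = img.copy()
    mask = np.random.rand(*img.shape)
    noisy[mask < ratio / 2] = 0                          # pepper
    noisy[(mask >= ratio / 2) & (mask < ratio)] = 255    # salt
    return noisy

def add_gaussian(img, mu=0.0, sigma=15.0):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    noise = np.random.normal(mu, sigma, img.shape)
    return np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    img = (np.random.rand(120, 160) * 255).astype(np.uint8)  # stand-in image
    print(add_salt_pepper(img).shape, add_gaussian(img).dtype)
```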
Specifically, remote sensing is generally used to obtain the target image during detection. Owing to the influence of the atmosphere and other factors on the sensor, the remote sensing image may contain regions of excessive brightness variation or isolated bright spots (also called noise), so filtering functions are needed to suppress the noise and smooth the image brightness.
Specifically, the mean filtering function calculates the gray level of the filtered image according to the following formula:
g(x,y) = (1/(m×n)) · Σ f(i,j), (i,j)∈S
where (x,y) is the coordinate of the target pixel and S is the m×n window of pixels surrounding the target pixel. That is: a target pixel is specified on the image to be processed; a rectangular window of size m×n is set with the target pixel coordinate (x,y) as its center; and the mean of all pixels in the window is taken as the gray value of the center point.
Specifically, the median filtering function output is:
g(x,y)=med{f(x-k,y-l),(k,l)∈w}
where f(x,y) and g(x,y) are the original image and the processed image respectively, and w is a two-dimensional template, typically a 3×3 or 5×5 region, which may also take other shapes such as lines, circles or crosses.
Specifically, the bilateral filtering function satisfies the following formulas:
BF[I](p) = (1/Wp) · Σ_{q∈S} Gσs(||p - q||) · Gσr(|I(p) - I(q)|) · I(q)
Wp = Σ_{q∈S} Gσs(||p - q||) · Gσr(|I(p) - I(q)|)
where BF[I](p) is the filtering result at pixel p, I(q) is the input image, Wp is the normalization weight, Gσs and Gσr are respectively the spatial-distance weight and the gray-level-distance weight, and σs and σr are the filter parameters of the two weights.
The principle is that a Gaussian function of the spatial distance (the Euclidean distance between the current point and the center point) is multiplied by a Gaussian function of the gray-level distance (the absolute difference between the gray level of the current point and that of the center point).
The spatial-domain Gaussian function has the mathematical form:
d(i,c) = exp(-((xi - xc)² + (yi - yc)²) / (2σs²))
where (xi,yi) is the current point position, (xc,yc) is the center point position, and σs is the spatial-domain standard deviation.
The range-domain Gaussian function has the mathematical form:
r(i,c) = exp(-(gray(xi,yi) - gray(xc,yc))² / (2σr²))
where gray(xi,yi) is the gray value of the current point and gray(xc,yc) is the gray value of the center point.
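The three smoothing filters described above are available directly in OpenCV; the following sketch is a minimal illustration under assumed parameter values (a 5×5 window and bilateral sigmas of 75), not a prescription from the patent.

```python
import cv2
import numpy as np

img = (np.random.rand(240, 320) * 255).astype(np.uint8)   # stand-in noisy image

mean_out = cv2.blur(img, (5, 5))                  # mean filtering, 5x5 window
median_out = cv2.medianBlur(img, 5)               # median filtering, 5x5 template
bilateral_out = cv2.bilateralFilter(img, d=9,     # bilateral filtering
                                    sigmaColor=75,   # gray-level (range) sigma
                                    sigmaSpace=75)   # spatial sigma

print(mean_out.shape, median_out.shape, bilateral_out.shape)
```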
In some embodiments, the image preprocessing sub-module includes a color image graying function, a display image histogram function, a histogram equalization function, a brightness adjustment function, a contrast adjustment function, a saturation adjustment function, an image pseudo-color function, an image defogging function, an image rain removal function, and a hough line detection function.
Specifically, the color image graying function includes a single component grayscale method, a maximum grayscale method, an average grayscale method and a weighted average grayscale method. The graying formula of the single component grayscale method is:
f(x1,y1)=R(x1,y1)
or f(x1,y1)=G(x1,y1)
or f(x1,y1)=B(x1,y1)
where f(x1,y1) is the pixel value of the grayed grayscale image at position (x1,y1), and R(x1,y1), G(x1,y1) and B(x1,y1) respectively represent the values of the three components; that is, the value of one of the R, G, B components of the three-channel image is taken directly as the gray value of the grayscale image.
The graying formula of the maximum grayscale method is:
f(x2,y2)=max(R(x2,y2),G(x2,y2),B(x2,y2))
where f(x2,y2) is the pixel value of the grayed grayscale image at position (x2,y2), and R(x2,y2), G(x2,y2) and B(x2,y2) respectively represent the values of the three components; that is, the maximum of the three components at each pixel position is computed and taken as the graying result.
The graying formula of the average value grayscale method is:
f(x3,y3)=(R(x3,y3)+G(x3,y3)+B(x3,y3))/3
where f(x3,y3) is the pixel value of the grayed grayscale image at position (x3,y3), and R(x3,y3), G(x3,y3) and B(x3,y3) respectively represent the values of the three components; that is, the average of the three channel components is computed and taken as the gray value of the image.
The graying formula of the weighted average grayscale method is:
f(x4,y4)=0.3R(x4,y4)+0.59G(x4,y4)+0.11B(x4,y4)
where f(x4,y4) is the pixel value of the grayed grayscale image at position (x4,y4), and R(x4,y4), G(x4,y4) and B(x4,y4) respectively represent the values of the three components; that is, the three components are given different weights according to their importance or other needs, and the weighted average is taken as the graying result.
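The four graying methods can be written compactly with NumPy; the sketch below is illustrative only, with function and variable names chosen here rather than taken from the patent.

```python
import numpy as np

def to_gray(rgb, method="weighted"):
    """Convert an H x W x 3 RGB image (uint8) to grayscale by one of the four methods."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    if method == "single":            # single component: use R (G or B could be used instead)
        gray = r
    elif method == "max":             # maximum of the three components
        gray = np.maximum(np.maximum(r, g), b)
    elif method == "mean":            # average of the three components
        gray = (r + g + b) / 3.0
    else:                             # weighted average, f = 0.3R + 0.59G + 0.11B
        gray = 0.3 * r + 0.59 * g + 0.11 * b
    return gray.astype(np.uint8)

rgb = (np.random.rand(4, 4, 3) * 255).astype(np.uint8)
print(to_gray(rgb, "weighted"))
```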
Specifically, in the function of displaying the image histogram, the grayscale histogram of the image is expressed as:
h(k) = n_k, k = 0,1,...,L-1
where L is the number of gray levels of the image, k is a gray level, n_k is the number of pixels with gray level k, and N is the total number of pixels of the image.
On this basis, the normalized histogram is further defined as the relative frequency of occurrence of each gray level, i.e.
P_r(k) = n_k / N.
Specifically, the calculation process of the histogram equalization function is as follows:
first, the gray histogram n_k of the original image is calculated;
secondly, the total number of pixels N of the original image is calculated;
thirdly, the gray-level distribution frequency P_r(k) of the original image is calculated;
fourthly, the cumulative gray-level distribution frequency S_k of the original image is calculated, where
S_k = Σ_{j=0..k} P_r(j);
fifthly, S_k is multiplied by L-1 and rounded, so that the gray levels of the equalized image are consistent with those of the original image before normalization;
and sixthly, according to this mapping relation, the pixels of the original image are looked up to obtain the histogram-equalized image.
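A minimal sketch of these six steps, assuming an 8-bit grayscale image with L = 256 (the function and variable names are illustrative):

```python
import numpy as np

def hist_equalize(gray):
    """Histogram equalization of an 8-bit grayscale image, following the six steps above."""
    L = 256
    hist = np.bincount(gray.ravel(), minlength=L)        # step 1: gray histogram n_k
    N = gray.size                                        # step 2: total number of pixels
    p_r = hist / N                                       # step 3: distribution frequency P_r(k)
    s_k = np.cumsum(p_r)                                 # step 4: cumulative distribution S_k
    mapping = np.round(s_k * (L - 1)).astype(np.uint8)   # step 5: scale to [0, L-1] and round
    return mapping[gray]                                 # step 6: map the original pixels

gray = (np.random.rand(64, 64) * 120).astype(np.uint8)  # low-contrast stand-in image
print(hist_equalize(gray).max())
```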
Specifically, in the image pseudo-color function, the following steps are required for performing pseudo-color on the gray-scale image:
1) Converting the scheme color space into an RGB space according to the color scheme, and corresponding the color information to R, G, B coordinates of the space one by one;
2) Designing a color matching scheme, establishing an RGB color mapping table, and corresponding the gray value to the corresponding R, G, B color coordinate;
3) Reading an image, and calculating the gray level of each point according to an RGB color mapping table to obtain color information to obtain color picture data;
4) And outputting the image, packaging the color into a standard bitmap format, and storing.
Specifically, the brightness adjustment function, the contrast adjustment function, the saturation adjustment function, and the image rain removal function are all in the prior art, and are not the key points of the present application, and therefore are not described herein again.
Specifically, the image defogging function performs defogging (as shown in fig. 13 to 14) according to the following formula:
J(x) = (I(x) - A) / max(t(x), t_0) + A
where J(x) is the defogged, recovered image (the scene radiance); I(x) is the observed brightness, i.e. the brightness obtained from the captured picture; A is the global atmospheric light; t is the transmission from the scene to the camera; and t_0 is a preset threshold such that when t is smaller than t_0, t = t_0 is used. In this embodiment all effect pictures are calculated with t_0 = 0.1 as the standard.
The transmission t is calculated by:
t(x) = 1 - w · min_{c∈{r,g,b}} ( min_{y∈Ω(x)} ( I_c(y) / A_c ) )
where Ω(x) is a local window around x, and w is a preset value; in this embodiment w = 0.95.
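The defogging formula above can be sketched as follows; the window size, the way the atmospheric light A is estimated, and all function names are assumptions made here for illustration, while t_0 = 0.1 and w = 0.95 follow the embodiment.

```python
import cv2
import numpy as np

def defog(img, w=0.95, t0=0.1, win=15):
    """Recover J(x) = (I(x) - A) / max(t(x), t0) + A from a hazy RGB image with values in [0, 1]."""
    kernel = np.ones((win, win), np.uint8)
    dark = cv2.erode(img.min(axis=2), kernel)                 # dark channel of I
    flat_idx = dark.ravel().argsort()[-100:]                  # brightest dark-channel pixels
    A = img.reshape(-1, 3)[flat_idx].max(axis=0)              # atmospheric light estimate
    t = 1.0 - w * cv2.erode((img / A).min(axis=2), kernel)    # transmission t(x)
    t = np.maximum(t, t0)[..., None]
    return np.clip((img - A) / t + A, 0.0, 1.0)

hazy = np.random.rand(120, 160, 3).astype(np.float32)         # stand-in hazy image
print(defog(hazy).shape)
```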
Specifically, in the hough line detection function, the specific steps of detecting a line by using hough transform are as follows:
1) Converting the color image into a gray scale image;
2) Denoising by utilizing a Gaussian kernel, filtering noise information in the image, and removing interference;
3) Extracting the image edge by using an edge operator;
4) Binarizing the edge gray level image;
5) Mapping to the Hough space: two containers are prepared, one used to show an overview of the Hough space and the other to store the vote counts; because some maxima in the voting process can exceed the threshold by thousands of votes, the voting information cannot be recorded directly in a gray-scale map;
6) Taking a local maximum value, setting a threshold value, and filtering an interference straight line;
7) Drawing a straight line and calibrating an angular point.
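These steps map closely onto OpenCV's Hough transform functions; the sketch below is illustrative (the Canny thresholds, vote threshold and line-length parameters are assumed values).

```python
import cv2
import numpy as np

img = np.zeros((200, 200, 3), np.uint8)
cv2.line(img, (20, 30), (180, 160), (255, 255, 255), 2)       # synthetic test line

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)                  # 1) color to gray
blur = cv2.GaussianBlur(gray, (5, 5), 1.5)                    # 2) Gaussian denoising
edges = cv2.Canny(blur, 50, 150)                              # 3)-4) edge extraction + binarization
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,  # 5)-6) Hough voting + thresholding
                        minLineLength=30, maxLineGap=5)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 1)  # 7) draw lines
print(0 if lines is None else len(lines))
```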
In some embodiments, the image segmentation submodule comprises a threshold binarization function, a region-growing segmentation function and a watershed algorithm function. The segmentation process of the threshold binarization function is:
g(u,v) = 1 if I(u,v) > k, and g(u,v) = 0 otherwise,
where I(u,v) is the gray level of the element at coordinate (u,v) and k is the optimal threshold of the threshold segmentation. The optimal k is the value that maximizes the between-class variance
σ_B²(k) = (m_g·P_1(k) - m(k))² / (P_1(k)·(1 - P_1(k)))
where Z_1 denotes the region of pixels with gray levels in [0,1,...,k] and Z_2 the region of pixels with gray levels in [k+1,...,L-1]; P_1(k) is the probability that a pixel belongs to Z_1 and P_2(k) the probability that it belongs to Z_2, with
P_1(k) = Σ_{i=0..k} p_i and P_2(k) = 1 - P_1(k);
the mean gray level of Z_1 is m_1(k) = (1/P_1(k)) · Σ_{i=0..k} i·p_i and the mean gray level of Z_2 is m_2(k) = (1/P_2(k)) · Σ_{i=k+1..L-1} i·p_i; m(k) = Σ_{i=0..k} i·p_i is the cumulative mean up to gray level k, and m_g = Σ_{i=0..L-1} i·p_i is the mean gray level of the whole image.
This formulation is convenient and fast in actual calculation, because m_g is computed only once and, for every candidate k, only the two quantities P_1(k) and m(k) need to be calculated. Since the threshold k lies in the range [0, L-1], a simple loop over all candidates yields the k that maximizes σ_B²(k), which is the optimal threshold. If the maximizing threshold k is not unique, the optimal threshold can be taken as the average of the maximizing values.
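A minimal NumPy search for the optimal threshold k, written directly from the quantities above (illustrative names; cv2.threshold with the THRESH_OTSU flag computes an equivalent result):

```python
import numpy as np

def otsu_threshold(gray, L=256):
    """Return the k in [0, L-1] that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=L).astype(float)
    p = hist / hist.sum()                        # gray-level probabilities p_i
    P1 = np.cumsum(p)                            # P_1(k)
    m = np.cumsum(np.arange(L) * p)              # cumulative mean m(k)
    mg = m[-1]                                   # global mean gray level m_g
    num = (mg * P1 - m) ** 2
    denom = P1 * (1.0 - P1)
    sigma_b2 = np.zeros(L)
    valid = denom > 0
    sigma_b2[valid] = num[valid] / denom[valid]  # between-class variance for each k
    return int(np.argmax(sigma_b2))

gray = (np.random.rand(100, 100) * 255).astype(np.uint8)
k = otsu_threshold(gray)
binary = (gray > k).astype(np.uint8)             # g(u,v) = 1 if I(u,v) > k else 0
print(k, int(binary.sum()))
```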
Specifically, when the region-growing segmentation function processes an image, the region-growing steps are:
1. scan the image in order and find the first pixel that has not yet been assigned to a region; denote it (x0,y0);
2. taking (x0,y0) as the center, consider its 4-neighborhood pixels (x,y): if (x,y) satisfies the growth criterion with respect to (x0,y0), merge (x,y) into the same region as (x0,y0) and push (x,y) onto a stack;
3. pop a pixel from the stack, treat it as the new (x0,y0), and return to step 2;
4. when the stack is empty, return to step 1;
5. repeat steps 1-4 until every point in the image has been assigned; the growing is then finished.
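A small Python sketch of this stack-based 4-neighborhood region growing; the growth criterion used here (absolute gray difference within a fixed tolerance) is an assumption for illustration, since the patent leaves the criterion open.

```python
import numpy as np

def region_growing(gray, tol=10):
    """Label 4-connected regions whose members differ from their neighbor by at most `tol`."""
    h, w = gray.shape
    labels = np.zeros((h, w), np.int32)            # 0 = not yet assigned
    current = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx]:
                continue                           # step 1: find an unassigned pixel
            current += 1
            labels[sy, sx] = current
            stack = [(sy, sx)]
            while stack:                           # steps 2-4
                y0, x0 = stack.pop()
                for y, x in ((y0 - 1, x0), (y0 + 1, x0), (y0, x0 - 1), (y0, x0 + 1)):
                    if 0 <= y < h and 0 <= x < w and not labels[y, x] \
                            and abs(int(gray[y, x]) - int(gray[y0, x0])) <= tol:
                        labels[y, x] = current     # growth criterion satisfied: merge
                        stack.append((y, x))
    return labels

gray = (np.random.rand(40, 40) * 255).astype(np.uint8)
print(region_growing(gray).max())
```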
Specifically, when the watershed algorithm function processes the image, the whole process is as follows:
1) All pixels in the gradient image are classified according to gray values, and a geodesic distance threshold is set.
2) And finding out the pixel points with the minimum gray value (the default mark is the lowest gray value), and increasing the threshold from the minimum value, wherein the points are the starting points.
3) In the process of increasing the horizontal plane, the horizontal plane touches surrounding neighborhood pixels, the geodesic distance from the pixels to a starting point (the lowest point of the gray value) is measured, if the geodesic distance is smaller than a set threshold value, the pixels are submerged, otherwise, a dam is arranged on the pixels, and the neighborhood pixels are classified.
4) As the level increases, more and higher dams are placed, all meeting on the watershed lines up to the maximum of the gray values, and these dams partition the entire image pixels.
When the above algorithm performs the watershed operation on an image, interference from noise points or other factors may produce many dense, small regions, i.e. the image becomes over-segmented, because the image contains a very large number of local minima, each of which can form a small region of its own.
When excessive segmentation occurs, the solution is as follows:
1) The image is gaussian smoothed to erase many small minima and these small partitions are merged.
2) Instead of growing from the minimum, one can take the relatively high gray value pixels as the starting point (requiring manual marking by the user), and flood from the mark, many small regions will be merged into one region.
Specifically, the image sharpening submodule comprises a gradient sharpening function, a Roberts operator, a Sobel operator and a Laplace operator; where the gradient represents a rate of change in the gray-scale value, the gradients of the image at the (x, y) point in the x-direction and the y-direction are as follows:
∂f/∂x = f(x+1,y) - f(x,y)
∂f/∂y = f(x,y+1) - f(x,y)
As can be seen from the above expressions, the gradient of the image corresponds to the difference between two adjacent pixels.
The gradients in the x-direction and the y-direction may be combined into the composite gradient of the image:
|∇f(x,y)| = sqrt( (∂f/∂x)² + (∂f/∂y)² )
specifically, the templates for the Roberts operator are as follows:
Gx = [[1, 0], [0, -1]],  Gy = [[0, 1], [-1, 0]]
specifically, the template of the Sobel operator is as follows:
Gx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],  Gy = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
specifically, the basic flow of the laplacian includes:
1) Judging the gray value of the central pixel of the image and the gray values of other pixels around the central pixel, and if the gray value of the central pixel is higher, improving the gray value of the central pixel; otherwise, the gray scale of the central pixel is reduced, so that the image sharpening operation is realized;
2) In the algorithm implementation process, the Laplacian operator calculates the gradient in four directions or eight directions of the central pixel of the neighborhood, and then adds the gradients together to judge the relation between the gray level of the central pixel and the gray levels of other pixels in the neighborhood;
3) And finally, adjusting the gray level of the pixel according to the result of the gradient operation.
The Laplacian operator is divided into a four-neighborhood region and an eight-neighborhood region, wherein the four-neighborhood region is used for solving the gradient of four directions of a neighborhood central pixel, and the eight-neighborhood region is used for solving the gradient of eight directions.
The Laplacian operator four-neighborhood template is as follows:
[[0, 1, 0], [1, -4, 1], [0, 1, 0]]
the Laplacian operator eight-neighborhood template is as follows:
[[1, 1, 1], [1, -8, 1], [1, 1, 1]]
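As an illustration of these operators, the sketch below computes a Sobel gradient magnitude and a Laplacian response with OpenCV and also applies the four-neighborhood template by hand with cv2.filter2D; the kernel sizes are assumed values.

```python
import cv2
import numpy as np

gray = (np.random.rand(200, 200) * 255).astype(np.uint8)

gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)        # Sobel gradient in x
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)        # Sobel gradient in y
grad = np.sqrt(gx ** 2 + gy ** 2)                      # composite gradient magnitude

lap = cv2.Laplacian(gray, cv2.CV_64F)                  # built-in Laplacian (4-neighborhood)

kernel4 = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], np.float64)
lap_manual = cv2.filter2D(gray.astype(np.float64), -1, kernel4)  # same template applied by hand

print(float(grad.max()), float(np.abs(lap - lap_manual).max()))
```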
specifically, the image geometric transformation submodule includes an image translation function, an image rotation function, an image mirroring function, and an image scaling function, which are all in the prior art, and therefore, details are not described herein.
Specifically, the image arithmetic transformation submodule includes an addition and subtraction function. If image x1 and image x2 are added and the output image is x, then
x(i,j) = x1(i,j) + x2(i,j);
if the two images are subtracted, then
x(i,j) = x1(i,j) - x2(i,j).
Specifically, the image logic operation submodule mainly performs an operation between two or more images on a pixel-by-pixel basis for a binary image. Common logic operations include and, or, not and xor, etc.
(a) AND operation: defined as the portion common to image A and image B. The binary images are converted to 0/1 values; the pixels of the two images are traversed; corresponding pixels are ANDed and the result is kept.
(b) Or operation: the part defined as an a picture plus a B picture is a set composed of AB pictures together. Similar to the above, but the pixel points are ored and the result is retained.
(c) Non-operation: defined as the area of the image area excluding the a image. The specific operation is to invert the image pixel, i.e. the pixel point value is 1 to 0, and 0 to 1.
(d) And (3) XOR operation: defined as the A picture plus B picture portion, and then the overlap portion is removed. Specifically, the two pixel values are set to 0 if they are the same, and set to 1 if they are different.
Specifically, the morphology submodule is mainly used to extract from the image the components that are meaningful for expressing and describing the shape of a region, so that subsequent recognition can grasp the most essential (most distinguishing) shape features of the target object, such as boundaries and connected regions. The basic morphological operations on binary images, including dilation, erosion, opening and closing, are prior art and are therefore not described in detail here.
Specifically, the frequency domain analysis sub-module comprises a Fourier transform spectrogram function, a high/low pass filtering function and a homomorphic filtering function; in different fields of research, fourier transforms have many different variant forms, such as continuous fourier transforms and discrete fourier transforms.
For a two-dimensional image f(x,y) with M rows and N columns, a one-dimensional discrete Fourier transform of length N is first performed along the row variable y, and then a one-dimensional discrete Fourier transform of length M is performed on the variable x in the column direction, giving the Fourier transform of the image:
F(u,v) = Σ_{x=0..M-1} Σ_{y=0..N-1} f(x,y) · exp(-j2π(ux/M + vy/N))
The above formula is decomposed into two parts: F(x,v) is obtained first, and then F(u,v) is obtained from F(x,v):
F(x,v) = Σ_{y=0..N-1} f(x,y) · exp(-j2πvy/N)
F(u,v) = Σ_{x=0..M-1} F(x,v) · exp(-j2πux/M)
and the high-pass filter is: passing the high-frequency information and filtering the low-frequency information; the low-pass filtering is the opposite.
The ideal low-pass filter template is:
H(u,v) = 1 if D(u,v) ≤ D0, and H(u,v) = 0 if D(u,v) > D0
where D0 represents the passband radius and D(u,v) is the (Euclidean) distance to the center of the spectrum, calculated as follows:
D(u,v) = sqrt( (u - M/2)² + (v - N/2)² )
where M and N represent the size of the spectrum image and (M/2, N/2) is the center of the spectrum.
An ideal high-pass filter template is obtained by subtracting the low-pass filter template from 1.
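A short NumPy sketch of ideal low-pass and high-pass filtering in the frequency domain (the cutoff radius D0 = 30 is an assumed value):

```python
import numpy as np

def ideal_filter(gray, d0=30, highpass=False):
    """Filter a grayscale image with an ideal low-pass (or high-pass) template of radius d0."""
    M, N = gray.shape
    F = np.fft.fftshift(np.fft.fft2(gray))                  # centered spectrum
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)        # distance to spectrum center
    H = (D <= d0).astype(float)                             # ideal low-pass template
    if highpass:
        H = 1.0 - H                                         # ideal high-pass = 1 - low-pass
    out = np.fft.ifft2(np.fft.ifftshift(F * H))
    return np.real(out)

gray = (np.random.rand(128, 128) * 255).astype(np.float64)
print(ideal_filter(gray).shape, ideal_filter(gray, highpass=True).shape)
```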
When homomorphic filtering is applied, an image can be regarded as the product of two components:
f(x,y) = f_i(x,y) · f_r(x,y)
where f_i is the illumination component, which varies with spatial position, changes slowly and is concentrated in the low-frequency part of the image, and f_r is the reflectance component of the scene as seen by the eye, which carries the detailed scene information and is rich in high-frequency components.
The homomorphic filtering process is divided into the following 5 basic steps:
1) Take the logarithm of the original image to obtain two additive components;
2) Apply the Fourier transform to the logarithmic image to obtain the corresponding frequency-domain representation:
DFT[ln f(x,y)] = DFT[ln f_i(x,y)] + DFT[ln f_r(x,y)]
3) Design a frequency-domain filter H(u,v) and filter the logarithmic image in the frequency domain;
4) Apply the inverse Fourier transform to return to the spatial-domain logarithmic image;
5) Take the exponential to obtain the spatial filtering result.
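A minimal sketch of these five steps; the Gaussian high-emphasis filter H(u,v) and its gains are assumptions made here for illustration, since the patent leaves the filter design open.

```python
import numpy as np

def homomorphic(gray, gamma_l=0.5, gamma_h=1.5, c=1.0, d0=30.0):
    """ln -> FFT -> high-emphasis filter -> inverse FFT -> exp, as in steps 1)-5)."""
    M, N = gray.shape
    log_img = np.log1p(gray.astype(np.float64))             # step 1: logarithm
    F = np.fft.fftshift(np.fft.fft2(log_img))               # step 2: Fourier transform
    u, v = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    D2 = (u - M / 2) ** 2 + (v - N / 2) ** 2
    H = (gamma_h - gamma_l) * (1 - np.exp(-c * D2 / d0 ** 2)) + gamma_l  # step 3: filter
    out = np.fft.ifft2(np.fft.ifftshift(F * H))             # step 4: inverse transform
    return np.expm1(np.real(out))                           # step 5: exponential

gray = (np.random.rand(128, 128) * 255).astype(np.float64)
print(homomorphic(gray).shape)
```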
In some embodiments, the video processing module comprises a video anti-shake sub-module, a video enhancement sub-module, a video super-resolution sub-module, a background modeling sub-module, and a target detection sub-module; the video anti-shake sub-module, the video enhancement sub-module, the video super-resolution sub-module and the target detection sub-module are in the prior art, and are not described herein again. The background modeling submodule comprises a single Gaussian background modeling function, a mixed Gaussian background modeling function, a principal component background modeling function and a compressed sensing background modeling function.
Specifically, the single Gaussian background modeling function applies a single Gaussian background model, whose principle is shown in fig. 17: the change over time of the pixel value of each pixel in the image is regarded as a random variable, and the probability of the pixel value at that point satisfies a Gaussian distribution. The probability density function of the target pixel is:
P(I(x,y,t)) = (1/(sqrt(2π)·σ_t)) · exp(-(I(x,y,t) - μ_t)² / (2σ_t²))
where (x,y) is the coordinate of the target pixel, I(x,y,t) is the pixel value of the target pixel at time t, μ_t is the mean of the target pixel (x,y) at time t, and σ_t is the standard deviation of the target pixel (x,y) at time t.
The single Gaussian background modeling method comprises the following specific steps:
Step 1: initialize the background model. The first N frames of the target video are selected, and the mean of their pixel values is taken as the mean of the model; the initial standard deviation is chosen in the range 20-30, and in this embodiment it is 25. The initialization is expressed as:
μ_0(x,y) = (1/N) · Σ_{t=1..N} I(x,y,t)
σ_0(x,y) = 25
Step 2: detect moving targets. Once the initialization of the background model is finished, the moving-target detection stage begins, and foreground detection is performed using the Gaussian distribution principle. The detection criteria for background and foreground are defined as:
|I(x,y,t) - μ_{t-1}(x,y)| < λσ_{t-1}  (background)
|I(x,y,t) - μ_{t-1}(x,y)| ≥ λσ_{t-1}  (foreground)
where I(x,y,t) is the pixel value of pixel (x,y) at time t, μ_{t-1} is the mean of the background model at time t-1, σ_{t-1} is the standard deviation of the background model at time t-1, and λ is the foreground decision coefficient, whose value lies between 2.5 and 3.0.
Step 3: update the background model. The background also changes gradually over time, and in order to detect moving targets accurately in real time, the background model must be updated accordingly. The general update principle is: when the current pixel is detected as foreground, its background model is kept unchanged; when the current pixel is detected as background, the background model is updated as follows:
μ_t(x,y) = (1-α)·μ_{t-1}(x,y) + α·I(x,y,t)
σ_t²(x,y) = (1-α)·σ_{t-1}²(x,y) + α·(I(x,y,t) - μ_t(x,y))²
where α is the background learning rate and 0 < α < 1.
According to the background model updating principle, only the pixel points judged as the background are updated, and the background model of the pixel points judged as the foreground is kept unchanged. The single-Gaussian background model is suitable for target detection in a single-modal scene.
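A compact NumPy sketch of this per-pixel single Gaussian model for grayscale frames (N, λ, α and the frame source are assumed values):

```python
import numpy as np

class SingleGaussianBG:
    """Per-pixel single Gaussian background model with the update rule described above."""
    def __init__(self, first_frames, sigma0=25.0, lam=2.5, alpha=0.05):
        self.mu = first_frames.mean(axis=0)                  # step 1: mean of first N frames
        self.sigma = np.full_like(self.mu, sigma0)           # initial standard deviation
        self.lam, self.alpha = lam, alpha

    def apply(self, frame):
        frame = frame.astype(np.float64)
        fg = np.abs(frame - self.mu) >= self.lam * self.sigma    # step 2: foreground mask
        bg = ~fg                                                 # step 3: update background only
        self.mu[bg] = (1 - self.alpha) * self.mu[bg] + self.alpha * frame[bg]
        var = (1 - self.alpha) * self.sigma[bg] ** 2 + self.alpha * (frame[bg] - self.mu[bg]) ** 2
        self.sigma[bg] = np.sqrt(var)
        return fg.astype(np.uint8) * 255

frames = (np.random.rand(10, 120, 160) * 255).astype(np.uint8)   # stand-in video frames
model = SingleGaussianBG(frames[:5].astype(np.float64))
mask = model.apply(frames[5])
print(mask.shape, int(mask.max()))
```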
Specifically, the Gaussian mixture background modeling function applies a Gaussian mixture background modeling algorithm, in which several single Gaussian background models are established for each pixel during background modeling.
When a three-channel video is processed, the r, g and b color channels of each image pixel are assumed to be mutually independent and to have the same variance. Suppose the observed data set of a pixel X at a certain position in the image is {X_1, X_2, ..., X_t}, where X_t = {r_t, g_t, b_t} is the sample of that pixel at time t, with r_t, g_t and b_t the sampled values of its three color channels. Then the mixture-of-Gaussians probability density function obeyed by a single sample point X_t is:
P(X_t) = Σ_{i=1..k} w_{i,t} · η(X_t, μ_{i,t}, Σ_{i,t})
η(X_t, μ_{i,t}, Σ_{i,t}) = (1/((2π)^(n/2)·|Σ_{i,t}|^(1/2))) · exp(-(1/2)·(X_t - μ_{i,t})^T · Σ_{i,t}^(-1) · (X_t - μ_{i,t}))
Σ_{i,t} = σ_{i,t}² · I
where, at time t, w_{i,t} is the weight of the i-th Gaussian distribution, η(X_t, μ_{i,t}, Σ_{i,t}) is the i-th Gaussian probability density, μ_{i,t} is its mean, Σ_{i,t} is its covariance matrix, σ_{i,t}² is the variance, I is the three-dimensional identity matrix, k is the number of Gaussian distributions, and n is the number of pixel channels of the image (n = 1 for a single-channel image). The weights of the Gaussian mixture background model sum to 1, i.e.
Σ_{i=1..k} w_{i,t} = 1.
The Gaussian mixture background modeling algorithm comprises the following specific steps:
Step 1: input a video sequence.
Step 2: initialize the model. The pixel information of the first N frames of the video sequence is acquired to construct the model, which is initialized as:
μ_{i,0}(x,y) = (1/N) · Σ_{t=1..N} X_t
σ_{i,0}²(x,y) = (1/N) · Σ_{t=1..N} (X_t - μ_{i,0})^T (X_t - μ_{i,0})
Foreground detection: after the model initialization is completed, at time t each new pixel sample X_t is compared with the current k Gaussian models one by one as follows:
|X_t - μ_{i,t-1}| ≤ 2.5·σ_{i,t-1},  i = 1,2,...,k
If the relationship between the new pixel sample and any one of the Gaussian models satisfies the above formula, the sample pixel is considered matched and the corresponding background model is updated; if none of the Gaussian models matches the new sample pixel, the mean of the model with the smallest weight among the k Gaussian models is replaced by the pixel value of the sample to obtain a new background model, which keeps the weight of the replaced Gaussian model.
Among the k Gaussian models, those with large weight and small variance describe the background, while those with large variance describe the foreground, because the background occupies a large proportion of the field of view in the video and the foreground a small proportion. The models of each pixel are therefore sorted in descending order of
w_{i,t} / σ_{i,t}
so that models with large weight and small standard deviation come first. If the weights of the first M sorted Gaussian models satisfy the condition below, the first M Gaussian models of the pixel are considered to describe the background and the remaining Gaussian models describe foreground moving objects:
M = argmin_b ( Σ_{i=1..b} w_{i,t} > T )
If the value of T is too small, M may become 1 and the model degenerates into a single Gaussian model; if T is too large, the model may over-describe the background, so that part of the foreground is detected as background and missed detections occur.
Step 3: update the model. As can be seen from the previous step, when a new pixel sample X_t satisfies the matching condition it is considered a background point of the current pixel, and the background model of the current pixel needs to be updated. The weight update formula is:
w_{i,t} = (1-α)·w_{i,t-1} + α·P_{i,t}
where P_{i,t} = 1 for the matched model and P_{i,t} = 0 for unmatched models; the weights of the models are then normalized. α is the update rate of the weights, typically 0 < α < 1.
After the weights are updated, the model parameters are updated as follows:
ρ = α / w_{i,t}
μ_{i,t} = (1-ρ)·μ_{i,t-1} + ρ·X_t
σ_{i,t}² = (1-ρ)·σ_{i,t-1}² + ρ·(X_t - μ_{i,t})^T (X_t - μ_{i,t})
where ρ is the parameter update rate.
By establishing several models for each pixel, Gaussian mixture background modeling avoids the instability of single-model detection and improves the adaptability of the background model to dynamic backgrounds, such as swaying leaves and rippling water, whose small movements are not real moving targets.
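In practice this kind of mixture-of-Gaussians background subtraction is available in OpenCV; the sketch below (history length, variance threshold and the frame source are assumed values) is close in spirit to the algorithm above, although OpenCV's internal parameterization differs in detail.

```python
import cv2
import numpy as np

# OpenCV's MOG2 background subtractor maintains a Gaussian mixture per pixel.
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=16,
                                                detectShadows=False)

frames = (np.random.rand(30, 120, 160, 3) * 255).astype(np.uint8)  # stand-in video
for frame in frames:
    fg_mask = subtractor.apply(frame)        # 255 = foreground, 0 = background

print(fg_mask.shape, int(fg_mask.max()))
```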
Specifically, in the principal component background modeling function, the input data are typically an image sequence captured by a fixed camera. Each frame is rearranged into a column vector, and all the image column vectors are arranged in time order to obtain a matrix D. The matrix D consists of two parts: the relatively stable background of the scene, corresponding to a low-rank matrix A, and the moving objects, corresponding to another matrix E. That is:
D = A + E
where the matrix D is known, the matrices A and E are unknown, and A is a low-rank matrix.
When all elements of the matrix E obey independent and identically distributed Gaussian distributions, the optimal matrix A is obtained by solving the following optimization problem:
min_A ||D - A||_F  s.t.  rank(A) ≤ r
and its optimal solution is obtained by singular value decomposition of the matrix D. However, the assumption that all elements of E obey a Gaussian distribution does not necessarily hold in background modeling. Therefore another assumption, more consistent with the actual application scenario, is adopted instead: the matrix E is a sparse matrix with few non-zero entries. The problem of solving the low-rank matrix A then becomes a two-objective optimization problem:
min_{A,E} (rank(A), ||E||_0)  s.t.  D = A + E
Introducing a trade-off constant λ > 0 converts the two-objective optimization problem into a single-objective one:
min_{A,E} rank(A) + λ·||E||_0  s.t.  D = A + E
Relaxing the objective function of this single-objective problem (since the nuclear norm of a matrix is the convex envelope of its rank, and the (1,1) norm of a matrix is the convex hull of its 0 norm) gives the following optimization problem:
min_{A,E} ||A||_* + λ·||E||_{1,1}  s.t.  D = A + E
The above optimization problem can be solved with an iterative thresholding algorithm or the augmented Lagrange multiplier method, which are prior art and are therefore not described here in detail.
In order to achieve automatic division of the video background, it is necessary to automatically divide a video into several parts each satisfying a low rank assumption. One frame image is selected as a reference frame, then 2 distance functions are defined, the distance between two frame images is measured through weighted average of the two frame images, when the distance between a certain frame image and the reference frame is less than a certain threshold value, the frame and the reference frame are considered to belong to one part, and if the distance is greater than the threshold value, the frame and the reference frame belong to the other part.
The background of the same partial image satisfying the low rank assumption is largely the same and the foreground may be different, which are different from both the background and the foreground of the other partial image, so that the difference between the image sequences satisfying the low rank assumption is relatively small, and the difference of subtraction between the frames after normalization is used to measure the inter-frame distance fd between two frames fa, fb in the n-frame image:
Figure BDA0003847546310000171
according to the practical situation of video background modeling, the sudden change of background divides the video into 2 parts in time, each part is usually continuous in time, so the time difference T after normalization is used d To measure the temporal distance between two frames:
Figure BDA0003847546310000172
the inter-frame distance D and the inter-frame distance thereof of the two images are multiplied by a coefficient, respectively, and then added to obtain the distance between the two images, as shown below. When the distance is smaller than a threshold Th, the two images are judged to belong to the same part, and if the distance is larger than the threshold, the two images belong to different parts. Since the two coefficients and the threshold are relative, the coefficient for the inter-frame distance is 1, the coefficient for the temporal distance is λ, and the threshold is Th.
D(f_a, f_b) = f_d(f_a, f_b) + λ·T_d(f_a, f_b)
D(f_a, f_b) ≤ Th
Several videos that satisfy the local low-rank assumption and can be divided into two parts are then found and divided manually; the values of λ and Th are adjusted until the automatic division agrees well with the manual division. In this way suitable values of λ and Th are found.
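A minimal sketch of this partitioning step is given below, assuming grayscale frames normalized to [0, 1]. Using the mean absolute pixel difference for f_d and |a - b|/n for T_d is an assumption, since the two distance functions are given above only as formula images; λ and Th would be tuned against manually divided videos as described.

```python
import numpy as np

def partition_frames(frames, lam=0.5, th=0.3):
    """Split an n-frame video into parts that each satisfy the local low-rank
    assumption, using D(fa, fb) = f_d(fa, fb) + lam * T_d(fa, fb) <= th.
    `frames` is a list of equally sized grayscale images with values in [0, 1]."""
    n = len(frames)
    segments = []
    ref_idx = 0                     # first frame of the current part is its reference frame
    current = [0]
    for b in range(1, n):
        f_d = np.mean(np.abs(frames[b] - frames[ref_idx]))   # normalised pixel difference
        t_d = abs(b - ref_idx) / n                           # normalised time difference
        dist = f_d + lam * t_d
        if dist <= th:
            current.append(b)        # same part as the reference frame
        else:
            segments.append(current) # close the current part and start a new one
            ref_idx = b
            current = [b]
    segments.append(current)
    return segments
```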
In some embodiments, the compressed-sensing background modeling function applies a background modeling algorithm based on compressed sensing and background difference, so that the foreground moving target can be detected accurately and robustly while the amount of computation and the time complexity of the algorithm are reduced. The background difference method requires that the established background model be sufficiently robust, and a suitable binarization threshold must be set in later processing so that the foreground target can be detected accurately. The accuracy of the initial background image affects how the background image is updated later, so it is necessary to obtain a highly accurate initial background image. This embodiment therefore provides an improved initial background modeling method, whose idea is as follows: first, the average of several frames is obtained by the mean method; then the average difference of these frames is calculated and used to remove pixel values with large changes; finally, the frames with the large-change pixel values removed are averaged again, and this average is taken as the pixel value of the initial background image.
Specifically, the improved initial background modeling method includes the following specific modeling steps:
Step 1: Calculate the average Mean(x, y) of the N frames of images used to obtain the initial background image; the calculation formula is as follows:
Mean(x, y) = (1/N)·Σ_{i=1}^{N} I_i(x, y)
Step 2: Calculate the average difference MD(x, y) of the N frames of images; the calculation formula is as follows:
D(x, y) = Σ_{i=1}^{N} |I_i(x, y) - Mean(x, y)|
MD(x, y) = D(x, y) / N
Step 3: Replace the pixel values with large changes in each frame image with the corresponding pixel values of the next frame image; that is, the pixel value of the next frame is used to replace the pixel values with large changes in the current frame:
if |I_i(x, y) - Mean(x, y)| > μ·MD(x, y), then I_i(x, y) = I_{i+1}(x, y)
where i = 0, 1, 2, ..., N, and μ = 2.5 (an empirical value).
Step 4: Calculate the average value of the substituted N frames of images and take it as the pixel value of the initial background image:
B_0(x, y) = (1/N)·Σ_{i=1}^{N} I_i(x, y)   (the sum is taken over the substituted frames)
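A compact sketch of the four modeling steps above, assuming the frames are supplied as a NumPy array of shape (N, H, W). Treating the average difference as the mean absolute deviation from Mean(x, y), and leaving the last frame unreplaced because it has no successor, are assumptions made for this illustration.

```python
import numpy as np

def initial_background(frames, mu=2.5):
    """Improved initial background modeling sketch:
    1) mean of the N frames, 2) average difference MD,
    3) replace strongly changing pixels with the next frame's values,
    4) average the corrected frames again to obtain the initial background."""
    frames = frames.astype(np.float64)
    N = frames.shape[0]
    mean = frames.mean(axis=0)                       # Step 1: Mean(x, y)
    md = np.abs(frames - mean).sum(axis=0) / N       # Step 2: MD(x, y) = D(x, y) / N
    for i in range(N - 1):                           # Step 3: replace large-change pixels
        mask = np.abs(frames[i] - mean) > mu * md    #         with the next frame's pixels
        frames[i][mask] = frames[i + 1][mask]
    return frames.mean(axis=0)                       # Step 4: initial background pixel values
```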
After the initial background image is obtained, in order to reduce the amount of image data to be transmitted, the initial background image and the currently input frame image are first given sparse representations using compressed sensing, and a measurement matrix is then used to obtain their compressed measurements. The process is as follows:
step 1: and carrying out sparse representation on the obtained initial background image and the currently input frame image:
B_n = Ψ·θ_bn
I_n = Ψ·θ_tn
where Ψ denotes the sparse basis, θ_bn is the sparse coefficient vector of the background image, and θ_tn is that of the currently input frame image.
step 2: measuring the obtained sparse coefficient to obtain a compression measurement value of the image:
y_bn = Φ·θ_bn
y_tn = Φ·θ_tn
where n denotes the time instant, n = 0, 1, 2, 3, ..., and Φ is the measurement matrix.
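The sparse basis Ψ and the measurement matrix Φ are not specified above; the sketch below assumes a 2-D DCT as the sparsifying transform and a random Gaussian measurement matrix, which are common choices in compressed sensing and are used here only for illustration.

```python
import numpy as np
from scipy.fftpack import dct

def sparse_coefficients(image):
    """2-D DCT as the sparsifying transform (one possible choice of Psi)."""
    return dct(dct(image, norm='ortho', axis=0), norm='ortho', axis=1)

def compressed_measurement(theta, phi):
    """Compressed measurement y = Phi * theta of the flattened coefficients."""
    return phi @ theta.ravel()

# Example: measure the initial background image B0 and the current frame I_n.
H, W = 64, 64
M = 512                                              # number of measurements, M << H*W
rng = np.random.default_rng(0)
phi = rng.standard_normal((M, H * W)) / np.sqrt(M)   # random Gaussian measurement matrix

B0 = rng.random((H, W))                              # stand-ins for real images
I_n = rng.random((H, W))
y_b = compressed_measurement(sparse_coefficients(B0), phi)
y_t = compressed_measurement(sparse_coefficients(I_n), phi)
```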
Since the conventional background image updating strategy is designed for the pixel values of the image, while after applying compressed sensing only a small number of compressed measurements, rather than pixel values, are available, the updating strategy is instead designed for the compressed measurements of the image. Because the difference image can be reconstructed from the difference of the compressed measurements of the current frame image and the background image, the background can likewise be updated using the compressed measurements of the current frame image and the background image. Suppose I_n and B_n denote the current input frame image and the background image, respectively, y_tn is the compressed measurement of I_n, and y_bn is the compressed measurement of B_n. The compressed measurement y_{bn+1} of the background B_{n+1} at time n+1 is then calculated by the following formula:
y_{bn+1}(i) = α·y_{tn+1}(i) + (1 - α)·y_{bn}(i),   if |y_{tn+1}(i) - y_{tn}(i)| ≤ T_{n+1}(i)
y_{bn+1}(i) = y_{bn}(i),   if |y_{tn+1}(i) - y_{tn}(i)| > T_{n+1}(i)
(i = 1, 2, ..., M)
where α denotes the update speed of the model and is a constant satisfying 0 < α < 1, whose reciprocal represents the time constant of the decay process and is generally an empirical value; M denotes the number of compressed measurements; i denotes the position of the corresponding compressed measurement; and y_b0 is the compressed measurement of the original background image.
At position i, if the following condition is satisfied, there is a moving object at this position.
|y_{tn+1}(i) - y_{tn}(i)| > T_{n+1}(i)
where T_{n+1}(i) is a threshold updated in real time; its update strategy is expressed as follows:
[Formula image: real-time update rule for the threshold T_{n+1}(i), controlled by the constant a]
where a is a constant close to 1; the magnitude of the threshold T_{n+1}(i) can be adjusted by changing the value of a.
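A hedged sketch of the measurement-domain update is shown below. The selective running average for y_{bn+1} follows the symbols described above, while the threshold update is only one plausible reading of the rule given as a figure; the constants alpha, a and c are assumed empirical values, not values taken from this application.

```python
import numpy as np

def update_background_measurement(y_b, y_t_prev, y_t_new, T, alpha=0.05, a=0.95, c=5.0):
    """Selective update of the background's compressed measurement y_b and of the
    per-position threshold T.  Positions where |y_t_new - y_t_prev| > T are treated
    as containing a moving object and are left unchanged; elsewhere a running
    average with rate alpha is applied."""
    moving = np.abs(y_t_new - y_t_prev) > T
    y_b_new = np.where(moving, y_b, alpha * y_t_new + (1.0 - alpha) * y_b)
    # Threshold sketch: keep T where motion was detected, otherwise let it track
    # the local measurement difference (an assumed form of the real-time rule).
    T_new = np.where(moving, T, a * T + (1.0 - a) * c * np.abs(y_t_new - y_b_new))
    return y_b_new, T_new
```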
After the video image is obtained through the camera, it is transmitted frame by frame into the target detection experiment software system, which then automatically updates the background image according to the motion of the target, so that the influence of changes in the external environment is reduced to a minimum.
The difference between the measured values of the current input image and the background image is defined as:
r_{tn} = ||y_{tn}(i) - y_{bn}(i)||_2
The difference value is judged against a threshold, the corresponding measurement in the difference image is classified as foreground or background, and the decision objective function is set as follows:
f(i) = foreground, if r_{tn} ≥ T;   f(i) = background, if r_{tn} < T
The basis of the judgment is as follows: if the difference of the measurements of the current input image and the background image is smaller than the set threshold, the measurement belongs to the background; otherwise it belongs to the foreground target. In general, the setting of the decision threshold is critical, because the foreground moving object can be segmented accurately only if a suitable decision threshold is set. If the threshold is too large, the foreground is easily judged as background and target detection cannot be performed effectively. If the threshold is too small, the algorithm becomes sensitive to interference in the image, and small background changes in the scene may be falsely detected as foreground targets. The threshold is generally set in one of two ways: according to an empirical value, or adaptively learned within the algorithm. A fixed, empirically set threshold is simple but not flexible, and in some scenarios it may reduce the robustness of the detection algorithm. After the difference result is obtained with the set threshold, further processing is needed to obtain an accurate and complete background; for example, under illumination changes the shadow of the moving target may be falsely detected as a foreground target, so shadow interference must be removed, and a small amount of noise produced before and after the difference can be handled by a denoising method to obtain a complete background.
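As a small illustration of the decision step, the following sketch compares r_tn with a fixed, empirically chosen threshold; adaptive threshold learning, shadow removal and denoising mentioned above would be applied on top of this basic test.

```python
import numpy as np

def is_foreground(y_t, y_b, threshold):
    """Basic decision step on compressed measurements: the difference
    r = ||y_t - y_b||_2 marks a foreground target when it reaches the threshold."""
    r = np.linalg.norm(y_t - y_b)
    return r >= threshold
```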
As shown in fig. 19-20, in some embodiments, the present application further discloses a method for unmanned boat sensing system detection, comprising
Obtaining an initial background image according to an initial video image captured by a sensor (when the obtained initial video image does not meet the requirements, the image processing module can be used to process it; for example, if the recognition capability of the system needs to be verified, the noise simulation submodule is used to add noise to the image, or when the image is not clear, other modules are used to make it clear);
Comparing a current input frame image captured by a sensor with an initial background image to obtain a moving target or update the initial background image;
the method for acquiring the initial background image comprises the following steps:
obtaining an average Mean (x, y) and an average difference MD (x, y) of N frames of images in the initial video image;
replacing the pixel values with large changes in the N frames of images with the pixel values of the next frame: if |I_i(x, y) - Mean(x, y)| > μ·MD(x, y), then I_i(x, y) = I_{i+1}(x, y);
calculating the average value of the substituted N frames of images, wherein the average value is the pixel value of the initial background image;
where i = 0, 1, 2, ..., N; μ = 2.5.
Specifically, the calculation formula of the average Mean (x, y) is as follows:
Mean(x, y) = (1/N)·Σ_{i=1}^{N} I_i(x, y)
where I_i(x, y) is the pixel value of the i-th frame image at position (x, y).
Specifically, the calculation formula of the average difference MD (x, y) is as follows:
D(x, y) = Σ_{i=1}^{N} |I_i(x, y) - Mean(x, y)|
MD(x, y) = D(x, y) / N.
Specifically, the average value of the substituted N frames of images is calculated and taken as the pixel value of the initial background image; its calculation formula is as follows:
B_0(x, y) = (1/N)·Σ_{i=1}^{N} I_i(x, y)   (the sum is taken over the substituted frames)
specifically, the method for comparing the current input frame image captured by the sensor with the initial background image to obtain the moving target or update the initial background image comprises the following steps:
sparse representation is carried out on the initial background image and the current input frame image:
B_n = Ψ·θ_bn
I_n = Ψ·θ_tn
where Ψ denotes the sparse basis;
measuring the obtained sparse coefficient to obtain the compression measurement values of the initial background image and the current input frame image:
y_bn = Φ·θ_bn
y_tn = Φ·θ_tn
where n represents the time instant, n = 0, 1, 2, 3, ...
Obtaining a compression measurement value of the background image at the n +1 moment:
y_{bn+1}(i) = α·y_{tn+1}(i) + (1 - α)·y_{bn}(i),   if |y_{tn+1}(i) - y_{tn}(i)| ≤ T_{n+1}(i)
y_{bn+1}(i) = y_{bn}(i),   if |y_{tn+1}(i) - y_{tn}(i)| > T_{n+1}(i)
(i = 1, 2, ..., M)
where I_n denotes the current input frame image, B_n denotes the background image, y_tn is the compressed measurement of the current input frame image I_n, y_bn is the compressed measurement of the background image B_n, and y_{bn+1} is the compressed measurement of the background image B_{n+1} at time n+1; α denotes the update speed of the model and is a constant satisfying 0 < α < 1, whose reciprocal represents the time constant of the decay process and is generally an empirical value; M denotes the number of compressed measurements; i denotes the position of the corresponding compressed measurement; y_b0 is the compressed measurement of the initial background image;
at position i, if the following condition is satisfied, there is a moving object at this position:
|y_{tn+1}(i) - y_{tn}(i)| > T_{n+1}(i)
where T_{n+1}(i) is a threshold updated in real time; its update strategy is expressed as follows:
[Formula image: real-time update rule for the threshold T_{n+1}(i), controlled by the constant a]
where a is a constant close to 1; the threshold T_{n+1}(i) can be adjusted by changing the value of a;
the difference between the measured values of the current input image and the background image is defined as:
r_{tn} = ||y_{tn}(i) - y_{bn}(i)||_2
the difference value is judged against a threshold, the corresponding measurement in the difference image is classified as foreground or background, and the decision objective function is set as follows:
f(i) = foreground, if r_{tn} ≥ T;   f(i) = background, if r_{tn} < T
and if the difference of the measurements of the current input image and the background image is smaller than the set threshold, the measurement is judged to belong to the background; otherwise it is judged to belong to the foreground target.
Any embodiment of the invention can be taken as an independent technical scheme, and can also be combined with other embodiments. All patents and publications cited herein are hereby incorporated by reference to the same extent as if each individual publication was specifically and individually indicated to be incorporated by reference. The invention herein may be practiced in the absence of any element or elements, limitation or limitations, which limitation or limitations is not specifically disclosed herein. The terms and expressions which have been employed herein are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described, but it is recognized that various modifications and changes may be made within the scope of the invention and the claims which follow. It is to be understood that the embodiments described herein are examples and features of some embodiments and that modifications and variations may be made by one of ordinary skill in the art in light of the teachings of this disclosure, and are to be considered within the purview of this disclosure and scope of the appended claims and their equivalents.

Claims (10)

1. An unmanned ship perception experiment platform device is characterized by comprising
The teaching platform structure comprises a server provided with a target detection experiment software system, a display component and an operation component, wherein the display component and the operation component are connected with the server;
the pool environment structure is used for simulating a natural water body environment;
and the data acquisition structure is used for acquiring the natural water body environment information simulated by the pool environment structure and conveying the information to the server.
2. The unmanned boat sensing experiment platform device of claim 1, wherein the pool environment structure comprises a pool body, an environment interference part and a water surface target; the environment interference part is arranged on the four walls of the tank body or the periphery of the tank body, and comprises a wave maker and/or a mist maker; the water surface target may float on the water surface within the tank body.
3. The unmanned boat sensory experiment platform device of claim 1, wherein the data acquisition structure comprises a mounting frame, a vision sensor and/or a distance measuring sensor detachably connected with the mounting frame.
4. The unmanned boat awareness experiment platform device of claim 1, wherein the target detection experiment software system comprises an image processing module and a video processing module; the image processing module comprises a noise simulation submodule, an image preprocessing submodule, an image segmentation submodule, an image smoothing submodule, an image sharpening submodule, an image geometric transformation submodule, an image arithmetic transformation submodule, an image logical operation submodule, a morphology submodule and a frequency domain analysis submodule.
5. The unmanned ship perception experiment platform device of claim 4, wherein the image preprocessing sub-module comprises a color image graying function, a displayed image histogram function, a histogram equalization function, a brightness adjustment function, a contrast adjustment function, a saturation adjustment function, an image pseudo-color function, an image defogging function, an image rain removal function, and a Hough line detection function.
6. The unmanned boat sensing experiment platform device of claim 5, wherein the color image graying function comprises a single component graying method, a maximum graying method, an average graying method and a weighted average graying method.
7. The unmanned ship perception experiment platform device of claim 6, wherein the graying formula of the single component gray scale method is as follows:
f(x1,y1)=R(x1,y1)
or f (x 1, y 1) = G (x 1, y 1)
Or f (x 1, y 1) = B (x 1, y 1)
Where f (x 1, y 1) is a pixel value of the grayed grayscale image at a position (x 1, y 1), and R (x 1, y 1), G (x 1, y 1), and B (x 1, y 1) respectively represent values of the three components.
8. The unmanned ship perception experiment platform device of claim 6, wherein the graying formula of the maximum grayscale method is as follows:
f(x2,y2)=max(R(x2,y2),G(x2,y2),B(x2,y2))
where f (x 2, y 2) is a pixel value of the grayed grayscale image at a position (x 2, y 2), and R (x 2, y 2), G (x 2, y 2), and B (x 2, y 2) respectively represent values of the three components.
9. The unmanned boat sensing experiment platform device of claim 6, wherein the graying formula of the mean value grayscale method is as follows:
f(x3,y3)=(R(x3,y3)+G(x3,y3)+B(x3,y3))/3
where f (x 3, y 3) is a pixel value of the grayed image at a position (x 3, y 3), and R (x 3, y 3), G (x 3, y 3), and B (x 3, y 3) respectively represent values of the three components.
10. The unmanned boat sensing experiment platform device of claim 6, wherein the graying formula of the weighted average grayscale method is as follows:
f(x4,y4)=0.3R(x4,y4)+0.59G(x4,y4)+0.11B(x4,y4)
where f (x 4, y 4) is a pixel value of the grayed grayscale image at a position (x 4, y 4), and R (x 4, y 4), G (x 4, y 4), and B (x 4, y 4) respectively represent values of the three components.
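For reference, the four graying schemes recited in claims 6-10 can be sketched as follows; the (H, W, 3) array in R, G, B order with float pixel values and the choice of the R component for the single-component method are assumptions made for this illustration.

```python
import numpy as np

def to_gray(rgb, method='weighted'):
    """Graying sketches: single component, maximum, average, and weighted average.
    `rgb` is an (H, W, 3) array with the channels in R, G, B order."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    if method == 'single':          # claim 7: use one component, e.g. R
        return r
    if method == 'max':             # claim 8: per-pixel maximum of R, G, B
        return np.max(rgb, axis=-1)
    if method == 'average':         # claim 9: per-pixel mean of R, G, B
        return rgb.mean(axis=-1)
    # claim 10: weighted average with the coefficients given in the claim
    return 0.3 * r + 0.59 * g + 0.11 * b
```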
CN202211123940.0A 2022-09-02 2022-09-15 Unmanned ship perception experiment platform device Pending CN115641767A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211073493 2022-09-02
CN2022110734932 2022-09-02

Publications (1)

Publication Number Publication Date
CN115641767A true CN115641767A (en) 2023-01-24

Family

ID=84697048

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202211097047.5A Pending CN115527103A (en) 2022-09-02 2022-09-08 Unmanned ship perception experiment platform system
CN202211111696.6A Pending CN115527104A (en) 2022-09-02 2022-09-13 Detection method for unmanned ship sensing system
CN202211123940.0A Pending CN115641767A (en) 2022-09-02 2022-09-15 Unmanned ship perception experiment platform device

Family Applications Before (2)

Application Number Title Priority Date Filing Date
CN202211097047.5A Pending CN115527103A (en) 2022-09-02 2022-09-08 Unmanned ship perception experiment platform system
CN202211111696.6A Pending CN115527104A (en) 2022-09-02 2022-09-13 Detection method for unmanned ship sensing system

Country Status (1)

Country Link
CN (3) CN115527103A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116502047B (en) * 2023-05-23 2024-05-07 成都市第四人民医院 Method and system for processing biomedical data
CN116912246B (en) * 2023-09-13 2023-12-29 潍坊医学院 Tumor CT data processing method based on big data

Also Published As

Publication number Publication date
CN115527104A (en) 2022-12-27
CN115527103A (en) 2022-12-27

Similar Documents

Publication Publication Date Title
Zhang et al. A survey of restoration and enhancement for underwater images
CN108615226B (en) Image defogging method based on generation type countermeasure network
CN115641767A (en) Unmanned ship perception experiment platform device
CN111985376A (en) Remote sensing image ship contour extraction method based on deep learning
CN110246151B (en) Underwater robot target tracking method based on deep learning and monocular vision
CN111626993A (en) Image automatic detection counting method and system based on embedded FEFnet network
CN115272876A (en) Remote sensing image ship target detection method based on deep learning
CN112633274A (en) Sonar image target detection method and device and electronic equipment
CN115393734A (en) SAR image ship contour extraction method based on fast R-CNN and CV model combined method
Fu et al. An anisotropic Gaussian filtering model for image de-hazing
Li et al. Vision-based target detection and positioning approach for underwater robots
CN116664448B (en) Medium-high visibility calculation method and system based on image defogging
CN110570361A (en) sonar image structured noise suppression method, system, device and storage medium
CN117079117B (en) Underwater image processing and target identification method and device, storage medium and electronic equipment
CN113850783A (en) Sea surface ship detection method and system
Pu et al. Fractional-order retinex for adaptive contrast enhancement of under-exposed traffic images
CN116246139A (en) Target identification method based on multi-sensor fusion for unmanned ship navigation environment
CN116503716A (en) Radar image derivatization and database capacity expansion method
CN113283429B (en) Liquid level meter reading method based on deep convolutional neural network
Liu et al. Attention-guided lightweight generative adversarial network for low-light image enhancement in maritime video surveillance
CN115223033A (en) Synthetic aperture sonar image target classification method and system
CN108416815A (en) Assay method, equipment and the computer readable storage medium of air light value
CN115330705A (en) Skin paint surface defect detection method based on adaptive weighting template NCC
Li et al. Multi-scale fusion framework via retinex and transmittance optimization for underwater image enhancement
CN110796609B (en) Low-light image enhancement method based on scale perception and detail enhancement model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination