CN112016558A - Medium visibility identification method based on image quality - Google Patents

Medium visibility identification method based on image quality

Info

Publication number
CN112016558A
CN112016558A
Authority
CN
China
Prior art keywords
visibility
image
target
algorithm
binocular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010868567.6A
Other languages
Chinese (zh)
Inventor
王锡纲
李杨
赵育慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Xinwei Technology Co ltd
Original Assignee
Dalian Xinwei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Xinwei Technology Co ltd filed Critical Dalian Xinwei Technology Co ltd
Priority to CN202010868567.6A priority Critical patent/CN112016558A/en
Publication of CN112016558A publication Critical patent/CN112016558A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01C - MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 3/00 - Measuring distances in line of sight; Optical rangefinders
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/20 - Image preprocessing
    • G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V 10/751 - Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Abstract

The invention relates to the technical field of visibility recognition and provides a medium visibility recognition method based on image quality, comprising the following steps: collecting video data of a target object with a binocular camera and visibility data with a visibility tester to obtain two video signals and a visibility signal; extracting the position of the target object from each of the two video signals with a target segmentation algorithm; performing feature matching on the extracted target objects; obtaining distance information of the target object with a binocular ranging algorithm and, from it, the deviation between the detected distance and the actual distance of the target object; predicting a visibility interval for each frame of the two video signals with an image-quality-based visibility prediction algorithm; and producing the final visibility prediction with a visibility balance algorithm. The invention can improve the accuracy of medium visibility identification and adapt to a variety of environments.

Description

Medium visibility identification method based on image quality
Technical Field
The invention relates to the technical field of visibility recognition, in particular to a medium visibility recognition method based on image quality.
Background
Visibility identification is of great significance to navigation, transportation and similar fields. Severe weather conditions and marine environments create a variety of safety hazards that threaten people's lives and property; if the relevant departments can accurately publish the corresponding visibility conditions, management quality can be improved across many industries.
Common visibility identification methods include manual visual observation and instrument-based measurement. In the manual method, a dedicated observation post is arranged at each station and visibility is judged by eye; because it relies solely on human discrimination and subjective judgment, its consistency and objectivity are poor. Instrument-based methods compute visibility from quantities such as transmissivity and the extinction coefficient measured by equipment like transmission visibility meters and laser-radar visibility measuring instruments; this equipment is expensive, has demanding site requirements and significant limitations, so the method cannot be widely used.
Disclosure of Invention
The invention mainly addresses the technical problems of the prior art, namely the high cost, narrow application range and low accuracy of medium visibility identification, and provides a medium visibility identification method based on image quality in order to improve the accuracy of medium visibility identification and adapt to a variety of environments.
The invention provides a medium visibility identification method based on image quality, which comprises the following processes:
step 100, collecting video data of a target object through a binocular camera and collecting visibility data through a visibility tester to obtain two video signals and a visibility signal;
step 200, extracting the position of the target object from each of the two video signals collected by the binocular camera by using a target segmentation algorithm to obtain the extraction results of the target object;
step 300, performing feature matching on the obtained extraction results of the target object;
step 400, obtaining distance information of the target object by using a binocular ranging algorithm, and from it obtaining the deviation between the detected distance and the actual distance of the target object;
step 500, predicting the visibility of each frame of image in the two video signals collected by the binocular camera by using an image-quality-based visibility prediction algorithm to obtain a predicted visibility interval;
step 600, performing the final visibility prediction by using a visibility balance algorithm.
Further, step 200 includes the following process:
step 201, extracting features from each frame image of the two video signals with a convolutional neural network;
step 202, performing preliminary classification and regression by using a region extraction network;
step 203, performing an alignment operation on the candidate-box feature maps;
step 204, classifying, regressing and segmenting the target by using convolutional neural networks to obtain the extraction result of the target object.
Further, step 300 includes the following process:
step 301, extracting key points of the two target-object contours;
step 302, precisely locating the obtained key points;
step 303, determining a feature vector for each located key point;
step 304, matching the key points through their feature vectors.
Further, step 400 includes the following process:
step 401, calibrating a binocular camera;
step 402, performing binocular correction on a binocular camera;
step 403, performing binocular matching on the images acquired by the binocular cameras;
step 404, calculating the depth information of the binocular-matched images to obtain the distance information of the target object in the image.
Further, step 500 includes the following process:
step 501, segmenting an image to realize identification and positioning of a target;
step 502, predicting the visibility of the image according to the target identification and positioning results to obtain an image classification result.
Further, step 600 includes the following process:
step 601, constructing the visibility balance algorithm network structure, which comprises an input layer, a recurrent neural network, a fully connected layer and a visibility-interval output layer;
step 602, sequentially inputting the visibility features into the recurrent neural network to obtain a result that takes the time sequence into account;
step 603, connecting the output of the recurrent neural network to a fully connected layer to obtain the visibility interval value corresponding to the time sequence.
The invention provides a medium visibility identification method based on image quality. A binocular camera captures video images around the clock; a ranging algorithm yields the deviation between the detected distance and the actual distance of the target object; an image-quality-based visibility prediction algorithm yields a visibility interval; and a visibility balance algorithm combines these results into the final visibility prediction. The invention can identify the current medium visibility with high accuracy and stability, adapts well to a variety of common conditions, and does not depend on specific video acquisition equipment. At each point position, a binocular camera is used to acquire video data, which serves several purposes at once: the two lenses can be used independently, each acting as a separate video signal source so that the two signals can cross-validate each other, while their combined use increases the sensitivity to distance.
The method can be applied to underwater visibility identification, atmospheric visibility identification in harbor areas, and other scenes requiring medium visibility identification. For atmospheric visibility in a harbor area, an analysis of the application scene shows that the area is large and the operation zones are widely distributed, so recognition points must be deployed at multiple locations according to the operation zones. Construction in a harbor area is relatively mature, and the terrain features and building appearance are relatively stable, which makes it convenient to set a detection reference point at each location and improves the stability and accuracy of recognition. With binocular cameras deployed at multiple points in the harbor district, video data for all locations at the same moment can be obtained through timestamp control of the system. Because the video data form an image sequence in the time dimension, the method of the invention can provide atmospheric visibility data for different periods and different places for use by port service personnel.
Drawings
FIG. 1 is a flow chart of an implementation of a method for identifying visibility of a medium based on image quality according to the present invention;
FIG. 2 is a schematic diagram of a feature pyramid network structure;
FIG. 3 is a schematic view of a bottom-up configuration;
FIG. 4 is a schematic diagram of the generation of a feature map for each stage in a bottom-up configuration;
FIG. 5 is a schematic diagram of the region extraction network architecture;
FIG. 6 is an effect diagram of an alignment operation performed on a feature map;
FIG. 7 is a schematic diagram of a classification, regression, segmentation network architecture;
FIG. 8 is a schematic diagram of a binocular ranging algorithm;
FIG. 9 is a schematic diagram of the basic principle of binocular ranging;
FIG. 10 is a schematic diagram of an image segmentation network architecture;
FIG. 11 is a schematic diagram of an image visibility prediction network architecture;
FIG. 12 is a schematic diagram of a visibility balancing algorithm network structure;
fig. 13 is a schematic diagram of a recurrent neural network architecture.
Detailed Description
In order to make the technical problems solved, technical solutions adopted and technical effects achieved by the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some but not all of the relevant aspects of the present invention are shown in the drawings.
FIG. 1 is a flow chart of an implementation of a medium visibility identification method based on image quality provided by the invention. As shown in fig. 1, a method for identifying visibility of a medium based on image quality according to an embodiment of the present invention includes:
Step 100, collecting video data of a target object through a binocular camera and collecting visibility data through a visibility tester to obtain two video signals and a visibility signal.
The output of visibility recognition in the present invention is a discrete value, i.e. a numeric interval such as "500 meters or less" or "500 meters to 1000 meters". To improve detection accuracy, the discrete values detected by the other algorithms are therefore corrected with the continuous values obtained by the ranging algorithm, i.e. detected distances such as "245.87 meters" or "1835.64 meters". For this a target reference object must be set. Selection principles for the target reference object: a stationary object whose position does not change; an object that can be clearly identified by day and by night when visibility is good; no occlusion between the binocular camera and the target object; and the distances between the target objects and the binocular camera should match the distribution of the visibility intervals and be spread evenly, preferably with a spacing of about 100 m.
Step 200, extracting the position of the target object from each of the two video signals collected by the binocular camera by using a target segmentation algorithm to obtain the extraction results of the target object.
The position of the target object is extracted separately from each of the two video signals of the binocular camera so that an error in one channel does not cause the subsequent calculation to fail. In this step an accurate target segmentation algorithm is adopted to obtain the precise contour of the target object, so that its exact position can be extracted in preparation for subsequent processing.
Considering that the field of view of the binocular camera (angle of view, focal length, etc.) does not easily change, the position at which the target object appears in the image is theoretically fixed. In practice, however, the field of view may shift slightly when the camera shakes under wind, waves or other external forces, and interfering objects such as birds or fish schools may enter the view. To increase detection accuracy, a hot-spot region within the field of view is therefore set according to prior knowledge when the target segmentation algorithm is applied, and the weight of target objects detected inside the hot-spot region is increased.
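As an illustration of the hot-spot weighting described above, the following sketch (our own illustration, not code from the patent; the box format, weighting factor and score cap are hypothetical) boosts the confidence of detections whose centers fall inside a prior hot-spot region:

```python
def boost_hotspot_scores(boxes, scores, hotspot, factor=1.2):
    """Raise the confidence of detections whose center lies inside the prior
    hot-spot region. boxes and hotspot are (x1, y1, x2, y2) tuples in pixels;
    factor is an illustrative weight greater than 1."""
    hx1, hy1, hx2, hy2 = hotspot
    boosted = []
    for (x1, y1, x2, y2), score in zip(boxes, scores):
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        inside = hx1 <= cx <= hx2 and hy1 <= cy <= hy2
        boosted.append(min(1.0, score * factor) if inside else score)
    return boosted
```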
Through the target segmentation algorithm, the precise contours of the two target objects can in theory be obtained from the two video-frame signals of the binocular camera. The "precise contour" referred to here may be disturbed by different conditions in the medium, and under different visibility conditions the detected contour may differ. We tolerate this disturbance because it is precisely the information that encodes visibility. If the two precise contours are not both obtained, either the frame was recognized incorrectly or recognition could not proceed normally for some reason, for example a video signal was not acquired or a lens was blocked.
If this condition is not satisfied, the frame is discarded and the next frame is input and recognized again. In practical use, if the situation persists over many consecutive frames, an alarm should be raised and the video signal stored for inspection by operations personnel.
The target segmentation in this step is the first step of image analysis, the foundation of computer vision, an important component of image understanding, and also one of the most difficult problems in image processing. Image segmentation divides an image into several mutually disjoint regions according to features such as gray level, color, spatial texture and geometric shape, so that these features are consistent or similar within a region and differ obviously between regions. In short, the object is separated from the background. Segmentation greatly reduces the amount of data to be processed in subsequent stages such as image analysis and target recognition, while retaining information about the structural characteristics of the image.
Target segmentation algorithms mainly include threshold-based, region-based, edge-based and deep-learning-based segmentation methods. The main steps of the target segmentation algorithm adopted in this step are as follows:
Step 201, extracting features from each frame image of the two video signals with a convolutional neural network.
Because the sharpness of the image changes with camera parameters, this step adopts a multi-scale feature extraction scheme, namely a feature pyramid network. The feature pyramid network structure is shown in fig. 2.
The feature pyramid network consists of two parts. The left part, called the bottom-up structure, produces feature maps at different scales, shown as C1 through C5. From bottom to top the feature maps become smaller, meaning the extracted features become increasingly high-level. Its shape resembles a pyramid, hence the name feature pyramid network. The right part, called the top-down structure, corresponds layer by layer to the feature pyramid; the arrows connecting feature processing at the same level between the two structures are the lateral connections.
The reason for this design is that the smaller, higher-level features carry more semantic information, while the larger, lower-level features carry little semantic information but much positional information. Through these connections, the feature map at every layer fuses features of different resolutions and different semantic strengths, so detection of objects at different resolutions improves.
The bottom-up structure is shown in fig. 3. The network comprises five stages, each computing a feature map of a different size, with a scaling step of 2 between stages. The principle by which each stage generates its feature map is shown in fig. 4. The C1, C2, C3, C4 and C5 feature maps output by the stages are used to construct the feature pyramid network structure.
The top-down structure is shown on the right side of the feature pyramid network structure in fig. 2. First, the high-level feature map with stronger semantic information is up-sampled to the same size as the lower-level feature map. The feature maps of the same size in the bottom-up and top-down structures are then connected laterally, and the two mapped feature maps are combined by element-wise addition. Finally, to reduce the aliasing effect caused by up-sampling, a convolution layer is applied to each combined feature map to obtain the final feature maps P2, P3, P4 and P5.
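As a minimal sketch of the top-down merge just described (the channel counts, the choice of nearest-neighbour upsampling and the use of PyTorch are our assumptions; the patent does not specify an implementation):

```python
import torch.nn as nn
import torch.nn.functional as F

class TopDownMerge(nn.Module):
    """Builds P2..P5 from backbone maps C2..C5: 1x1 lateral convolutions,
    upsampling of the coarser map, element-wise addition, then a 3x3
    convolution to reduce upsampling aliasing."""
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])
        self.smooth = nn.ModuleList([nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                     for _ in in_channels])

    def forward(self, c2, c3, c4, c5):
        laterals = [lat(c) for lat, c in zip(self.lateral, (c2, c3, c4, c5))]
        for i in range(len(laterals) - 2, -1, -1):     # top-down pass
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        p2, p3, p4, p5 = (s(l) for s, l in zip(self.smooth, laterals))
        return p2, p3, p4, p5
```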
Step 202, performing preliminary classification and regression using the region extraction network.
The region extraction network structure is shown in fig. 5. Based on the feature maps P2, P3, P4 and P5 obtained from the feature pyramid network, anchor boxes in the original image corresponding to each point of the feature maps are first generated according to the anchor-box generation rule. The P2, P3, P4 and P5 feature maps are then fed into the region extraction network, which contains a convolution layer and a fully connected layer, and the classification and regression results of each anchor box are obtained, specifically the foreground/background classification score and the bounding-box coordinate corrections of each anchor box. Finally, anchor boxes whose foreground scores satisfy a threshold are selected and their bounding boxes are corrected; a corrected anchor box is called a candidate box.
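The per-anchor classification and regression described above could be realised by a head of the following shape (a simplified sketch under assumed channel and anchor counts, not the patent's exact network):

```python
import torch.nn as nn

class RegionProposalHead(nn.Module):
    """A shared 3x3 convolution followed by two 1x1 convolutions that output,
    for every anchor at every position, a foreground/background score and
    four bounding-box correction values."""
    def __init__(self, in_channels=256, num_anchors=3):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, 3, padding=1)
        self.cls = nn.Conv2d(in_channels, num_anchors * 2, 1)  # fg/bg scores
        self.reg = nn.Conv2d(in_channels, num_anchors * 4, 1)  # box deltas

    def forward(self, feature_map):
        x = self.conv(feature_map).relu()
        return self.cls(x), self.reg(x)
```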
Step 203, performing the alignment operation on the candidate-box feature maps.
Candidate boxes that satisfy the score requirement are obtained from the region extraction network and mapped back onto the feature maps. The pyramid level of the feature map corresponding to a candidate box is obtained from the following formula:

k = floor(k0 + log2(sqrt(w × h) / 224))

where w is the width of the candidate box, h its height, k the feature-pyramid level assigned to the candidate box, and k0 the level assigned when w = h = 224; k0 is generally taken as 4, i.e. the P4 layer. The feature map corresponding to the candidate box is then resampled by bilinear interpolation so that all resulting feature maps have a consistent size. The effect of the alignment operation on the feature map is shown in fig. 6.
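For reference, the level-assignment formula can be written as a small helper (a sketch; the floor rounding and the clamping of k to the P2..P5 range are assumptions):

```python
import math

def fpn_level(w, h, k0=4, k_min=2, k_max=5):
    """Return the feature-pyramid level for a candidate box of width w and
    height h, using k = k0 + log2(sqrt(w * h) / 224), clamped to P2..P5."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / 224))
    return max(k_min, min(k_max, k))
```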
Step 204, classifying, regressing and segmenting the target with convolutional neural networks to obtain the extraction result of the target object.
The classification, regression and segmentation network structure is shown in fig. 7. Based on the fixed-size candidate-box feature maps, the classification and regression networks compute the classification score and coordinate offsets of each candidate box, and the bounding box is corrected accordingly. The segmentation network then segments the target inside the candidate box. The target segmentation algorithm thus yields the classification, bounding-box regression and segmentation results for the targets in the image, and hence the extraction result of the target object.
Step 300, performing feature matching on the obtained extraction results of the target object.
The target segmentation algorithm of step 200 yields two target contours, but their positions and angles differ between the two video frames, so the two contours must be matched by features. The feature matching algorithm compares the features of the two target-object contours to find the image points that correspond to the same physical point of the object at different positions in the two images. The subsequent ranging algorithm must be computed from a specific, well-determined pixel, so in this stage, to ensure as far as possible that the same physical point is extracted, the final result is determined by sampling several times and averaging, and the pixel position of that point in the different images is recorded. The procedure comprises the following steps:
step 301, extracting key points of the two target object outlines.
Key points are highly distinctive points that do not disappear under changes in illumination, scale or rotation, such as corner points, edge points, bright points in dark regions and dark points in bright regions. This step searches image positions over all scales; potential interest points that are invariant to scale and rotation are identified using Gaussian derivative functions.
Step 302, precisely locating the obtained key points.
At each candidate location, a detailed model is fitted to determine position and scale. Key points are selected according to their degree of stability.
Step 303, determining a feature vector of the key point according to the located key point.
One or more directions are assigned to each keypoint location based on the local gradient direction of the image. All subsequent operations on the image data are transformed with respect to the orientation, scale and location of the keypoints, providing invariance to these transformations.
Step 304, matching the key points through their feature vectors.
The feature vectors of the key points are compared pairwise to find multiple pairs of matched feature points and to establish the feature correspondence between the objects. Finally, the distance between key points can be calculated from this correspondence.
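Steps 301 to 304 follow the usual scale-invariant key-point pipeline; one way to realise them is sketched below with OpenCV SIFT and Lowe's ratio test, which are our assumptions rather than requirements of the patent:

```python
import cv2

def match_keypoints(img_left, img_right, ratio=0.75):
    """Detect key points in both contour crops, compute descriptors and keep
    the matches that pass the ratio test; returns matched pixel coordinates."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img_left, None)
    kp2, des2 = sift.detectAndCompute(img_right, None)
    if des1 is None or des2 is None:
        return []                          # no key points found in one image
    matcher = cv2.BFMatcher()
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in good]
```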
Step 400, obtaining distance information of the target object by using a binocular ranging algorithm, and from it the deviation between the detected distance and the actual distance of the target object.
A schematic diagram of the binocular ranging algorithm is shown in fig. 8. As fig. 8 shows, the error of the ranging algorithm is affected by the measurement error of the distance between the left and right cameras, the measurement error of the camera focal length, the measurement error of the vertical height difference between the camera and the target, and other factors. These errors are unavoidable. However, the purpose of this step is not to measure the precise distance of the target object but to establish, under different visibility conditions, the relationship between the actual distance and the detected distance; the subsequent neural network further reduces the influence of the errors produced in this step. The output of the ranging algorithm is the detected distance value (a continuous value). The basic principle of binocular ranging is shown in fig. 9. The step specifically comprises:
step 401, calibrating the binocular camera.
Owing to the characteristics of the optical lens, the camera exhibits radial distortion, which can be described by three parameters k1, k2 and k3: Xdr = X(1 + k1·r² + k2·r⁴ + k3·r⁶), Ydr = Y(1 + k1·r² + k2·r⁴ + k3·r⁶), with r² = X² + Y², where (X, Y) are the pixel coordinates of the undistorted image and (Xdr, Ydr) the pixel coordinates of the distorted image. Because of assembly errors, the camera sensor and the optical lens are not perfectly parallel, so the image also exhibits tangential distortion, described by two parameters p1 and p2: Xdt = X + 2p1·X·Y + p2·(r² + 2X²), Ydt = Y + p1·(r² + 2Y²) + 2p2·X·Y, where (X, Y) are again the undistorted pixel coordinates and (Xdt, Ydt) the distorted ones. Calibrating a single camera mainly means computing its internal parameters (the focal length f, the imaging origin cx, cy, and the five distortion parameters; generally only k1, k2, p1 and p2 need to be computed, and k3 only when radial distortion is particularly large, as for a fisheye lens) and its external parameters (the world coordinates of the calibration object). Calibrating the binocular camera requires not only the internal parameters of each camera, but also measuring, through calibration, the relative position of the two cameras (i.e. the rotation matrix R and translation vector t of the right camera relative to the left camera).
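The distortion model above translates directly into code; the sketch below applies the radial and tangential terms together, as in the usual Brown lens model, to a normalized undistorted coordinate (coefficient values would come from calibration):

```python
def distort_point(x, y, k1, k2, k3, p1, p2):
    """Apply the radial (k1, k2, k3) and tangential (p1, p2) distortion terms
    to an undistorted normalized coordinate (x, y)."""
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return x_d, y_d
```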
Step 402, performing binocular correction on the binocular camera.
Binocular correction uses the monocular internal parameters obtained from calibration (focal length, imaging origin, distortion coefficients) and the relative position of the two cameras (rotation matrix and translation vector) to remove distortion and align the rows of the left and right views, so that the imaging origins of the two views coincide, the optical axes of the two cameras are parallel, the left and right imaging planes are coplanar, and the epipolar lines are row-aligned. Any point in one image and its corresponding point in the other image then share the same row number, and the corresponding point can be found by a one-dimensional search along that row.
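With OpenCV, the binocular correction described above can be sketched as follows (the choice of cv2.stereoRectify and of the map type are our assumptions):

```python
import cv2

def build_rectify_maps(K1, D1, K2, D2, R, T, image_size):
    """Compute per-camera remap tables so that the left and right epipolar
    lines become row-aligned. K1/K2 are intrinsic matrices, D1/D2 distortion
    coefficients, R/T the rotation and translation of the right camera
    relative to the left, image_size is (width, height)."""
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, image_size, R, T)
    maps_left = cv2.initUndistortRectifyMap(K1, D1, R1, P1, image_size, cv2.CV_32FC1)
    maps_right = cv2.initUndistortRectifyMap(K2, D2, R2, P2, image_size, cv2.CV_32FC1)
    return maps_left, maps_right, Q        # remap each frame, then match along rows
```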
Step 403, performing binocular matching on the images acquired by the binocular camera.
Binocular matching matches corresponding image points of the same scene in the left and right views; its purpose is to obtain disparity data.
Step 404, calculating depth information from the binocular-matched images to obtain the distance information of the target object in the image.
Let P be a point on the object to be measured, and let L and R be the optical centers of the left and right cameras. The imaging points of P on the sensors of the two cameras are p and p' respectively (the camera imaging plane is drawn in front of the lens after rotation). Let f be the camera focal length, b the distance between the two camera centers, and z the sought distance of the target object. If dis is the distance between p and p', then

dis = b - (XR - XL)

According to the principle of similar triangles:

(b - (XR - XL)) / b = (z - f) / z

from which:

z = (f × b) / (XR - XL)

In this formula, the focal length f and the camera center distance b are obtained by calibration, so the depth information follows as soon as the value of XR - XL (i.e. the disparity d) is obtained. The disparity value can be computed from the key points matched by the feature matching algorithm of step 300. The binocular ranging algorithm thus yields the distance information of the target object in the image, and from it the deviation between the detected distance and the actual distance of the target object.
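The relation z = f·b / (XR - XL) translates directly into code; the sketch below averages the disparity over several matched key-point pairs, in line with the multi-sample strategy of step 300 (names and units are illustrative):

```python
def target_distance(matched_points, focal_px, baseline_m):
    """Estimate the target distance from matched key points on rectified images.
    matched_points: list of ((xL, yL), (xR, yR)) pixel pairs,
    focal_px: focal length in pixels, baseline_m: camera centre distance in metres."""
    disparities = [xl - xr for (xl, _), (xr, _) in matched_points if xl > xr]
    if not disparities:
        return None                        # no valid disparity, discard the frame
    d = sum(disparities) / len(disparities)
    return focal_px * baseline_m / d
```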
Step 500, predicting the visibility of each frame of image in the two video signals collected by the binocular camera by using the image-quality-based visibility prediction algorithm to obtain a predicted visibility interval.
The image-quality-based visibility prediction algorithm predicts visibility from macroscopic image information, mainly from the sharpness and contrast of objects against the medium background in the image. This stage cannot consume the raw video-frame signal directly: the near-field high-frequency information in the image must first be filtered out and the low-frequency information of the medium background extracted before the image is analysed and predicted. So that the method works by day and by night and remains accurate under different conditions, the training of the algorithm requires a large amount of video data together with visibility-detector readings carrying the same timestamps. The output of this step is a visibility interval value (a discrete value). It mainly comprises the following two steps:
and 501, segmenting the image to realize the identification and positioning of the target.
The image segmentation network structure is shown in fig. 10. The image passes through three convolution blocks for feature extraction, followed by two fully connected layers that produce the classification score and the bounding-box position of the target in the image; the highest score is selected as the output and the bounding box most likely to contain the target is extracted, thereby identifying and locating the target.
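A minimal sketch of the segmentation and localization network described above (the input size, channel widths and class count are assumptions; the patent only fixes the overall layout of three convolution blocks and two fully connected layers):

```python
import torch.nn as nn

class TargetLocator(nn.Module):
    """Three convolution blocks followed by two fully connected layers that
    output a classification score and a bounding-box estimate, mirroring the
    layout described for fig. 10."""
    def __init__(self, num_classes=2, input_size=128):
        super().__init__()
        def block(cin, cout):
            return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                                 nn.ReLU(), nn.MaxPool2d(2))
        self.features = nn.Sequential(block(3, 16), block(16, 32), block(32, 64))
        flat = 64 * (input_size // 8) ** 2           # three 2x poolings shrink each side by 8
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(flat, 256), nn.ReLU())
        self.cls_head = nn.Linear(256, num_classes)  # classification scores
        self.box_head = nn.Linear(256, 4)            # bounding box (x, y, w, h)

    def forward(self, x):
        h = self.fc(self.features(x))
        return self.cls_head(h), self.box_head(h)
```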
Step 502, predicting the visibility of the image according to the target identification and positioning results to obtain an image classification result.
The image visibility prediction network structure is shown in fig. 11. The visibility is predicted from the image based on the segmentation result. Because the image scene is relatively complex, the network contains three modules, each of which uses four different convolution kernels to extract features of the image at different scales, increasing feature diversity and improving classification accuracy. At the output of each module the extracted features are concatenated along the channel dimension to form a multi-scale feature map. Finally the image is classified through the fully connected layer to obtain the image classification result, which realizes the predicted visibility interval.
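Each module that uses "four different convolution kernels" can be read as an Inception-style multi-branch block; a hedged sketch follows (kernel sizes and channel splits are assumptions):

```python
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    """Four parallel branches with different kernel sizes whose outputs are
    concatenated along the channel dimension to form a multi-scale feature map."""
    def __init__(self, in_ch, branch_ch=32):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_ch, k, padding=k // 2) for k in (1, 3, 5, 7)
        ])

    def forward(self, x):
        return torch.cat([b(x).relu() for b in self.branches], dim=1)
```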
Step 600, performing the final visibility prediction by using a visibility balance algorithm.
Through the processing of steps 200-500, two visibility-related results are obtained from one frame of video data: (1) in step 400, the ranging algorithm gives the deviation (a continuous value) between the detected distance and the actual distance of the target object, and the occurrence of this deviation is strongly and directly linked to visibility; (2) in step 500, the image-quality-based visibility prediction algorithm gives a visibility interval (a discrete value).
The conventional strategy for balancing several computed results is to take the mean directly, or to take the mean after removing outliers. To further improve detection accuracy, this step instead uses a cyclic check over multi-frame results. Within a period that is short compared with the speed at which visibility changes (for example 1 minute), multiple frames are taken at a fixed interval (for example every 5 seconds), and the detection result of each frame is fed into the visibility balance algorithm in time order to obtain the final visibility interval value. Step 600 comprises the following processes:
Step 601, constructing the visibility balance algorithm network structure, which comprises an input layer, a recurrent neural network, a fully connected layer and a visibility-interval output layer.
The visibility balance algorithm network structure is shown in fig. 12. As shown in fig. 12, it comprises an input layer, a recurrent neural network, a fully connected layer and a visibility-interval output layer. The network takes visibility inputs in time order; the visibility feature vector input at each time node has length 3, namely the deviation between the detected and actual target distances obtained by the ranging algorithm of step 400 and the visibility interval obtained by the image-quality-based visibility prediction algorithm of step 500.
Step 602, sequentially inputting the visibility features into the recurrent neural network to obtain a result that takes the time sequence into account.
Using the visibility balance algorithm, the invention balances multiple computed results along the time dimension, which reduces the influence of single-frame errors. The results obtained at different timestamps are correlated with one another, since visibility does not change drastically over a short time, so the time dimension can be used to correct the multiple detected values. Visibility balancing is first handled by a recurrent neural network, whose characteristic is that each computation takes the result of the previous computation into account as a prior input, thereby correcting the subsequent computation. After this correction, the results for the different timestamps are passed to a fully connected neural network, which integrates the multiple results into the final output. The recurrent neural network structure is shown in fig. 13 and comprises an input layer, a recurrent layer and an output layer.
A recurrent neural network learns recursively in the order of the input data and can therefore process sequence data. As the network structure shows, it memorizes earlier information and uses it to influence the outputs of later nodes: the hidden-layer nodes are connected across time steps, and the input to the hidden layer includes not only the output of the input layer but also the previous hidden-layer output.
Given sequential input data X = {X1, X2, …, Xt}, where the feature length of X is c and the unrolled length is t, the output ht of the recurrent neural network is calculated as:

ht = tanh(W · Xt + W · ht-1)

where W is the hidden-layer parameter and tanh is the activation function. The formula shows that the output at time t depends not only on the input Xt at the current time but also on the output ht-1 of the previous time step.
Step 603, connecting the output of the recurrent neural network to a fully connected layer to obtain the visibility interval value corresponding to the time sequence.
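Steps 601 to 603 can be summarised in a compact sketch of the balance network (input layer, recurrent layer, fully connected layer and interval output) under assumed sizes: feature length 3, 12 time steps over one minute, and 6 visibility intervals, none of which are fixed by the patent.

```python
import torch
import torch.nn as nn

class VisibilityBalancer(nn.Module):
    """Consumes a time-ordered sequence of per-frame visibility features
    (the detected/actual distance deviation and the predicted interval) and
    outputs a final visibility-interval classification."""
    def __init__(self, feature_len=3, hidden=32, num_intervals=6):
        super().__init__()
        self.rnn = nn.RNN(feature_len, hidden, nonlinearity="tanh", batch_first=True)
        self.fc = nn.Linear(hidden, num_intervals)

    def forward(self, seq):                # seq: (batch, time, feature_len)
        out, _ = self.rnn(seq)
        return self.fc(out[:, -1, :])      # interval logits after the last time step

# example: 12 frames sampled every 5 seconds over one minute
logits = VisibilityBalancer()(torch.randn(1, 12, 3))
```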
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: modifications of the technical solutions described in the embodiments or equivalent replacements of some or all technical features may be made without departing from the scope of the technical solutions of the embodiments of the present invention.

Claims (6)

1. A medium visibility identification method based on image quality is characterized by comprising the following processes:
step 100, collecting video data of a target object through a binocular camera and collecting visibility data through a visibility tester to obtain two video signals and a visibility signal;
step 200, extracting the position of the target object from each of the two video signals collected by the binocular camera by using a target segmentation algorithm to obtain the extraction results of the target object;
step 300, performing feature matching on the obtained extraction results of the target object;
step 400, obtaining distance information of the target object by using a binocular ranging algorithm, and from it obtaining the deviation between the detected distance and the actual distance of the target object;
step 500, predicting the visibility of each frame of image in the two video signals collected by the binocular camera by using an image-quality-based visibility prediction algorithm to obtain a predicted visibility interval;
step 600, performing the final visibility prediction by using a visibility balance algorithm.
2. The image-quality-based medium visibility recognition method according to claim 1, wherein the step 200 comprises the following steps:
step 201, extracting features from each frame image of the two video signals with a convolutional neural network;
step 202, performing preliminary classification and regression by using a region extraction network;
step 203, performing an alignment operation on the candidate-box feature maps;
step 204, classifying, regressing and segmenting the target by using convolutional neural networks to obtain the extraction result of the target object.
3. The image-quality-based medium visibility recognition method according to claim 1, wherein the step 300 comprises the following processes:
step 301, extracting key points of the two target-object contours;
step 302, precisely locating the obtained key points;
step 303, determining a feature vector for each located key point;
step 304, matching the key points through their feature vectors.
4. The image-quality-based medium visibility recognition method according to claim 1, wherein the step 400 comprises the following processes:
step 401, calibrating a binocular camera;
step 402, performing binocular correction on a binocular camera;
step 403, performing binocular matching on the images acquired by the binocular cameras;
step 404, calculating the depth information of the binocular-matched images to obtain the distance information of the target object in the image.
5. The image-quality-based medium visibility recognition method according to claim 1, wherein the step 500 comprises the following steps:
step 501, segmenting an image to realize identification and positioning of a target;
step 502, predicting the visibility of the image according to the target identification and positioning results to obtain an image classification result.
6. The image-quality-based medium visibility recognition method according to claim 1, wherein the step 600 comprises the following processes:
step 601, constructing the visibility balance algorithm network structure, which comprises an input layer, a recurrent neural network, a fully connected layer and a visibility-interval output layer;
step 602, sequentially inputting the visibility features into the recurrent neural network to obtain a result that takes the time sequence into account;
step 603, connecting the output of the recurrent neural network to a fully connected layer to obtain the visibility interval value corresponding to the time sequence.
CN202010868567.6A 2020-08-26 2020-08-26 Medium visibility identification method based on image quality Pending CN112016558A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010868567.6A CN112016558A (en) 2020-08-26 2020-08-26 Medium visibility identification method based on image quality

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010868567.6A CN112016558A (en) 2020-08-26 2020-08-26 Medium visibility identification method based on image quality

Publications (1)

Publication Number Publication Date
CN112016558A true CN112016558A (en) 2020-12-01

Family

ID=73502217

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010868567.6A Pending CN112016558A (en) 2020-08-26 2020-08-26 Medium visibility identification method based on image quality

Country Status (1)

Country Link
CN (1) CN112016558A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538350A (en) * 2021-06-29 2021-10-22 河北深保投资发展有限公司 Method for identifying depth of foundation pit based on multiple cameras
CN114216613A (en) * 2021-12-14 2022-03-22 浙江浙能技术研究院有限公司 Gas leakage amount measuring method based on binocular camera

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100715140B1 (en) * 2006-05-23 2007-05-08 (주)비알유 Visibility measuring apparatus and method
CN104677330A (en) * 2013-11-29 2015-06-03 哈尔滨智晟天诚科技开发有限公司 Small binocular stereoscopic vision ranging system
CN110263706A (en) * 2019-06-19 2019-09-20 南京邮电大学 A kind of haze weather Vehicular video Detection dynamic target and know method for distinguishing
CN110849807A (en) * 2019-11-22 2020-02-28 山东交通学院 Monitoring method and system suitable for road visibility based on deep learning
CN111191629A (en) * 2020-01-07 2020-05-22 中国人民解放军国防科技大学 Multi-target-based image visibility detection method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100715140B1 (en) * 2006-05-23 2007-05-08 (주)비알유 Visibility measuring apparatus and method
CN104677330A (en) * 2013-11-29 2015-06-03 哈尔滨智晟天诚科技开发有限公司 Small binocular stereoscopic vision ranging system
CN110263706A (en) * 2019-06-19 2019-09-20 南京邮电大学 A kind of haze weather Vehicular video Detection dynamic target and know method for distinguishing
CN110849807A (en) * 2019-11-22 2020-02-28 山东交通学院 Monitoring method and system suitable for road visibility based on deep learning
CN111191629A (en) * 2020-01-07 2020-05-22 中国人民解放军国防科技大学 Multi-target-based image visibility detection method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538350A (en) * 2021-06-29 2021-10-22 河北深保投资发展有限公司 Method for identifying depth of foundation pit based on multiple cameras
CN113538350B (en) * 2021-06-29 2022-10-04 河北深保投资发展有限公司 Method for identifying depth of foundation pit based on multiple cameras
CN114216613A (en) * 2021-12-14 2022-03-22 浙江浙能技术研究院有限公司 Gas leakage amount measuring method based on binocular camera

Similar Documents

Publication Publication Date Title
Koch et al. Evaluation of cnn-based single-image depth estimation methods
CN109615611B (en) Inspection image-based insulator self-explosion defect detection method
CN110675418B (en) Target track optimization method based on DS evidence theory
US8331654B2 (en) Stereo-image registration and change detection system and method
CN104574393B (en) A kind of three-dimensional pavement crack pattern picture generates system and method
CN108920584A (en) A kind of semanteme grating map generation method and its device
CN110689562A (en) Trajectory loop detection optimization method based on generation of countermeasure network
CN108981672A (en) Hatch door real-time location method based on monocular robot in conjunction with distance measuring sensor
CN107560592B (en) Precise distance measurement method for photoelectric tracker linkage target
CN110288659B (en) Depth imaging and information acquisition method based on binocular vision
CN111462128B (en) Pixel-level image segmentation system and method based on multi-mode spectrum image
JP6858415B2 (en) Sea level measurement system, sea level measurement method and sea level measurement program
CN109341668A (en) Polyphaser measurement method based on refraction projection model and beam ray tracing method
CN109657717A (en) A kind of heterologous image matching method based on multiple dimensioned close packed structure feature extraction
CN112016558A (en) Medium visibility identification method based on image quality
Yao et al. Automatic scan registration using 3D linear and planar features
CN115797408A (en) Target tracking method and device fusing multi-view image and three-dimensional point cloud
CN105787870A (en) Graphic image splicing fusion system
CN114973028A (en) Aerial video image real-time change detection method and system
CN112712566B (en) Binocular stereo vision sensor measuring method based on structure parameter online correction
CN113219472A (en) Distance measuring system and method
Krotosky et al. Multimodal stereo image registration for pedestrian detection
CN103688289A (en) Method and system for estimating a similarity between two binary images
JPH11250252A (en) Three-dimensional object recognizing device and method therefor
Koppanyi et al. Experiences with acquiring highly redundant spatial data to support driverless vehicle technologies

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination