CN117132468A — Curvelet coefficient prediction-based super-resolution reconstruction method for precise measurement image

Info

Publication number: CN117132468A
Application number: CN202311060396.4A
Authority: CN (China)
Prior art keywords: Curvelet, image, resolution, reconstruction, layer
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to its accuracy)
Other languages: Chinese (zh)
Other versions: CN117132468B
Inventors: 吴福培, 梁家烨, 谭鑫磊, 鲁晓会, 刘宇豪
Assignee (current and original): Shantou University
Application filed by Shantou University

Classifications

    • G06T3/4053 — Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06N3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N3/048 — Activation functions
    • G06N3/084 — Backpropagation, e.g. using gradient descent
    • G06T7/12 — Edge-based segmentation
    • G06T2207/20081 — Training; Learning
    • G06T2207/20084 — Artificial neural networks [ANN]
    • Y02T10/40 — Engine management systems

Abstract

An embodiment of the invention discloses a super-resolution reconstruction method for precise measurement images based on Curvelet coefficient prediction, comprising the steps of designing an image super-resolution neural network, designing a deep-learning loss function, and training and tuning the network's parameters. The method performs super-resolution reconstruction on the originally acquired low-resolution precision-measurement image, improving both the resolution of the image and the localization accuracy of its edge information. Visual measurement is then performed on the reconstructed image, which raises the precision of the vision measurement system and reduces measurement error. For precision-measurement images with distinct edge features, the invention reconstructs the edge features more accurately, so that vision-based precision measurement carried out on the reconstructed image is effectively more accurate.

Description

Curvelet coefficient prediction-based super-resolution reconstruction method for precise measurement image
Technical Field
The invention relates to the technical field of image processing, and in particular to a super-resolution reconstruction method for precise measurement images based on Curvelet coefficient prediction.
Background
Owing to its excellent performance, strong robustness, and the great increase in available computing power, deep-learning-based image super-resolution reconstruction is the mainstream of current research. Deep-learning super-resolution algorithms learn, by supervised training on massive reference-image datasets, a general mapping between high-resolution and low-resolution image patches, so that for any given low-resolution image a reconstruction close to the true high-resolution image can be predicted.
Prior art scheme 1: image super-resolution algorithm based on a deep convolutional network
The super-resolution convolutional neural network (SRCNN) proposed by Dong et al. consists of only three convolutional layers, which respectively implement feature extraction, nonlinear mapping, and image reconstruction; the end-to-end mapping between the low-resolution and high-resolution image is learned directly through the three-layer convolution. The core ideas of SRCNN are as follows:
1) The first convolution layer is a feature-extraction layer: it extracts a set of high-dimensional vectors from low-resolution image patches and groups them into a set of feature maps.
2) The second convolution layer is a nonlinear mapping layer: it nonlinearly maps each high-dimensional vector onto another high-dimensional vector, forming another set of feature maps. Each feature map obtained at this layer conceptually represents an abstract feature of a high-resolution image patch.
3) The third convolution layer is an image-reconstruction layer: it reconstructs the high-resolution image patch from those abstract features.
In the network, all convolution weights and biases are optimized by gradient backpropagation. Although the structure is simple, SRCNN reconstructs better and generalizes more strongly than traditional super-resolution methods based on interpolation, reconstruction, or shallow learning.
Provenance: Dong C, Loy C C, He K, et al. Learning a deep convolutional network for image super-resolution [C]// Computer Vision – ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part IV. Springer International Publishing, 2014: 184–199.
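For concreteness, a minimal PyTorch sketch of this three-layer structure is given below; the 9–1–5 kernel sizes and 64/32 channel widths follow the commonly cited SRCNN configuration and are assumptions for illustration, not text from this patent.

    import torch.nn as nn

    # Three-layer SRCNN-style network: feature extraction -> nonlinear mapping ->
    # reconstruction, applied to a bicubically pre-upscaled single-channel image.
    srcnn = nn.Sequential(
        nn.Conv2d(1, 64, kernel_size=9, padding=4),  # feature extraction layer
        nn.ReLU(),
        nn.Conv2d(64, 32, kernel_size=1),            # nonlinear mapping layer
        nn.ReLU(),
        nn.Conv2d(32, 1, kernel_size=5, padding=2),  # image reconstruction layer
    )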
Prior art scheme 2: image super-resolution algorithm based on a very deep convolutional neural network
Addressing the shortcomings of SRCNN — its dependence on small image-region context and its slow training convergence — Kim et al. proposed the very deep super-resolution network (VDSR), which improves on SRCNN as follows:
1) A deeper network is used to obtain a larger receptive field, so that the network can exploit more context information.
2) The residuals (differences) between the low- and high-resolution images are learned using a residual network and a very high learning rate, accelerating convergence.
3) Gradient clipping is adopted to prevent gradient explosion and ensure training stability.
VDSR further improves the performance of image super-resolution networks: by increasing the number of layers, more complex and abstract features can be learned and the input-output mapping fitted better, providing a reliable basis for applying subsequent deep networks to the super-resolution problem.
Provenance: Kim J, Lee J K, Lee K M. Accurate image super-resolution using very deep convolutional networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 1646–1654.
Prior art scheme 3: image super-resolution algorithm based on an efficient sub-pixel convolutional neural network
To reduce the computational cost of super-resolution networks and enable the network to predict the additional high-frequency information of the high-resolution image, Shi et al. proposed a sub-pixel convolution layer for image reconstruction. The layer learns a set of upscaling filters that split and reassemble the low-resolution feature maps at the pixel level into a high-resolution feature map, so that the image input to the network can be much smaller than the output high-resolution image. The ESPCN network extracts and learns features directly on the small low-resolution image and upsamples it by sub-pixel convolution only at the very end of the network, obtaining the high-resolution reconstruction. The sub-pixel convolution layer lets most of the computation run in the low-resolution space, improving computational efficiency.
Provenance: Shi W, Caballero J, Huszár F, et al. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 1874–1883.
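The pixel-shuffle rearrangement at the core of ESPCN can be sketched in a few lines of PyTorch; the 64-channel input and 3×3 kernel below are illustrative assumptions, not values from the cited paper.

    import torch
    import torch.nn as nn

    # Sub-pixel upsampling: a convolution produces r^2 channels per output channel,
    # and PixelShuffle rearranges them into an r-times larger feature map.
    r = 4
    upsample = nn.Sequential(
        nn.Conv2d(64, 1 * r * r, kernel_size=3, padding=1),
        nn.PixelShuffle(r),                        # (B, r^2, H, W) -> (B, 1, rH, rW)
    )
    hr = upsample(torch.randn(1, 64, 17, 17))      # -> shape (1, 1, 68, 68)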
Prior art scheme 4: image super-resolution algorithm based on a generative adversarial network
Addressing the problem that, at larger magnification factors, traditional super-resolution networks reconstruct regions such as edges and textures with an insufficiently realistic visual effect, Ledig et al. proposed an image super-resolution algorithm based on a generative adversarial network (SRGAN). The core ideas of SRGAN are as follows:
1) The model comprises two sub-networks: a generator and a discriminator. The generator reconstructs a high-resolution image from a given low-resolution image; the discriminator evaluates the authenticity of the reconstruction (distinguishing the true high-resolution image from the reconstructed one); and the two networks drive each other's learning through an adversarial game.
2) To evaluate the generator's reconstruction of high-frequency details more accurately, a perceptual loss function composed of an adversarial loss and a content loss is proposed. The adversarial loss is given by the trained discriminator; the content loss is the mean square error between deep features extracted from the reconstructed image and from the true high-resolution image by the same VGG network, measuring their similarity. Experimentally, SRGAN reconstructions do not reach the highest peak signal-to-noise ratios but achieve a more realistic visual effect, prompting super-resolution researchers to rethink loss-function design and the evaluation metrics of reconstruction quality.
Provenance: Ledig C, Theis L, Huszár F, et al. Photo-realistic single image super-resolution using a generative adversarial network [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 4681–4690.
Prior art scheme 5: deep wavelet prediction network for image super-resolution
Guo et al. were the first to combine the wavelet transform with deep-learning super-resolution, proposing the deep wavelet super-resolution network (DWSR), which recasts the complex spatial-domain reconstruction task as the simpler prediction of sparse wavelet-domain coefficients. The DWSR structure and optimization method are similar to VDSR, except that DWSR first separates the high- and low-frequency content of the original low-resolution image by wavelet transform, extracting the four wavelet-coefficient subbands LL, LH, HL, and HH; a deep convolutional network then fits the mapping between the wavelet subbands of the low- and high-resolution images; finally the high-resolution image is reconstructed by the inverse wavelet transform. Owing to the sparsity of the multi-scale wavelet decomposition, DWSR takes the wavelet-coefficient residual as its learning target, which further sparsifies the data during training and improves the efficiency of filter-parameter learning.
Provenance: Guo T, Seyed Mousavi H, Huu Vu T, et al. Deep wavelet prediction for image super-resolution [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2017: 104–113.
The quality and localization accuracy with which the above deep-learning super-resolution algorithms reconstruct edge regions are difficult to bring up to the requirements of vision-based precision measurement, mainly for the following reasons:
1. Deep-learning super-resolution algorithms are aimed chiefly at natural images, generating high-resolution images rich in detail to improve visual effect. In a vision precision-measurement task, by contrast, super-resolution reconstruction is performed to improve the precision of the measurement system. Under these different premises, the goals of super-resolution reconstruction differ greatly, so directly applying a general-purpose super-resolution algorithm to the reconstruction of precision-measurement images does not achieve the desired effect.
2. Deep-learning super-resolution algorithms focus on the overall visual effect of the reconstructed image: the deep-learning loss function is mostly the mean square error of full-image pixel gray values, and attention to edge-feature reconstruction is insufficient, so the quality and localization accuracy of edge-region reconstruction fall short of the requirements of vision-based precision measurement.
Disclosure of Invention
The technical problem to be solved by the embodiments of the invention is to provide a super-resolution reconstruction method for precise measurement images based on Curvelet coefficient prediction, which addresses the poor edge- and texture-region reconstruction of mainstream image super-resolution algorithms.
To solve this problem, an embodiment of the invention provides a super-resolution reconstruction method for precise measurement images based on Curvelet coefficient prediction, comprising the following steps:
S1: designing the image super-resolution network: before the low-resolution image is input to the convolutional neural network, Curvelet decomposition is first performed on it to obtain the Curvelet decomposition coefficients $C_{LR}$; a Curvelet prediction network designed in residual-learning form then estimates the Curvelet reconstruction coefficients $C_{SR}$ of the corresponding high-resolution image; finally, the reconstruction part applies the inverse Curvelet transform to reconstruct the high-resolution image $I_{SR}$ from the predicted coefficients;
S2: designing the deep-learning loss function from a Curvelet-coefficient mean-square-error loss component and a sub-pixel edge-location loss component;
S3: training the neural network and adjusting its parameters so that the reconstructed image achieves the target visual effect and edge-localization accuracy.
Wherein S1 comprises a Curvelet decomposition step, a Curvelet prediction step, and an image reconstruction step;
the Curvelet decomposition step includes:
Taking the low-resolution image $I_{LR}$ as input, it is first enlarged by bicubic interpolation to a size matching the high-resolution image; the Curvelet transform is then applied to the enlarged image to obtain the Curvelet coefficient matrix $C_{LR}$:

$$C_{LR}\{j\}\{l\}(k_1,k_2)=\mathcal{C}_{\alpha,\beta,\theta}\big(\mathrm{Bic}(I_{LR})\big)$$

where $\mathrm{Bic}(\cdot)$ denotes the bicubic interpolation operation; $\mathcal{C}_{\alpha,\beta,\theta}(\cdot)$ denotes the two-dimensional discrete Curvelet transform with parameters $\alpha$, $\beta$, $\theta$; $j$ is the scale-layer index of the Curvelet coefficient matrix, $l$ the direction-layer index, and $(k_1,k_2)$ the coordinates on the coefficient matrix of the $l$-th direction;
the Curvelet prediction step includes:
the Curvelet prediction sub-network takes the form of a residual neural network, divided into an input layer, a residual layer, and an output layer;
wherein the input layer takes the decomposition coefficients $C_{LR}$ as input, performs preliminary feature extraction on them, and matches the dimension of the extracted features to the input dimension of the residual layer; the input layer consists of a convolution layer and a nonlinear activation layer, the filter size in the convolution layer being 9×9 with stride 1 and padding 4, the nonlinear activation layer using PReLU as its activation function; in all Curvelet prediction sub-networks the input layer outputs feature maps of 256 channels;
the residual layer consists of 16 identically structured residual blocks; a batch-normalization regularization layer is added between each of the first two convolution layers of the DBA structure and its nonlinear activation layer, normalizing the convolution results in batches;
the output layer is a convolution layer with filter size 9×9, stride 1, and padding 4; it takes the 256-channel feature maps output by the residual layer as input and outputs the Curvelet reconstruction coefficients $C_{SR}$, Curvelet coefficients of different scales having correspondingly different direction-index counts and feature sizes;
the step of image reconstruction includes:
from the six input reconstruction-coefficient subbands $C_{SR}\{1\}$–$C_{SR}\{6\}$, the high-resolution reconstructed image $I_{SR}$ is generated and output:

$$I_{SR}=\mathcal{C}^{-1}_{\alpha,\beta,\theta}\big(C_{SR}\big)$$

where $\mathcal{C}^{-1}_{\alpha,\beta,\theta}(\cdot)$ denotes the two-dimensional discrete inverse Curvelet transform.
Wherein, the step S2 comprises the following steps:
S21: designing the Curvelet coefficient mean-square-error loss, comprising:
the Curvelet loss is defined as the mean square error between the Curvelet coefficients $C_{HR}$ of the reference image and $C_{SR}$ of the reconstructed image:

$$L_{\mathrm{Cur}}=\sum_{j}\frac{1}{L_j}\sum_{l=1}^{L_j}\frac{1}{M_{j,l}N_{j,l}}\sum_{k_1=1}^{M_{j,l}}\sum_{k_2=1}^{N_{j,l}}\Big(C_{HR}\{j\}\{l\}(k_1,k_2)-C_{SR}\{j\}\{l\}(k_1,k_2)\Big)^2$$

where $L_j$ is the number of direction layers of the $j$-th Curvelet coefficient subband, $j=1,\dots,6$; $j$ is the scale-layer index and $l$ the direction-layer index; $(k_1,k_2)$ are the coordinates on the subband at the $l$-th direction; and $M_{j,l}$ and $N_{j,l}$ give the size of each Curvelet coefficient subband at the corresponding direction layer.
S22: designing the sub-pixel edge-location loss, comprising introducing an edge-location loss into model training to measure the localization accuracy of edge reconstruction, so that the network concentrates during training on improving the accuracy of edge-region reconstruction.
Step S22 includes the steps of the sub-pixel edge detection method based on space-frequency-domain salient features: according to the image frequency-domain feature model, a frequency-domain Gaussian band-pass filter is designed to enhance the salient region and robustly filter out background information and high-frequency noise; a spatial-domain adaptive mask operator is then designed to efficiently extract the target's salient edge features; finally, a polynomial interpolation algorithm accurately extracts sub-pixel-level edge information.
The embodiments of the invention have the following beneficial effects. The invention applies the Curvelet transform to image super-resolution reconstruction, reducing the complex spatial-domain reconstruction task to the prediction of Curvelet-domain coefficient subbands. Because Curvelet coefficient subbands express image edge information more accurately and more sparsely, a deep residual network is built to fit the mapping between the Curvelet coefficient subbands of the high- and low-resolution images; this enhances the sparsity of the training data, improves the network's reconstruction quality in detail regions such as edge textures, overcomes the insufficient capture of edge information by mainstream super-resolution algorithms, lowers the difficulty of learning the high-frequency mapping, and improves learning efficiency. Meanwhile, the method introduces the Curvelet-coefficient mean-square-error loss and the sub-pixel edge-location loss during model training, to measure how well the network restores information in different frequency bands and how accurately it localizes reconstructed edges. For precision-measurement images with distinct edge features, the invention therefore reconstructs the edge features more accurately, and vision-based precision measurement on the reconstructed image is effectively more accurate.
Drawings
FIG. 1 is a schematic overall flow diagram of the present invention;
FIG. 2 is a schematic diagram of an algorithm model of an image super-resolution reconstruction method based on Curvelet coefficient prediction;
FIG. 3 is a schematic diagram of a residual neural network;
FIG. 4 is a schematic diagram of a DBA structure;
FIG. 5 is a schematic flow chart of a subpixel edge detection algorithm based on the salient features of the space-frequency domain;
FIGS. 6 and 7 are schematic diagrams of a precision-measurement image before and after frequency-domain salient-region enhancement;
FIG. 8 is a schematic diagram of an eight neighborhood form mask operator;
FIG. 9 is an effect diagram of the adaptive mask edge detection method for extracting salient features of the spatial domain;
FIG. 10 is a schematic diagram of a Sobel operator mask for eight directions;
FIG. 11 is a graph showing the results of super-resolution reconstruction at a 4× magnification factor on the "butterfly" image of the public super-resolution dataset Set5 by the algorithm of the invention and existing algorithms;
FIG. 12 is a graph showing the results of super-resolution reconstruction at a 4× magnification factor on a precision-measurement image by the algorithm of the invention and existing algorithms;
FIG. 13 is a sample of a visual precision measurement image;
FIG. 14 is a comparison of super-resolution reconstructions of 50 precision-measurement images by different algorithms;
FIG. 15 shows edge-detection results on part of the images in the BSD500 test set;
FIG. 16 shows the algorithm's edge-detection results on part of the precision-measurement images;
FIG. 17 is a graph showing the errors of dimension-measurement experiments on 50 precision-measurement reconstructed images for CPSR models trained with different loss functions;
FIGS. 18, 19, and 20 show the effect of adding the edge-location loss function in deep learning on the optimization of the network's various metrics.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings, to make its objects, technical solutions, and advantages more apparent.
Traditional vision-based precision measurement mainly comprises two steps: image acquisition by a camera, and image-based precision measurement. The invention combines vision-measurement technology with image super-resolution reconstruction, designing a super-resolution reconstruction method for precise measurement images based on Curvelet coefficient prediction; it comprises the design of the image super-resolution neural network, the design of the deep-learning loss function, and the training and parameter adjustment of the network. The method performs super-resolution reconstruction on the originally acquired low-resolution precision-measurement image, improving both the resolution of the image and the localization accuracy of its edge information. Visual measurement is then performed on the reconstructed image, raising the precision of the vision measurement system and reducing measurement error. The technical scheme of the invention is shown in FIG. 1.
1. Image super-resolution neural network design step
The algorithm model of the Curvelet-coefficient-prediction-based image super-resolution method (CPSR) is divided into three parts: a Curvelet decomposition module, a Curvelet prediction network, and an image reconstruction module, as shown in FIG. 2. So that the subsequent network can better learn the mapping between the abstract features of the low-resolution image $I_{LR}$ and the high-resolution image $I_{HR}$, Curvelet decomposition is performed on the low-resolution image before it enters the convolutional neural network, yielding the sparse, abstract Curvelet decomposition coefficients $C_{LR}$. A Curvelet prediction network designed in residual-learning form then estimates the Curvelet reconstruction coefficients $C_{SR}$ of the corresponding high-resolution image. Finally, the reconstruction part applies the inverse Curvelet transform to reconstruct the high-resolution image $I_{SR}$.
1.1 Curvelet decomposition module
The Curvelet decomposition module takes the low-resolution image $I_{LR}$ as input. It is first enlarged by bicubic interpolation to a size matching the high-resolution image; the Curvelet transform is then applied to the enlarged image to obtain the Curvelet coefficient matrix $C_{LR}$:

$$C_{LR}\{j\}\{l\}(k_1,k_2)=\mathcal{C}_{\alpha,\beta,\theta}\big(\mathrm{Bic}(I_{LR})\big) \quad (1)$$

In formula (1), $\mathrm{Bic}(\cdot)$ denotes the bicubic interpolation operation; $\mathcal{C}_{\alpha,\beta,\theta}(\cdot)$ denotes the two-dimensional discrete Curvelet transform with parameters $\alpha$, $\beta$, $\theta$; $j$ is the scale-layer index of the Curvelet coefficient matrix, $l$ the direction-layer index, and $(k_1,k_2)$ the coordinates on the coefficient matrix of the $l$-th direction.
The invention performs a six-scale Curvelet decomposition of the bicubically interpolated image, obtaining decomposition coefficients with six scale layers, $C_{LR}\{1\}$–$C_{LR}\{6\}$, whose six groups of coefficient subbands differ in size. The decomposition-coefficient submatrices are then zero-padded and cropped to a uniform size, assembled into a four-dimensional tensor feature-map format, and output to the Curvelet prediction network. Padding and cropping the Curvelet coefficient matrices yields the feature-map dimensions shown in Table 1.
1.2 Curvelet prediction network
The task of the Curvelet prediction network is to learn the mapping between the Curvelet coefficients of the high-resolution image $I_{HR}$ and the low-resolution image $I_{LR}$. It is essentially a multi-input multi-output neural network consisting of six parallel, independent sub-networks. Each sub-network takes one of the coefficient subbands $C_{LR}\{1\}$–$C_{LR}\{6\}$ as input and, after a series of learned convolution operations, outputs the corresponding reconstruction coefficients $C_{SR}\{1\}$–$C_{SR}\{6\}$, as shown in FIG. 3.
The Curvelet prediction sub-network of the invention adopts the form of a residual neural network, as shown in FIG. 3, and can be divided into an input layer, a residual layer, and an output layer.
The input layer takes the decomposition coefficients $C_{LR}$ as input; it mainly performs preliminary feature extraction on the Curvelet decomposition coefficients and matches the dimension of the extracted features to the input dimension of the residual layer. The input layer consists of a convolution layer and a nonlinear activation layer. The filter size in the convolution layer is 9×9 with stride 1 and padding 4, and the nonlinear activation layer uses PReLU as its activation function. In all the Curvelet prediction sub-networks designed, the input layer outputs 256-channel feature maps.
The residual layer is the core of the Curvelet prediction sub-network. It consists of 16 identically structured residual blocks and learns the deep Curvelet feature mapping between the low-resolution image and the corresponding high-resolution image. To increase network depth while reducing the number of parameters, and hence the computation, each residual block uses a Deeper Bottleneck Architecture (DBA), as shown in FIG. 4. To accelerate convergence and prevent gradient explosion, gradient vanishing, and overfitting, a batch-normalization (BN) regularization layer is added between each of the first two convolution layers of the DBA main branch and its nonlinear activation layer; batch-normalizing the convolution results keeps most of the activation layer's input data out of the saturation region of the nonlinear activation function.
The output layer is a convolution layer with filter size 9×9, stride 1, and padding 4, and mainly performs the reconstruction of the Curvelet coefficients. It takes the 256-channel feature maps output by the residual layer as input and outputs the Curvelet reconstruction coefficients $C_{SR}$.
Per Table 1, Curvelet coefficients of different scales have correspondingly different direction-index counts and feature sizes. The algorithm therefore adapts the output layer of each residual sub-network to the dimensional characteristics of its decomposition coefficients. For example, the residual sub-network corresponding to a Curvelet decomposition-coefficient subband with 32 direction layers outputs a feature map whose size matches that subband and whose channel count is 32, i.e. Curvelet reconstruction coefficients of the corresponding output dimension.
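Below is a minimal PyTorch sketch of one such sub-network (9×9 input convolution with PReLU, 16 bottleneck-style DBA residual blocks with batch normalization, 9×9 output convolution). The bottleneck width of 64 and the example subband shape are illustrative assumptions, not values stated in the patent.

    import torch
    import torch.nn as nn

    class DBABlock(nn.Module):
        """Deeper-bottleneck residual block: 1x1 reduce, 3x3, 1x1 expand;
        BN follows the first two convolutions, before their PReLU activations."""
        def __init__(self, channels=256, bottleneck=64):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, bottleneck, 1), nn.BatchNorm2d(bottleneck), nn.PReLU(),
                nn.Conv2d(bottleneck, bottleneck, 3, padding=1), nn.BatchNorm2d(bottleneck), nn.PReLU(),
                nn.Conv2d(bottleneck, channels, 1),
            )

        def forward(self, x):
            return x + self.body(x)            # identity shortcut of the residual block

    class CurveletSubNet(nn.Module):
        def __init__(self, in_ch, out_ch, feats=256, n_blocks=16):
            super().__init__()
            self.head = nn.Sequential(         # input layer: 9x9 conv, stride 1, padding 4
                nn.Conv2d(in_ch, feats, 9, stride=1, padding=4), nn.PReLU())
            self.body = nn.Sequential(*[DBABlock(feats) for _ in range(n_blocks)])
            self.tail = nn.Conv2d(feats, out_ch, 9, stride=1, padding=4)  # output layer

        def forward(self, c_lr):
            return self.tail(self.body(self.head(c_lr)))

    # e.g. a subband with 32 direction channels on a 68x68 grid:
    net = CurveletSubNet(in_ch=32, out_ch=32)
    c_sr = net(torch.randn(1, 32, 68, 68))     # predicted reconstruction coefficients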
1.3 Image reconstruction module
The image reconstruction module mainly implements the inverse Curvelet transform: from the six input reconstruction-coefficient subbands $C_{SR}\{1\}$–$C_{SR}\{6\}$ it generates and outputs the high-resolution reconstructed image $I_{SR}$:

$$I_{SR}=\mathcal{C}^{-1}_{\alpha,\beta,\theta}\big(C_{SR}\big) \quad (2)$$

In formula (2), $\mathcal{C}^{-1}_{\alpha,\beta,\theta}(\cdot)$ denotes the two-dimensional discrete inverse Curvelet transform.
2. Deep learning loss function design step
To measure how well the network restores information in different frequency bands and how accurately it localizes reconstructed edges, the invention introduces a Curvelet-coefficient mean-square-error loss component and a sub-pixel edge-location loss component into the design of the deep-learning loss function.
2.1 Curvelet coefficient mean square error loss design
Since texture details can be described by high-frequency Curvelet coefficients, the invention introduces a mean-square-error loss over the Curvelet subbands to aid texture reconstruction. The Curvelet prediction network estimates the reconstruction coefficients corresponding to the high-resolution image from the input decomposition coefficients. Let $C_{HR}$ and $C_{SR}$ denote the Curvelet coefficient subbands of the reference image $I_{HR}$ and the reconstructed image $I_{SR}$; the Curvelet loss is then defined as the mean square error between their Curvelet coefficients:

$$L_{\mathrm{Cur}}=\sum_{j}\frac{1}{L_j}\sum_{l=1}^{L_j}\frac{1}{M_{j,l}N_{j,l}}\sum_{k_1=1}^{M_{j,l}}\sum_{k_2=1}^{N_{j,l}}\Big(C_{HR}\{j\}\{l\}(k_1,k_2)-C_{SR}\{j\}\{l\}(k_1,k_2)\Big)^2 \quad (3)$$

In formula (3), $L_j$ is the number of direction layers of the $j$-th Curvelet coefficient subband, $j=1,\dots,6$; $j$ is the scale-layer index and $l$ the direction-layer index; $(k_1,k_2)$ are the coordinates on the subband at the $l$-th direction; and $M_{j,l}$ and $N_{j,l}$ give the size of each Curvelet coefficient subband at the corresponding direction layer.
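A hedged PyTorch sketch of formula (3), assuming each scale's subband is stored as a single (directions, H, W) tensor after the padding described in Section 1.1; the storage layout is an assumption of this sketch.

    import torch

    def curvelet_loss(c_hr_subbands, c_sr_subbands):
        """Sum over scale layers j of the direction- and size-normalized MSE of
        Eq. (3); each list entry is an (L_j, M, N) tensor for one scale."""
        loss = torch.zeros(())
        for c_hr, c_sr in zip(c_hr_subbands, c_sr_subbands):
            # mean over (k1, k2) gives the 1/(M*N) factor per direction layer,
            # and the outer mean over directions gives the 1/L_j factor
            loss = loss + ((c_hr - c_sr) ** 2).mean(dim=(1, 2)).mean()
        return loss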
2.2 Sub-pixel edge location loss design
If super-resolution is to be applied to precise image measurement, the algorithm must reconstruct edges of different modes with high quality, and the reconstructed edges must be localized precisely. The invention introduces an edge-location loss during model training to measure the localization accuracy of edge reconstruction, so that the network concentrates during training on improving the accuracy of edge-region reconstruction.
2.2.1 Design of the sub-pixel edge detection algorithm based on space-frequency-domain salient features
To accurately extract the sub-pixel-level edge information of the reconstructed image, the invention proposes a sub-pixel edge detection algorithm based on space-frequency-domain salient features; the algorithm flow is shown in FIG. 5. First, according to the image frequency-domain feature model, a frequency-domain Gaussian band-pass filter is designed to enhance the salient region and robustly filter out background information and high-frequency noise. A spatial-domain adaptive mask operator is then designed to efficiently extract the target's salient edge features. Finally, a polynomial-interpolation algorithm accurately extracts the sub-pixel-level edge information.
Frequency domain salient region enhancement
For a precision-measurement image, the specific steps of frequency-domain salient-region enhancement are as follows:
Frequency-domain filtering analysis: perform a discrete Fourier transform of the image to obtain its spectrogram, then construct a circular region of radius $R$ centered at the center of the spectrogram. $R$ is chosen so that the frequency-band information of the salient region lies entirely within the circular domain, i.e.

$$R \geq D_{\max} \quad (4)$$

In formula (4), $D_{\max}$ is the upper-limit frequency of the salient-region frequency band. In general, the sum of the amplitudes of the frequency components inside the circular domain is made greater than 95% of the total spectral content; the circular domain can then be considered to contain the salient-region band, with the high-frequency noise components outside it;
Improved Gaussian low-pass filter: taking the radius $R$ of the circular domain as the high-frequency cutoff $D_0$, construct a frequency-domain Gaussian low-pass filter, then set the zero-frequency component of the filter to zero, turning it into a band-pass filter that also suppresses the uniform background;
Spatial-domain Gaussian low-pass filtering: compute the standard deviation $\sigma$ of the spatial-domain Gaussian low-pass filter from the cutoff frequency of the frequency-domain filter, using the standard spatial/frequency correspondence of the Gaussian:

$$\sigma=\frac{N}{2\pi D_0} \quad (5)$$

In formula (5), $D_0$ is the cutoff frequency of the low-pass filter and $N$ the side length of the image. The input image is then Gaussian low-pass filtered in the spatial domain to obtain the low-pass-filtered image $I_G$. To improve filtering efficiency, a combination of one-dimensional convolutions along the rows and along the columns is used directly in place of the two-dimensional convolution;
Image difference: the arithmetic mean of the original image's pixel gray levels yields the gray arithmetic-mean image $I_{\mu}$. Subtracting the gray values of corresponding pixels of $I_{\mu}$ and $I_G$ and taking the absolute value of the result gives the salient-region-enhanced image $I_S$:

$$I_S(x,y)=\big|\,I_{\mu}(x,y)-I_G(x,y)\,\big| \quad (6)$$
After frequency-domain salient-region enhancement, the saliency of the target region in the precision-measurement image is strengthened; the effects before and after enhancement are shown in FIGS. 6 and 7.
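A sketch of these enhancement steps under the assumptions above (σ derived from the cutoff $D_0$ via formula (5), separable one-dimensional Gaussian passes, difference against the gray-mean image); the helper name and signature are illustrative.

    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    def enhance_salient_region(img, d0):
        """img: 2-D float array; d0: high-frequency cutoff (circle radius R)."""
        n = max(img.shape)
        sigma = n / (2.0 * np.pi * d0)                 # Eq. (5), assumed form
        low = gaussian_filter1d(img, sigma, axis=0)    # 1-D Gaussian along columns
        low = gaussian_filter1d(low, sigma, axis=1)    # 1-D Gaussian along rows
        mean_img = np.full_like(img, img.mean())       # gray arithmetic-mean image
        return np.abs(mean_img - low)                  # Eq. (6)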
Spatial domain salient feature extraction
The invention provides an adaptive-mask edge detection method which, by adjusting the operator parameters, achieves good results in edge highlighting and localization, noise handling, and execution efficiency. Taking the eight-neighborhood mask operator as an example, the operator structure is shown in FIG. 8.
Spatial-domain salient-feature extraction applies a square mask operation to the image to sharpen the distinction between textured and smooth regions, and then takes the square of the in-mask pixel mean together with the in-mask variance as adaptive segmentation parameters to extract the contour information of interest. The mathematical principle is as follows:

$$P(x,y):\;\big|\,R(x,y)-\bar{g}^2(x,y)\,\big|>T \quad (7)$$

$$R(x,y)=\frac{1}{mn}\sum_{(i,j)\in W_{m\times n}(x,y)}g^2(i,j) \quad (8)$$

$$\bar{g}^2(x,y)=\Bigg(\frac{1}{mn}\sum_{(i,j)\in W_{m\times n}(x,y)}g(i,j)\Bigg)^{2} \quad (9)$$

In formulas (7)–(9), $P(x,y)$ is a predicate expressing whether the pixel truly is an edge feature; $R(x,y)$ is the result of the square mask operation at the pixel; $\bar{g}^2(x,y)$ is the square of the mean of the pixels within the mask $W_{m\times n}$; $m$ and $n$ are the mask size; and $T$ is the adaptive parameter, whose value is given by:

$$\sigma^2(x,y)=R(x,y)-\bar{g}^2(x,y) \quad (10)$$

$$T=\lambda\cdot\max_{x,y}\sigma^2(x,y) \quad (11)$$

In formulas (10)–(11), $\sigma^2(x,y)$ is the variance of the pixel values within the mask; $\lambda$ is the sensitivity parameter, which defines the threshold on the in-mask gray-value variance, so that changing $\lambda$ adjusts the sensitivity of the edge detection. Choosing a suitable operator size according to the edge-contour model of the segmented target allows finer edge features to be extracted and a good edge-detection result to be achieved. Finally, according to the value of $P(x,y)$, the image containing the target's salient contour information is obtained:

$$I_E(x,y)=\begin{cases}255, & P(x,y)\ \text{true}\\ 0, & \text{otherwise}\end{cases} \quad (12)$$
For a precision-measurement image, the specific steps of spatial-domain salient-feature extraction are as follows:
Adaptive mask operation: starting from the salient-region-enhanced image $I_S$, design the shape, size, and sensitivity parameter $\lambda$ of the adaptive mask for the edge features of interest; then perform the mask convolution operation on $I_S$ to obtain the salient-edge-feature image $I_E$;
Speckle removal: perform blob detection on $I_E$, classify the blobs by area, and set the gray value of small-area blobs to 0, removing speckle;
Edge refinement and connection: thin the edges of the despeckled image, then find all break points in the thinned edges and connect the break-point pairs with minimum Euclidean distance, obtaining the target edge-contour image $I_C$.
After frequency-domain salient-region enhancement, the spatial-domain salient features are extracted by the adaptive-mask edge detection method; the effect is shown in FIG. 9.
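A compact sketch of the adaptive-mask predicate of formulas (7)–(12) as reconstructed above: local variance computed with a square mean mask and thresholded at a λ-scaled reference; uniform_filter plays the role of the m×n mean mask, and the reference choice is an assumption.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def adaptive_mask_edges(img, size=3, lam=0.1):
        img = img.astype(float)
        mean = uniform_filter(img, size)            # in-mask mean
        mean_sq = uniform_filter(img ** 2, size)    # square-mask result R, Eq. (8)
        local_var = mean_sq - mean ** 2             # sigma^2 = R - gbar^2, Eq. (10)
        t = lam * local_var.max()                   # adaptive threshold T, Eq. (11)
        return np.where(local_var > t, 255, 0).astype(np.uint8)   # Eq. (12)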
Sub-pixel edge positioning
To reduce the computational complexity of sub-pixel edge localization and improve algorithm efficiency, the invention approximates the edge-diffusion model of an edge with a polynomial interpolation function. The specific steps of the polynomial-interpolation sub-pixel edge-localization algorithm are as follows:
Gradient detection: for each determined pixel-level edge point $p$, compute the gradient magnitude and direction of the gray values of its eight-neighborhood pixels. To this end, the invention designs Sobel operator masks for eight directions and convolves them over the eight-neighborhood of $p$, so that each neighborhood pixel obtains amplitude information for the eight directional gradients. The eight-direction Sobel operator masks are shown in FIG. 10.
Determining the edge direction: the gradient direction is the direction of largest gradient amplitude, so the maximum of the eight gradient amplitudes is found, determining the gradient magnitude and direction of the neighborhood pixels. The edge direction at the point is perpendicular to its maximum-gradient direction.
Determining the precise coordinates of the edge points: is provided withEdge points->Modulus of gradient magnitude>And->Respectively isTwo adjacent pixel points in gradient direction, < ->And->Respectively, are modes of gradient magnitudes of adjacent points. The exact coordinates of the edge points are derived as follows:
(13)
in the formula (13), theta is the gradient direction,for adjacent pixels->And->To the edge point->If the distance of (1)Then->. If->Then->
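A small sketch of this parabolic (second-order polynomial) peak interpolation; the function names are illustrative.

    import numpy as np

    def subpixel_offset(g0, g1, g2):
        """g0: gradient magnitude at the edge pixel; g1, g2: magnitudes of its two
        neighbors along the gradient direction. Returns the parabola-peak offset."""
        denom = g1 - 2.0 * g0 + g2
        return 0.0 if denom == 0 else (g1 - g2) / (2.0 * denom)

    def subpixel_point(x, y, theta, g0, g1, g2, d=1.0):
        """d = 1 for a horizontal/vertical gradient direction, sqrt(2) for diagonal."""
        delta = subpixel_offset(g0, g1, g2) * d
        return x + delta * np.cos(theta), y + delta * np.sin(theta)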
2.2.2 Edge-location loss design
The sub-pixel edge detection algorithm based on space-frequency-domain salient features extracts the sub-pixel-level edge information of the high-resolution reference image $I_{HR}$ and of the reconstructed image $I_{SR}$ as two coordinate point sets $A$ and $B$. The invention adopts the Hausdorff distance between the point sets $A$ and $B$ as the edge-localization error of the reconstructed image, to measure the accuracy of the super-resolution reconstruction network's edge localization. The edge-location loss function is defined as:

$$L_{\mathrm{edge}}=H(A,B) \quad (14)$$

$$H(A,B)=\max\big\{h(A,B),\,h(B,A)\big\} \quad (15)$$

$$h(A,B)=\max_{a\in A}\;\min_{b\in B}\;\lVert a-b\rVert \quad (16)$$

$$h(B,A)=\max_{b\in B}\;\min_{a\in A}\;\lVert b-a\rVert \quad (17)$$

In formulas (14)–(17), $A$ and $B$ are two discrete point sets; $a$ and $b$ are points of $A$ and $B$ respectively; $H(A,B)$ is the Hausdorff distance between point set $A$ and point set $B$; and $h(A,B)$ and $h(B,A)$ are the one-directional Hausdorff distances from $A$ to $B$ and from $B$ to $A$.
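Formulas (14)–(17) map directly onto SciPy's directed Hausdorff distance; a sketch, with the point sets assumed to be (N, 2) arrays of sub-pixel edge coordinates:

    import numpy as np
    from scipy.spatial.distance import directed_hausdorff

    def edge_location_loss(edges_hr, edges_sr):
        """edges_hr, edges_sr: (N, 2) arrays of sub-pixel edge coordinates."""
        h_ab = directed_hausdorff(edges_hr, edges_sr)[0]   # h(A, B), Eq. (16)
        h_ba = directed_hausdorff(edges_sr, edges_hr)[0]   # h(B, A), Eq. (17)
        return max(h_ab, h_ba)                             # H(A, B), Eqs. (14)-(15)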
3. Neural network training and parameter-adjustment step
3.1 Experimental data set construction
Experiments were performed on the following datasets: the Berkeley computer vision dataset BSD500, Set5, Set14, and a self-built precision-measurement sample set.
In the training stage, the training split of BSD500 (300 images) and 100 self-built precision-measurement sample images serve as the high-resolution reference pictures of the training set. So that the reconstructed image generated by the network and the high-resolution reference image have the same size, the high-resolution images are first cropped so that their width and height are multiples of the scaling factor $r$. Then, to expand the number of training images, the high-resolution reference images are cropped into sub-images of the corresponding size: the invention crops the reference images into 68×68 sub-images, bringing the training set to 16,000 images. Finally, the reference sub-images are downsampled by bicubic interpolation with scaling factor $r$ to obtain the low-resolution training images.
In the verification stage, the validation split of BSD500 (100 images) and 50 self-built precision-measurement sample images (150 images in total) serve as the validation set; the corresponding high-resolution reference and low-resolution images are obtained in the same manner, but without cropping.
In the test stage, the test split of BSD500 (100 images), Set5 (5 images), Set14 (14 images), and 50 self-built precision-measurement images serve as the test set. For an accurate assessment of network performance, the same images must not appear in both the training and test stages.
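A sketch of this training-pair construction (cropping each high-resolution reference to multiples of $r$, cutting 68×68 sub-images, and bicubic downsampling); file handling and augmentation are omitted, and the helper name is illustrative.

    import cv2

    def make_training_pairs(hr_img, r=4, patch=68):
        """Cut one HR reference into (LR, HR) patch pairs as described above."""
        h, w = hr_img.shape[:2]
        h, w = (h // r) * r, (w // r) * r          # width/height multiples of r
        hr_img = hr_img[:h, :w]
        pairs = []
        for y in range(0, h - patch + 1, patch):
            for x in range(0, w - patch + 1, patch):
                hr_patch = hr_img[y:y + patch, x:x + patch]
                lr_patch = cv2.resize(hr_patch, (patch // r, patch // r),
                                      interpolation=cv2.INTER_CUBIC)
                pairs.append((lr_patch, hr_patch))
        return pairs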
3.2 Hyperparameter settings for neural network training
During training, the batch size grabbed at one time by the training data loader is set to 16. To prevent gradient explosion, the back-propagated gradients are clipped so that their norm is less than 0.01. The convolution parameters are updated with the Adam optimizer; the learning rate decreases by 10% every 10 cycles, and a weight-decay regularizer is set to prevent overfitting.
The loss function is the weighted sum of the Curvelet-coefficient mean-square-error loss, the edge-location loss, and the full-image pixel gray-value mean square error:

$$L=L_{\mathrm{MSE}}+\lambda_1 L_{\mathrm{Cur}}+\lambda_2 L_{\mathrm{edge}} \quad (18)$$

where $\lambda_1$ and $\lambda_2$ are balancing parameters; so that the reconstructed image $I_{SR}$ has both a good visual effect and accurate edge localization, $\lambda_1$ is taken at magnitude $10^{-1}$ and $\lambda_2$ at magnitude $10^{-2}$.
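A hedged sketch of this training configuration follows; the initial learning rate and weight-decay values are garbled in the source text, so the figures below (1e-4 for both) are assumptions, as are the model and loss-term names.

    import torch

    # model, train_loader, mse_loss, curvelet_loss_term, edge_loss_term: assumed defined
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.9)
    lam1, lam2 = 1e-1, 1e-2                        # balancing parameters of Eq. (18)
    num_epochs = 100                               # assumed training length

    for epoch in range(num_epochs):
        for lr_img, hr_img in train_loader:        # batch size 16
            sr_img = model(lr_img)
            loss = (mse_loss(sr_img, hr_img)
                    + lam1 * curvelet_loss_term(sr_img, hr_img)
                    + lam2 * edge_loss_term(sr_img, hr_img))
            optimizer.zero_grad()
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), 0.01)  # norm < 0.01
            optimizer.step()
        scheduler.step()                           # 10% lr decay every 10 epochs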
The embodiments of the invention have the following advantages:
1. In image super-resolution model design: for precision-measurement images, with their distinct edge features and rich detail information, the invention proposes the Curvelet-coefficient-prediction-based image super-resolution reconstruction method (CPSR). Existing super-resolution methods for precision-measurement images are based mainly on classical image super-resolution networks that take natural-image reconstruction as their design core; they have a degree of universality, but the reconstruction quality of edge-texture regions falls short of the requirements of vision-based precision measurement. The invention uses the Curvelet transform to separate the high- and low-frequency information of the image, achieving an efficient representation of edge features, and designs a Curvelet prediction network that learns the mapping between the Curvelet coefficients of the high- and low-resolution images, improving the reconstruction quality of edge-texture features.
FIG. 11 shows the results of super-resolution reconstruction at a 4× magnification factor on the "butterfly" image of the public benchmark dataset Set5 by the algorithm of the invention and existing algorithms; the second- and fourth-row images magnify the texture region on the butterfly wing, and the PSNR and SSIM values of each reconstruction are displayed under each sub-figure. Overall, the deep-learning-based methods yield markedly better visual effects than bicubic interpolation. For the edge information on the upper part of the butterfly wing, however, the SRCNN and ESPCN reconstructions show obvious blurring and artifacts. SRGAN, based on a generative adversarial network, improves visual sharpness to some extent, but its PSNR and SSIM are low. The deeper VDSR and the wavelet-domain DWSR improve the visual sharpness of the region, yet their reconstructions remain too smooth for some tiny details, such as the "burr" details on the edges in the local magnification and the white "bright spots" on the black stripes. By contrast, the texture details of the CPSR reconstruction are sharper, while also achieving higher PSNR and SSIM.
FIG. 12 shows the results of super-resolution reconstruction at a 4× magnification factor on a precision-measurement image by the algorithm of the invention and existing algorithms; the second- and fourth-row images magnify the edge region of the gauge block. The bicubic reconstruction is quite blurry. The SRCNN and ESPCN reconstructions improve slightly on interpolation but remain relatively blurred, with an obvious jagged effect at the gauge-block edges. The SRGAN reconstruction has a good visual effect, but its prediction of detail textures is somewhat distorted, so its PSNR and SSIM values are not high. VDSR and DWSR clearly improve the reconstruction of the gauge-block edge, but their recovery of texture regions with singular curves (the region of the digit 20) is unsatisfactory. Overall, CPSR obtains a sharp reconstruction of the image's simple straight edges, recovers texture regions closer to the real image, and its overall visual effect better matches human perception.
In summary, from the analysis of the reconstructions of the two images: the edges reconstructed by SRCNN, ESPCN, and VDSR, which rebuild from spatial-domain gray information, are all too smooth. SRGAN improves the network's prediction of high-frequency information by introducing an adversarial loss, but to improve visual effect it even alters the pattern of certain detail textures, so the edge information it reconstructs is not "accurate". DWSR uses wavelet coefficients and wavelet residuals as the network's inputs and outputs, strengthening the representation of edge textures and the sparsity of the mapping, but for edge textures containing many more-singular edges the Haar wavelet remains an inefficient representation; compared with spatial-domain methods its visual improvement is thus not obvious, and its reconstructed edges retain a degree of blur. By contrast, the Curvelet transform overcomes the first-generation wavelet transform's insufficient expressive power along edge information and strengthens the network's generalization to different edges: the edges and textures reconstructed by CPSR are sharper, more reliable, and closer to the true edge pattern.
Tables 2 and 3 show the measurements made after 4× super-resolution magnification of the test images of FIG. 13 by the different super-resolution reconstruction algorithms. The experiment takes the readings of a digital micrometer with a precision of 0.001 mm as the theoretical values. LRx4 denotes measurement made directly on the low-resolution original picture; in the tables, the LRx4 data are the low-resolution measurements multiplied by 4 so that they can be compared with the other methods' measurements at a uniform scale.
As the tables show, measuring directly on the low-resolution original image (LRx4) gives large errors, and extremely small contours — the gap between the micrometer measuring faces in FIG. 13 (e) and (f) — cannot be measured at all, because they exceed the resolution limit of edge detection. Raising the image resolution with a super-resolution algorithm before measurement is equivalent to raising the resolution limit of the edge-detection algorithm, so finer dimensional information can be measured; this verifies the feasibility of applying image super-resolution reconstruction to vision-based precision measurement. However, the different algorithms optimize super-resolution reconstruction differently, so measurement on a reconstructed image does not in itself guarantee higher measurement accuracy. According to the data in the tables, measurement on images reconstructed by bicubic interpolation has larger error than LRx4; the errors of SRCNN, ESPCN, and SRGAN are comparable to LRx4, with SRGAN's error fluctuating even more widely. VDSR reduces the measurement error slightly for large-size objects, but its effect on small-size objects is not obvious. The measurement errors of DWSR and of the CPSR algorithm proposed herein are clearly smaller than LRx4, and their effect is relatively stable across object sizes.
FIG. 14 shows super-resolution reconstructions of 50 precision-measurement images by the algorithms above, with geometric measurements made on the reconstructed images and the measurement errors then tallied. The analysis shows that the bicubic, SRCNN, ESPCN, and SRGAN algorithms do not effectively reduce the measurement error and are therefore unsuitable for super-resolution reconstruction of vision precision-measurement images, whereas VDSR, DWSR, and the proposed CPSR algorithm reduce the measurement error to differing degrees and improve measurement accuracy. Compared with the other super-resolution reconstruction algorithms, CPSR reconstructs the image's edge information more accurately, and thus achieves the best effect in improving the accuracy of visual measurement.
2. In the aspect of precision measurement performance evaluation of reconstructed images, the invention provides a sub-pixel edge detection algorithm based on space-frequency domain salient features for accurately and rapidly extracting edge information of salient objects in images. The algorithm strengthens the salient region of the image and filters background noise by designing a band-pass filter of a Fourier frequency domain; designing a self-adaptive mask operator of a space domain, and extracting edge characteristics of a remarkable object; and estimating the sub-pixel coordinates of the edge points by adopting a polynomial interpolation method.
Fig. 15 shows the edge detection results of a partial image in the BSD500 test set by the algorithm. The two test images are numbered 3063 and 8068 respectively, the resolution is 481×321, and the reference result adopts the edge detection result provided by the authorities. It can be seen that there is different degrees of background interference information, such as clouds and water ripples in the figure, in both images, and these background interference will affect the algorithm's judgment of the significant target edge information. As can be seen from the edge detection results of fig. 15 (a), the algorithm can extract the profile information of the whole aircraft more completely, and the profile information of the white part and the aircraft nose propeller gap only below the middle part of the aircraft body is erroneously "excluded". The reason may be that the white part differs greatly from the gray information of the whole body and differs little from the white cloud of the background, so the algorithm recognizes it as the background information. The width of the gap at the propeller of the machine head is less than 5 pixels, and the resolution precision of the edge detection algorithm is exceeded, so that the algorithm cannot effectively identify the edge information at the position, and the resolution precision of the edge detection algorithm can be improved by combining the image super-resolution reconstruction algorithm. The algorithm in fig. 15 (b) can extract the outline information of swan more completely, and only under the strong interference of water ripple, certain detection error exists on the outline information in the water reflection. In summary, the edge detection algorithm can effectively identify significant objects in natural images and accurately extract edge contours thereof.
Fig. 16 shows the edge detection results of the algorithm on part of the precision measurement images; the test images have a resolution of 256×256, and the reference results are manually extracted edges. Compared with the natural images in BSD500, the patterns of precision measurement images are simpler, the background interference is weaker, and the contrast between the target object and the background is higher. Nevertheless, some factors still affect edge detection accuracy in precision measurement images, such as shadows at object edges caused by the illumination angle of a near-point light source, and uneven gray levels caused by reflections on the object surface. As the results in Fig. 16 show, the algorithm performs far better on precision measurement images than on natural images such as BSD500, extracting the edge contours of salient objects completely and accurately.
Table 2 shows the performance metrics of edge detection by the sub-pixel edge detection algorithm based on space-frequency-domain salient features on the sample images of Figs. 15 and 16, on 20 BSD500 test set images, and on 50 precision measurement sample images. Judged jointly by precision, recall and the Hausdorff distance, the algorithm achieves high edge detection accuracy; in particular, the average precision and recall on the precision measurement images reach 0.93 and 0.94 respectively, and the average Hausdorff distance is only 0.410.
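For reference, the sketch below shows one plausible way such metrics could be computed from binary edge maps. The 1-pixel matching tolerance used for precision and recall is an assumption, and the sub-pixel average Hausdorff distance reported in Table 2 is presumably computed on sub-pixel edge coordinates rather than on an integer pixel grid as here.

```python
# Hedged sketch of precision/recall/Hausdorff evaluation for edge maps;
# tolerance and helper names are illustrative assumptions.
import numpy as np
from scipy.spatial.distance import cdist

def edge_metrics(pred: np.ndarray, ref: np.ndarray, tol: float = 1.0):
    """Precision, recall and symmetric Hausdorff distance between edge maps."""
    p = np.argwhere(pred > 0).astype(float)          # predicted edge coordinates
    r = np.argwhere(ref > 0).astype(float)           # reference edge coordinates
    assert len(p) and len(r), "both edge maps must contain edge pixels"
    d = cdist(p, r)                                  # pairwise point distances
    precision = float(np.mean(d.min(axis=1) <= tol)) # predicted pts near a reference pt
    recall = float(np.mean(d.min(axis=0) <= tol))    # reference pts recovered
    hausdorff = max(d.min(axis=1).max(), d.min(axis=0).max())
    return precision, recall, hausdorff
```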
3. In the aspect of deep learning loss function design, in order to concentrate on reducing the positioning error of reconstructed edges during network training, the invention provides a sub-pixel edge positioning loss based on deep learning. Most current deep-learning-based image super-resolution algorithms use the mean square error (MSE) of full-image pixel gray values between the high-resolution reference image and the reconstructed image as the loss function; the reconstructed image can then reach a high peak signal-to-noise ratio (PSNR), but the reconstruction of local edge information is not fully considered, and accurate positioning of the reconstructed edges cannot be guaranteed. The invention therefore introduces a sub-pixel edge positioning loss during model training to improve the positioning accuracy of the network's reconstructed edges: the sub-pixel edge detection algorithm extracts the edge features of the reconstructed image, the Hausdorff distance between these features and the edge point set of the reference image is calculated, and this quantified edge positioning error measures how accurately the network model reconstructs local edge information.
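A minimal sketch of the positioning-error term described above, assuming the edge point sets have already been extracted as (N, 2) and (M, 2) coordinate tensors; the function name and the use of `torch.cdist` are illustrative choices, not the invention's exact formulation.

```python
# Hedged sketch: symmetric Hausdorff distance between two edge point sets,
# used as the edge positioning error measure.
import torch

def edge_positioning_loss(pred_pts: torch.Tensor, ref_pts: torch.Tensor) -> torch.Tensor:
    """Symmetric Hausdorff distance between (N, 2) and (M, 2) edge point sets."""
    d = torch.cdist(pred_pts, ref_pts)       # (N, M) pairwise distances
    forward = d.min(dim=1).values.max()      # farthest predicted point from the reference set
    backward = d.min(dim=0).values.max()     # farthest reference point from the predicted set
    return torch.max(forward, backward)
```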
Fig. 17 shows the errors of dimensional measurement experiments on 50 precision-measured reconstructed images for CPSR models trained with different loss functions, thereby evaluating the models' edge reconstruction positioning accuracy. The data in the figure show that the CPSR model trained with the edge positioning loss has a significantly smaller visual precision measurement error than the CPSR model that considers only the pixel gray value loss and the Curvelet loss, which proves the validity of the edge positioning loss function.
Figs. 18, 19 and 20 show the effect of adding the edge positioning loss function in deep learning on the optimization of the network's various indexes. The experimental results show that, with the edge positioning loss, the convergence values of the PSNR and SSIM indexes are slightly lower than those of the baseline CPSR. This is because, in addition to optimizing the pixel gray value loss and the Curvelet loss, the network must now also account for the edge positioning loss; the three losses measure model performance from different objectives and angles, and the model trained with all three reaches a balance among them. As Fig. 20 shows, after the edge positioning loss is introduced, the model's accuracy in edge positioning improves obviously: the Hausdorff distance is reduced by about 0.8 pixel on average compared with the baseline CPSR, and the fluctuation range of the error is clearly reduced. The improved edge positioning accuracy after introducing the edge positioning loss during training is also confirmed by the visual precision measurement test results of Fig. 17.
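One plausible way to balance the three loss terms discussed above is a weighted sum, sketched below; the weights `w_cur` and `w_edge` are illustrative assumptions, as the actual balance used by the invention is not restated here.

```python
# Hedged sketch of combining the pixel, Curvelet and edge positioning losses;
# weights and argument names are assumptions for illustration only.
import torch
import torch.nn.functional as F

def total_loss(sr: torch.Tensor, hr: torch.Tensor,
               sr_coeffs: torch.Tensor, hr_coeffs: torch.Tensor,
               sr_edges: torch.Tensor, hr_edges: torch.Tensor,
               w_cur: float = 1.0, w_edge: float = 0.1) -> torch.Tensor:
    """Weighted sum of pixel MSE, Curvelet-coefficient MSE and edge positioning loss."""
    l_pix = F.mse_loss(sr, hr)                     # full-image gray-value MSE
    l_cur = F.mse_loss(sr_coeffs, hr_coeffs)       # Curvelet coefficient MSE
    d = torch.cdist(sr_edges, hr_edges)            # edge point-set distances
    l_edge = torch.max(d.min(dim=1).values.max(),  # symmetric Hausdorff distance
                       d.min(dim=0).values.max())
    return l_pix + w_cur * l_cur + w_edge * l_edge
```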
The above disclosure is only a preferred embodiment of the present invention; the scope of the invention is of course not limited thereto, and equivalent changes made according to the claims of the present invention still fall within the scope of the present invention.

Claims (4)

1. A precise measurement image super-resolution reconstruction method based on Curvelet coefficient prediction is characterized by comprising the following steps:
S1: the image super-resolution reconstruction method comprises the following steps: before the low-resolution image is input into the convolutional neural network, Curvelet decomposition is first carried out on the low-resolution image to obtain the Curvelet decomposition coefficients $C_{LR}$; a Curvelet prediction network designed in the form of residual learning then estimates the Curvelet reconstruction coefficients $\hat{C}_{HR}$ of the corresponding high-resolution image; finally, the reconstruction part performs an inverse Curvelet transform to reconstruct the high-resolution image $\hat{I}_{HR}$ from the predicted high-resolution Curvelet reconstruction coefficients;
S2: designing a deep learning loss function by using the Curvelet coefficient mean square error loss component and the sub-pixel edge positioning loss component;
S3: training and adjusting the parameters of the neural network so that the reconstructed image achieves both the target visual effect and edge positioning accuracy.
2. The Curvelet coefficient prediction-based super-resolution reconstruction method of a precision measurement image according to claim 1, wherein S1 comprises the steps of Curvelet decomposition, Curvelet prediction and image reconstruction;
The Curvelet decomposition step includes:
Taking the low-resolution image $I_{LR}$ as input, it is first enlarged by bicubic interpolation to a size matching the high-resolution image; the Curvelet transform is then applied to the enlarged image to obtain a Curvelet coefficient matrix of the structure $C_{LR}(\alpha,\beta,\theta)$:

$$C_{LR}(\alpha,\beta,\theta)=\mathcal{C}_{2D}\left\{f_{bic}(I_{LR})\right\}$$

where $f_{bic}(\cdot)$ denotes the bicubic interpolation operation; $\mathcal{C}_{2D}\{\cdot\}$ denotes the two-dimensional discrete Curvelet transform, with $\alpha$, $\beta$ and $\theta$ as its parameters; $\alpha$ is the scale-layer index of the Curvelet coefficient matrix, $\beta$ the direction-layer index, and $\theta$ the coordinates in the $\beta$-th direction of coefficient matrix $C_{\alpha}$;
The Curvelet prediction step includes:

the Curvelet prediction network takes the form of a residual neural network divided into an input layer, a residual layer and an output layer;
wherein the input layer takes the decomposition coefficients $C_{LR}$ as input and performs preliminary feature extraction on the Curvelet decomposition coefficients, matching the dimension of the extracted features to the input dimension of the residual layer; the input layer consists of a convolution layer and a nonlinear activation layer, the filter size in the convolution layer is 9×9 with stride 1 and padding 4, the nonlinear activation layer adopts PReLU as the activation function, and in all Curvelet prediction sub-networks the input layer outputs feature maps of 256 channels;
the residual layer consists of 16 residual blocks of identical structure; within each block, a regularization layer is added between the two convolution layers and the nonlinear activation layer at the front of the DBA structure, so that the convolution results are batch-normalized;
the output layer is a convolution layer with filter size 9×9, stride 1 and padding 4; it takes the 256-channel feature maps output by the residual layer as input and outputs the Curvelet reconstruction coefficients $\hat{C}_{HR}$; Curvelet coefficients of different scales have correspondingly different numbers of direction indexes and feature sizes (an illustrative sketch of one such sub-network is given after the claims);
The step of image reconstruction includes:

the six reconstruction coefficient sub-bands $\hat{C}_{HR}$ are input to generate and output the high-resolution reconstructed image $\hat{I}_{HR}$:

$$\hat{I}_{HR}=\mathcal{C}_{2D}^{-1}\left\{\hat{C}_{HR}\right\}$$

where $\mathcal{C}_{2D}^{-1}\{\cdot\}$ denotes the two-dimensional discrete inverse Curvelet transform.
3. The Curvelet coefficient prediction-based super-resolution reconstruction method of a precision measurement image according to claim 2, wherein S2 comprises the steps of:
S21: a method of designing the Curvelet coefficient mean square error loss, comprising:

the Curvelet loss is defined as the mean square error between the Curvelet coefficients $C_{HR}$ of the high-resolution reference image and the predicted coefficients $\hat{C}_{HR}$:

$$L_{Curvelet}=\sum_{\alpha}\frac{1}{N_{\alpha}}\sum_{\beta=1}^{N_{\alpha}}\frac{1}{W_{\alpha\beta}H_{\alpha\beta}}\sum_{\theta}\left(C_{HR}(\alpha,\beta,\theta)-\hat{C}_{HR}(\alpha,\beta,\theta)\right)^{2}$$

where $N_{\alpha}$ denotes the number of direction layers of the Curvelet coefficient sub-band at scale $\alpha$; $\alpha$ is the scale-layer index, $\beta$ the direction-layer index, and $\theta$ the coordinates in the $\beta$-th direction of sub-band $C_{\alpha}$; $W_{\alpha\beta}$ and $H_{\alpha\beta}$ denote the width and height of each Curvelet coefficient sub-band at the corresponding direction layer;
S22: the method of designing the sub-pixel edge positioning loss comprises introducing an edge positioning loss during model training to measure the positioning accuracy of edge reconstruction, so that the network focuses on enhancing the accuracy of edge-region reconstruction during training.
4. The Curvelet coefficient prediction-based super-resolution reconstruction method of a precision measurement image according to claim 3, wherein S22 comprises the following sub-pixel edge detection steps based on space-frequency-domain salient features: according to the image frequency-domain feature model, a frequency-domain Gaussian band-pass filter is designed to enhance the salient region and robustly filter out background information and high-frequency noise; a spatial-domain adaptive mask operator is then designed to efficiently extract the salient edge features of the target; finally, a polynomial interpolation algorithm is adopted to accurately extract edge information at the sub-pixel level.
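For concreteness, the following is a hedged PyTorch sketch of one Curvelet prediction sub-network as described in claim 2: a 9×9 input convolution with PReLU producing 256-channel features, a residual layer of 16 structurally identical residual blocks with batch normalization, and a 9×9 output convolution. The 3×3 residual-block filter size and the `in_ch`/`out_ch` placeholders for the scale-dependent coefficient sub-band shapes are assumptions not specified in the claims, and the Curvelet forward/inverse transforms of steps S1 are assumed to be provided by an external Curvelet library.

```python
# Hedged sketch of a Curvelet prediction sub-network; residual-block filter
# size and channel placeholders are assumptions, not claimed values.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch: int = 256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),  # assumed 3x3 filters
            nn.BatchNorm2d(ch),                           # batch-normalize conv result
            nn.PReLU(),
            nn.Conv2d(ch, ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(ch),
        )

    def forward(self, x):
        return x + self.body(x)                           # residual (skip) connection

class CurveletPredictor(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.head = nn.Sequential(                        # input layer: 9x9, stride 1, pad 4
            nn.Conv2d(in_ch, 256, kernel_size=9, stride=1, padding=4),
            nn.PReLU(),
        )
        self.body = nn.Sequential(*[ResidualBlock(256) for _ in range(16)])
        self.tail = nn.Conv2d(256, out_ch, kernel_size=9, stride=1, padding=4)

    def forward(self, x):
        return self.tail(self.body(self.head(x)))

# Example: one sub-network mapping a 1-channel coefficient sub-band to 1 channel.
# net = CurveletPredictor(in_ch=1, out_ch=1)
# c_hr_hat = net(c_lr)   # c_lr: (N, 1, H, W) Curvelet coefficient tensor
```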
CN202311060396.4A 2023-07-11 2023-08-22 Curvelet coefficient prediction-based super-resolution reconstruction method for precise measurement image Active CN117132468B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202310848816 2023-07-11
CN2023108488169 2023-07-11

Publications (2)

Publication Number Publication Date
CN117132468A true CN117132468A (en) 2023-11-28
CN117132468B (en) 2024-05-24

Family

ID=88857727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311060396.4A Active CN117132468B (en) 2023-07-11 2023-08-22 Curvelet coefficient prediction-based super-resolution reconstruction method for precise measurement image

Country Status (1)

Country Link
CN (1) CN117132468B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107492070A (en) * 2017-07-10 2017-12-19 华北电力大学 A kind of single image super-resolution computational methods of binary channels convolutional neural networks
US20180137603A1 (en) * 2016-11-07 2018-05-17 Umbo Cv Inc. Method and system for providing high resolution image through super-resolution reconstruction
CN109886940A (en) * 2019-01-31 2019-06-14 武汉科技大学 The part reference mass evaluation method of super-resolution rebuilding image based on wavelet transformation
CN110570353A (en) * 2019-08-27 2019-12-13 天津大学 Dense connection generation countermeasure network single image super-resolution reconstruction method
US20200178939A1 (en) * 2017-05-31 2020-06-11 Mayo Foundation For Medical Education And Research Methods for Super-Resolution Ultrasound Imaging of Microvessels
CN111754403A (en) * 2020-06-15 2020-10-09 南京邮电大学 Image super-resolution reconstruction method based on residual learning
CN113643197A (en) * 2021-07-19 2021-11-12 海南大学 Two-stage lightweight network panchromatic sharpening method combining guide filtering and NSCT

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
WEI WEI ET AL.: "Curvelet-based image super-resolution via Res2Net multi-scale network", Computer Science, 30 June 2021 (2021-06-30) *
李春平; 周登文; 贾慧秒: "Single-image super-resolution algorithm based on edge-guided dual-channel convolutional neural network", Journal of Nanjing University of Information Science & Technology (Natural Science Edition), no. 06, 28 November 2017 (2017-11-28) *
李现国; 孙叶美; 杨彦利; 苗长云: "Image super-resolution reconstruction based on intermediate-layer-supervised convolutional neural network", Journal of Image and Graphics, no. 07, 16 July 2018 (2018-07-16) *

Also Published As

Publication number Publication date
CN117132468B (en) 2024-05-24

Similar Documents

Publication Publication Date Title
CN107977932B (en) Face image super-resolution reconstruction method based on discriminable attribute constraint generation countermeasure network
CN112507997B (en) Face super-resolution system based on multi-scale convolution and receptive field feature fusion
Zhang et al. CCR: Clustering and collaborative representation for fast single image super-resolution
CN110570353A (en) Dense connection generation countermeasure network single image super-resolution reconstruction method
Hu et al. Pan-sharpening via multiscale dynamic convolutional neural network
Tang et al. Deep inception-residual Laplacian pyramid networks for accurate single-image super-resolution
Zhou et al. High-frequency details enhancing DenseNet for super-resolution
CN110570440A (en) Image automatic segmentation method and device based on deep learning edge detection
Guo et al. Multiscale semilocal interpolation with antialiasing
Zhao et al. Single image super-resolution based on deep learning features and dictionary model
Liu et al. An efficient residual learning neural network for hyperspectral image superresolution
CN106910215B (en) Super-resolution method based on fractional order gradient interpolation
Liu et al. Multi-focus image fusion based on residual network in non-subsampled shearlet domain
Singh et al. A review of image fusion: Methods, applications and performance metrics
Liu et al. A multi-focus color image fusion algorithm based on low vision image reconstruction and focused feature extraction
Mishra et al. Self-FuseNet: data free unsupervised remote sensing image super-resolution
CN117575915A (en) Image super-resolution reconstruction method, terminal equipment and storage medium
Chudasama et al. RSRGAN: computationally efficient real-world single image super-resolution using generative adversarial network
Xu et al. Efficient deep image denoising via class specific convolution
Lu et al. GradDT: Gradient-guided despeckling transformer for industrial imaging sensors
Rashid et al. Single MR image super-resolution using generative adversarial network
Ahmadian et al. Single image super-resolution with self-organization neural networks and image laplace gradient operator
CN111899166A (en) Medical hyperspectral microscopic image super-resolution reconstruction method based on deep learning
CN117132468B (en) Curvelet coefficient prediction-based super-resolution reconstruction method for precise measurement image
Das et al. A concise review of fast bilateral filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant